several attstattarget-related improvements

Started by Peter Eisentrautover 2 years ago4 messages
#1Peter Eisentraut
peter@eisentraut.org
3 attachment(s)

Here are a few patches related to attstattarget:

- 0001: Change type of pg_statistic_ext.stxstattarget, to match
attstattarget. Maybe this should go into PG16, for consistency?

- 0002: Add macro for maximum statistics target, instead of hardcoding
it everywhere.

- 0003: Take pg_attribute out of VacAttrStats. This simplifies some
code, especially for extended statistics, which had to have weird
workarounds.

Attachments:

0001-Change-type-of-pg_statistic_ext.stxstattarget.patchtext/plain; charset=UTF-8; name=0001-Change-type-of-pg_statistic_ext.stxstattarget.patchDownload
From f01d4c9aba44cc5b9028200b21a6d1df9770c5a2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 27 Jun 2023 16:38:00 +0200
Subject: [PATCH 1/3] Change type of pg_statistic_ext.stxstattarget

Change from int32 to int16, to match attstattarget (changed in
90189eefc1).
---
 doc/src/sgml/catalogs.sgml             | 2 +-
 src/backend/commands/statscmds.c       | 4 ++--
 src/include/catalog/pg_statistic_ext.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ed32ca0349..852cb30ae1 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -7622,7 +7622,7 @@ <title><structname>pg_statistic_ext</structname> Columns</title>
 
      <row>
       <entry role="catalog_table_entry"><para role="column_definition">
-       <structfield>stxstattarget</structfield> <type>int4</type>
+       <structfield>stxstattarget</structfield> <type>int2</type>
       </para>
       <para>
        <structfield>stxstattarget</structfield> controls the level of detail
diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 26ebd0819d..c2dab20bdc 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -498,7 +498,7 @@ CreateStatistics(CreateStatsStmt *stmt)
 	values[Anum_pg_statistic_ext_stxrelid - 1] = ObjectIdGetDatum(relid);
 	values[Anum_pg_statistic_ext_stxname - 1] = NameGetDatum(&stxname);
 	values[Anum_pg_statistic_ext_stxnamespace - 1] = ObjectIdGetDatum(namespaceId);
-	values[Anum_pg_statistic_ext_stxstattarget - 1] = Int32GetDatum(-1);
+	values[Anum_pg_statistic_ext_stxstattarget - 1] = Int16GetDatum(-1);
 	values[Anum_pg_statistic_ext_stxowner - 1] = ObjectIdGetDatum(stxowner);
 	values[Anum_pg_statistic_ext_stxkeys - 1] = PointerGetDatum(stxkeys);
 	values[Anum_pg_statistic_ext_stxkind - 1] = PointerGetDatum(stxkind);
@@ -676,7 +676,7 @@ AlterStatistics(AlterStatsStmt *stmt)
 
 	/* replace the stxstattarget column */
 	repl_repl[Anum_pg_statistic_ext_stxstattarget - 1] = true;
-	repl_val[Anum_pg_statistic_ext_stxstattarget - 1] = Int32GetDatum(newtarget);
+	repl_val[Anum_pg_statistic_ext_stxstattarget - 1] = Int16GetDatum(newtarget);
 
 	newtup = heap_modify_tuple(oldtup, RelationGetDescr(rel),
 							   repl_val, repl_null, repl_repl);
diff --git a/src/include/catalog/pg_statistic_ext.h b/src/include/catalog/pg_statistic_ext.h
index 53eec9025a..1e64aa8f16 100644
--- a/src/include/catalog/pg_statistic_ext.h
+++ b/src/include/catalog/pg_statistic_ext.h
@@ -43,7 +43,7 @@ CATALOG(pg_statistic_ext,3381,StatisticExtRelationId)
 														 * object's namespace */
 
 	Oid			stxowner BKI_LOOKUP(pg_authid); /* statistics object's owner */
-	int32		stxstattarget BKI_DEFAULT(-1);	/* statistics target */
+	int16		stxstattarget BKI_DEFAULT(-1);	/* statistics target */
 
 	/*
 	 * variable-length fields start here, but we allow direct access to
-- 
2.41.0

0002-Add-macro-for-maximum-statistics-target.patchtext/plain; charset=UTF-8; name=0002-Add-macro-for-maximum-statistics-target.patchDownload
From a473c258577f9f9b28435feef27e8cbc25721d5c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 27 Jun 2023 14:09:39 +0200
Subject: [PATCH 2/3] Add macro for maximum statistics target

The number of places where 10000 was hardcoded had grown a bit beyond
the comfort level.  Introduce a macro MAX_STATISTICS_TARGET instead.
---
 src/backend/commands/statscmds.c        | 4 ++--
 src/backend/commands/tablecmds.c        | 5 +++--
 src/backend/statistics/extended_stats.c | 2 +-
 src/backend/utils/misc/guc_tables.c     | 2 +-
 src/include/catalog/pg_attribute.h      | 2 +-
 src/include/commands/vacuum.h           | 7 +++++++
 src/include/statistics/statistics.h     | 4 ++--
 7 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index c2dab20bdc..36bc8c33ba 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -619,9 +619,9 @@ AlterStatistics(AlterStatsStmt *stmt)
 				 errmsg("statistics target %d is too low",
 						newtarget)));
 	}
-	else if (newtarget > 10000)
+	else if (newtarget > MAX_STATISTICS_TARGET)
 	{
-		newtarget = 10000;
+		newtarget = MAX_STATISTICS_TARGET;
 		ereport(WARNING,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("lowering statistics target to %d",
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d985278ac6..fce5e6f220 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -61,6 +61,7 @@
 #include "commands/trigger.h"
 #include "commands/typecmds.h"
 #include "commands/user.h"
+#include "commands/vacuum.h"
 #include "executor/executor.h"
 #include "foreign/fdwapi.h"
 #include "foreign/foreign.h"
@@ -8180,9 +8181,9 @@ ATExecSetStatistics(Relation rel, const char *colName, int16 colNum, Node *newVa
 				 errmsg("statistics target %d is too low",
 						newtarget)));
 	}
-	else if (newtarget > 10000)
+	else if (newtarget > MAX_STATISTICS_TARGET)
 	{
-		newtarget = 10000;
+		newtarget = MAX_STATISTICS_TARGET;
 		ereport(WARNING,
 				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 				 errmsg("lowering statistics target to %d",
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 28b52d8aa1..513ddf540a 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -379,7 +379,7 @@ statext_compute_stattarget(int stattarget, int nattrs, VacAttrStats **stats)
 		stattarget = default_statistics_target;
 
 	/* As this point we should have a valid statistics target. */
-	Assert((stattarget >= 0) && (stattarget <= 10000));
+	Assert((stattarget >= 0) && (stattarget <= MAX_STATISTICS_TARGET));
 
 	return stattarget;
 }
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 71e27f8eb0..f8ef87d26d 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -2035,7 +2035,7 @@ struct config_int ConfigureNamesInt[] =
 						 "column-specific target set via ALTER TABLE SET STATISTICS.")
 		},
 		&default_statistics_target,
-		100, 1, 10000,
+		100, 1, MAX_STATISTICS_TARGET,
 		NULL, NULL, NULL
 	},
 	{
diff --git a/src/include/catalog/pg_attribute.h b/src/include/catalog/pg_attribute.h
index f8b4861b94..f00df488ce 100644
--- a/src/include/catalog/pg_attribute.h
+++ b/src/include/catalog/pg_attribute.h
@@ -165,7 +165,7 @@ CATALOG(pg_attribute,1249,AttributeRelationId) BKI_BOOTSTRAP BKI_ROWTYPE_OID(75,
 	 * that no value has been explicitly set for this column, so ANALYZE
 	 * should use the default setting.
 	 *
-	 * int16 is sufficient because the max value is currently 10000.
+	 * int16 is sufficient for the current max value (MAX_STATISTICS_TARGET).
 	 */
 	int16		attstattarget BKI_DEFAULT(-1);
 
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index cb5b11ab31..bda7792770 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -305,6 +305,13 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_table_age;
 extern PGDLLIMPORT int vacuum_failsafe_age;
 extern PGDLLIMPORT int vacuum_multixact_failsafe_age;
 
+/*
+ * Maximum value for default_statistics_target and per-column statistics
+ * targets.  This is fairly arbitrary, mainly to prevent users from creating
+ * unreasonably large statistics that the system cannot handle well.
+ */
+#define MAX_STATISTICS_TARGET 10000
+
 /* Variables for cost-based parallel vacuum */
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumSharedCostBalance;
 extern PGDLLIMPORT pg_atomic_uint32 *VacuumActiveNWorkers;
diff --git a/src/include/statistics/statistics.h b/src/include/statistics/statistics.h
index 17e3e7f881..5e538fec32 100644
--- a/src/include/statistics/statistics.h
+++ b/src/include/statistics/statistics.h
@@ -66,8 +66,8 @@ typedef struct MVDependencies
 #define STATS_MCV_MAGIC			0xE1A651C2	/* marks serialized bytea */
 #define STATS_MCV_TYPE_BASIC	1	/* basic MCV list type */
 
-/* max items in MCV list (should be equal to max default_statistics_target) */
-#define STATS_MCVLIST_MAX_ITEMS        10000
+/* max items in MCV list */
+#define STATS_MCVLIST_MAX_ITEMS        MAX_STATISTICS_TARGET
 
 /*
  * Multivariate MCV (most-common value) lists
-- 
2.41.0

0003-Take-pg_attribute-out-of-VacAttrStats.patchtext/plain; charset=UTF-8; name=0003-Take-pg_attribute-out-of-VacAttrStats.patchDownload
From d01f498ac24f2f8116357776b585fc401536df35 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 27 Jun 2023 16:29:13 +0200
Subject: [PATCH 3/3] Take pg_attribute out of VacAttrStats

The VacAttrStats structure contained the whole Form_pg_attribute for a
column, but it actually only needs attstattarget from there.  So
remove the Form_pg_attribute field and make a separate field for
attstattarget.  This simplifies some code for extended statistics that
doesn't deal with a column but an expression, which had to fake up
pg_attribute rows to satisfy internal APIs.  Also, we can remove some
comments that essentially said "don't look at pg_attribute directly".
---
 src/backend/commands/analyze.c                | 33 ++++++-------
 src/backend/statistics/extended_stats.c       | 48 +++++--------------
 src/backend/tsearch/ts_typanalyze.c           | 10 ++--
 src/backend/utils/adt/array_typanalyze.c      |  4 +-
 src/backend/utils/adt/rangetypes_typanalyze.c | 16 +++----
 src/include/commands/vacuum.h                 | 12 ++---
 6 files changed, 43 insertions(+), 80 deletions(-)

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index fc9a371f9b..9c8413eef2 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -572,7 +572,7 @@ do_analyze_rel(Relation onerel, VacuumParams *params,
 			 * If the appropriate flavor of the n_distinct option is
 			 * specified, override with the corresponding value.
 			 */
-			aopt = get_attribute_options(onerel->rd_id, stats->attr->attnum);
+			aopt = get_attribute_options(onerel->rd_id, stats->tupattnum);
 			if (aopt != NULL)
 			{
 				float8		n_distinct;
@@ -927,7 +927,7 @@ compute_index_stats(Relation onerel, double totalrows,
 				for (i = 0; i < attr_cnt; i++)
 				{
 					VacAttrStats *stats = thisdata->vacattrstats[i];
-					int			attnum = stats->attr->attnum;
+					int			attnum = stats->tupattnum;
 
 					if (isnull[attnum - 1])
 					{
@@ -1014,12 +1014,10 @@ examine_attribute(Relation onerel, int attnum, Node *index_expr)
 		return NULL;
 
 	/*
-	 * Create the VacAttrStats struct.  Note that we only have a copy of the
-	 * fixed fields of the pg_attribute tuple.
+	 * Create the VacAttrStats struct.
 	 */
 	stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
-	stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE);
-	memcpy(stats->attr, attr, ATTRIBUTE_FIXED_PART_SIZE);
+	stats->attstattarget = attr->attstattarget;
 
 	/*
 	 * When analyzing an expression index, believe the expression tree's type
@@ -1086,7 +1084,6 @@ examine_attribute(Relation onerel, int attnum, Node *index_expr)
 	if (!ok || stats->compute_stats == NULL || stats->minrows <= 0)
 	{
 		heap_freetuple(typtuple);
-		pfree(stats->attr);
 		pfree(stats);
 		return NULL;
 	}
@@ -1659,7 +1656,7 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
 		}
 
 		values[Anum_pg_statistic_starelid - 1] = ObjectIdGetDatum(relid);
-		values[Anum_pg_statistic_staattnum - 1] = Int16GetDatum(stats->attr->attnum);
+		values[Anum_pg_statistic_staattnum - 1] = Int16GetDatum(stats->tupattnum);
 		values[Anum_pg_statistic_stainherit - 1] = BoolGetDatum(inh);
 		values[Anum_pg_statistic_stanullfrac - 1] = Float4GetDatum(stats->stanullfrac);
 		values[Anum_pg_statistic_stawidth - 1] = Int32GetDatum(stats->stawidth);
@@ -1725,7 +1722,7 @@ update_attstats(Oid relid, bool inh, int natts, VacAttrStats **vacattrstats)
 		/* Is there already a pg_statistic tuple for this attribute? */
 		oldtup = SearchSysCache3(STATRELATTINH,
 								 ObjectIdGetDatum(relid),
-								 Int16GetDatum(stats->attr->attnum),
+								 Int16GetDatum(stats->tupattnum),
 								 BoolGetDatum(inh));
 
 		/* Open index information when we know we need it */
@@ -1860,15 +1857,13 @@ static int	analyze_mcv_list(int *mcv_counts,
 bool
 std_typanalyze(VacAttrStats *stats)
 {
-	Form_pg_attribute attr = stats->attr;
 	Oid			ltopr;
 	Oid			eqopr;
 	StdAnalyzeData *mystats;
 
 	/* If the attstattarget column is negative, use the default value */
-	/* NB: it is okay to scribble on stats->attr since it's a copy */
-	if (attr->attstattarget < 0)
-		attr->attstattarget = default_statistics_target;
+	if (stats->attstattarget < 0)
+		stats->attstattarget = default_statistics_target;
 
 	/* Look for default "<" and "=" operators for column's type */
 	get_sort_group_operators(stats->attrtypid,
@@ -1909,21 +1904,21 @@ std_typanalyze(VacAttrStats *stats)
 		 * know it at this point.
 		 *--------------------
 		 */
-		stats->minrows = 300 * attr->attstattarget;
+		stats->minrows = 300 * stats->attstattarget;
 	}
 	else if (OidIsValid(eqopr))
 	{
 		/* We can still recognize distinct values */
 		stats->compute_stats = compute_distinct_stats;
 		/* Might as well use the same minrows as above */
-		stats->minrows = 300 * attr->attstattarget;
+		stats->minrows = 300 * stats->attstattarget;
 	}
 	else
 	{
 		/* Can't do much but the trivial stuff */
 		stats->compute_stats = compute_trivial_stats;
 		/* Might as well use the same minrows as above */
-		stats->minrows = 300 * attr->attstattarget;
+		stats->minrows = 300 * stats->attstattarget;
 	}
 
 	return true;
@@ -2051,7 +2046,7 @@ compute_distinct_stats(VacAttrStatsP stats,
 	TrackItem  *track;
 	int			track_cnt,
 				track_max;
-	int			num_mcv = stats->attr->attstattarget;
+	int			num_mcv = stats->attstattarget;
 	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
 
 	/*
@@ -2392,8 +2387,8 @@ compute_scalar_stats(VacAttrStatsP stats,
 	int		   *tupnoLink;
 	ScalarMCVItem *track;
 	int			track_cnt = 0;
-	int			num_mcv = stats->attr->attstattarget;
-	int			num_bins = stats->attr->attstattarget;
+	int			num_mcv = stats->attstattarget;
+	int			num_bins = stats->attstattarget;
 	StdAnalyzeData *mystats = (StdAnalyzeData *) stats->extra_data;
 
 	values = (ScalarItem *) palloc(samplerows * sizeof(ScalarItem));
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 513ddf540a..9f67a57724 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -366,8 +366,8 @@ statext_compute_stattarget(int stattarget, int nattrs, VacAttrStats **stats)
 	for (i = 0; i < nattrs; i++)
 	{
 		/* keep the maximum statistics target */
-		if (stats[i]->attr->attstattarget > stattarget)
-			stattarget = stats[i]->attr->attstattarget;
+		if (stats[i]->attstattarget > stattarget)
+			stattarget = stats[i]->attstattarget;
 	}
 
 	/*
@@ -534,14 +534,10 @@ examine_attribute(Node *expr)
 	bool		ok;
 
 	/*
-	 * Create the VacAttrStats struct.  Note that we only have a copy of the
-	 * fixed fields of the pg_attribute tuple.
+	 * Create the VacAttrStats struct.
 	 */
 	stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
-
-	/* fake the attribute */
-	stats->attr = (Form_pg_attribute) palloc0(ATTRIBUTE_FIXED_PART_SIZE);
-	stats->attr->attstattarget = -1;
+	stats->attstattarget = -1;
 
 	/*
 	 * When analyzing an expression, believe the expression tree's type not
@@ -595,7 +591,6 @@ examine_attribute(Node *expr)
 	if (!ok || stats->compute_stats == NULL || stats->minrows <= 0)
 	{
 		heap_freetuple(typtuple);
-		pfree(stats->attr);
 		pfree(stats);
 		return NULL;
 	}
@@ -624,6 +619,13 @@ examine_expression(Node *expr, int stattarget)
 	 */
 	stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats));
 
+	/*
+	 * We can't have statistics target specified for the expression, so we
+	 * could use either the default_statistics_target, or the target computed
+	 * for the extended statistics. The second option seems more reasonable.
+	 */
+	stats->attstattarget = stattarget;
+
 	/*
 	 * When analyzing an expression, believe the expression tree's type.
 	 */
@@ -638,25 +640,6 @@ examine_expression(Node *expr, int stattarget)
 	 */
 	stats->attrcollid = exprCollation(expr);
 
-	/*
-	 * We don't have any pg_attribute for expressions, so let's fake something
-	 * reasonable into attstattarget, which is the only thing std_typanalyze
-	 * needs.
-	 */
-	stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE);
-
-	/*
-	 * We can't have statistics target specified for the expression, so we
-	 * could use either the default_statistics_target, or the target computed
-	 * for the extended statistics. The second option seems more reasonable.
-	 */
-	stats->attr->attstattarget = stattarget;
-
-	/* initialize some basic fields */
-	stats->attr->attrelid = InvalidOid;
-	stats->attr->attnum = InvalidAttrNumber;
-	stats->attr->atttypid = stats->attrtypid;
-
 	typtuple = SearchSysCacheCopy1(TYPEOID,
 								   ObjectIdGetDatum(stats->attrtypid));
 	if (!HeapTupleIsValid(typtuple))
@@ -747,12 +730,6 @@ lookup_var_attr_stats(Relation rel, Bitmapset *attrs, List *exprs,
 			return NULL;
 		}
 
-		/*
-		 * Sanity check that the column is not dropped - stats should have
-		 * been removed in this case.
-		 */
-		Assert(!stats[i]->attr->attisdropped);
-
 		i++;
 	}
 
@@ -2237,8 +2214,7 @@ compute_expr_stats(Relation onerel, double totalrows,
 		if (tcnt > 0)
 		{
 			AttributeOpts *aopt =
-				get_attribute_options(stats->attr->attrelid,
-									  stats->attr->attnum);
+				get_attribute_options(onerel->rd_id, stats->tupattnum);
 
 			stats->exprvals = exprvals;
 			stats->exprnulls = exprnulls;
diff --git a/src/backend/tsearch/ts_typanalyze.c b/src/backend/tsearch/ts_typanalyze.c
index 75ecd72efe..6594322733 100644
--- a/src/backend/tsearch/ts_typanalyze.c
+++ b/src/backend/tsearch/ts_typanalyze.c
@@ -58,16 +58,14 @@ Datum
 ts_typanalyze(PG_FUNCTION_ARGS)
 {
 	VacAttrStats *stats = (VacAttrStats *) PG_GETARG_POINTER(0);
-	Form_pg_attribute attr = stats->attr;
 
 	/* If the attstattarget column is negative, use the default value */
-	/* NB: it is okay to scribble on stats->attr since it's a copy */
-	if (attr->attstattarget < 0)
-		attr->attstattarget = default_statistics_target;
+	if (stats->attstattarget < 0)
+		stats->attstattarget = default_statistics_target;
 
 	stats->compute_stats = compute_tsvector_stats;
 	/* see comment about the choice of minrows in commands/analyze.c */
-	stats->minrows = 300 * attr->attstattarget;
+	stats->minrows = 300 * stats->attstattarget;
 
 	PG_RETURN_BOOL(true);
 }
@@ -169,7 +167,7 @@ compute_tsvector_stats(VacAttrStats *stats,
 	 * the number of individual lexeme values tracked in pg_statistic ought to
 	 * be more than the number of values for a simple scalar column.
 	 */
-	num_mcelem = stats->attr->attstattarget * 10;
+	num_mcelem = stats->attstattarget * 10;
 
 	/*
 	 * We set bucket width equal to (num_mcelem + 10) / 0.007 as per the
diff --git a/src/backend/utils/adt/array_typanalyze.c b/src/backend/utils/adt/array_typanalyze.c
index 52e160d6bb..04b3764b68 100644
--- a/src/backend/utils/adt/array_typanalyze.c
+++ b/src/backend/utils/adt/array_typanalyze.c
@@ -263,7 +263,7 @@ compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
 	 * the number of individual elements tracked in pg_statistic ought to be
 	 * more than the number of values for a simple scalar column.
 	 */
-	num_mcelem = stats->attr->attstattarget * 10;
+	num_mcelem = stats->attstattarget * 10;
 
 	/*
 	 * We set bucket width equal to num_mcelem / 0.007 as per the comment
@@ -575,7 +575,7 @@ compute_array_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
 		count_items_count = hash_get_num_entries(count_tab);
 		if (count_items_count > 0)
 		{
-			int			num_hist = stats->attr->attstattarget;
+			int			num_hist = stats->attstattarget;
 			DECountItem **sorted_count_items;
 			int			j;
 			int			delta;
diff --git a/src/backend/utils/adt/rangetypes_typanalyze.c b/src/backend/utils/adt/rangetypes_typanalyze.c
index 86810a1a6e..d13d8d2c53 100644
--- a/src/backend/utils/adt/rangetypes_typanalyze.c
+++ b/src/backend/utils/adt/rangetypes_typanalyze.c
@@ -47,18 +47,17 @@ range_typanalyze(PG_FUNCTION_ARGS)
 {
 	VacAttrStats *stats = (VacAttrStats *) PG_GETARG_POINTER(0);
 	TypeCacheEntry *typcache;
-	Form_pg_attribute attr = stats->attr;
 
 	/* Get information about range type; note column might be a domain */
 	typcache = range_get_typcache(fcinfo, getBaseType(stats->attrtypid));
 
-	if (attr->attstattarget < 0)
-		attr->attstattarget = default_statistics_target;
+	if (stats->attstattarget < 0)
+		stats->attstattarget = default_statistics_target;
 
 	stats->compute_stats = compute_range_stats;
 	stats->extra_data = typcache;
 	/* same as in std_typanalyze */
-	stats->minrows = 300 * attr->attstattarget;
+	stats->minrows = 300 * stats->attstattarget;
 
 	PG_RETURN_BOOL(true);
 }
@@ -74,18 +73,17 @@ multirange_typanalyze(PG_FUNCTION_ARGS)
 {
 	VacAttrStats *stats = (VacAttrStats *) PG_GETARG_POINTER(0);
 	TypeCacheEntry *typcache;
-	Form_pg_attribute attr = stats->attr;
 
 	/* Get information about multirange type; note column might be a domain */
 	typcache = multirange_get_typcache(fcinfo, getBaseType(stats->attrtypid));
 
-	if (attr->attstattarget < 0)
-		attr->attstattarget = default_statistics_target;
+	if (stats->attstattarget < 0)
+		stats->attstattarget = default_statistics_target;
 
 	stats->compute_stats = compute_range_stats;
 	stats->extra_data = typcache;
 	/* same as in std_typanalyze */
-	stats->minrows = 300 * attr->attstattarget;
+	stats->minrows = 300 * stats->attstattarget;
 
 	PG_RETURN_BOOL(true);
 }
@@ -136,7 +134,7 @@ compute_range_stats(VacAttrStats *stats, AnalyzeAttrFetchFunc fetchfunc,
 	int			empty_cnt = 0;
 	int			range_no;
 	int			slot_idx;
-	int			num_bins = stats->attr->attstattarget;
+	int			num_bins = stats->attstattarget;
 	int			num_hist;
 	float8	   *lengths;
 	RangeBound *lowers,
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index bda7792770..b027fb2c67 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -116,16 +116,12 @@ typedef struct VacAttrStats
 {
 	/*
 	 * These fields are set up by the main ANALYZE code before invoking the
-	 * type-specific typanalyze function.
-	 *
-	 * Note: do not assume that the data being analyzed has the same datatype
-	 * shown in attr, ie do not trust attr->atttypid, attlen, etc.  This is
-	 * because some index opclasses store a different type than the underlying
-	 * column/expression.  Instead use attrtypid, attrtypmod, and attrtype for
+	 * type-specific typanalyze function.  They don't necessarily match what
+	 * is in pg_attribute, because some index opclasses store a different type
+	 * than the underlying column/expression.  Therefore, use these fields for
 	 * information about the datatype being fed to the typanalyze function.
-	 * Likewise, use attrcollid not attr->attcollation.
 	 */
-	Form_pg_attribute attr;		/* copy of pg_attribute row for column */
+	int			attstattarget;
 	Oid			attrtypid;		/* type of data being analyzed */
 	int32		attrtypmod;		/* typmod of data being analyzed */
 	Form_pg_type attrtype;		/* copy of pg_type row for attrtypid */
-- 
2.41.0

#2Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Peter Eisentraut (#1)
Re: several attstattarget-related improvements

On 6/28/23 16:52, Peter Eisentraut wrote:

Here are a few patches related to attstattarget:

- 0001: Change type of pg_statistic_ext.stxstattarget, to match
attstattarget.  Maybe this should go into PG16, for consistency?

- 0002: Add macro for maximum statistics target, instead of hardcoding
it everywhere.

- 0003: Take pg_attribute out of VacAttrStats.  This simplifies some
code, especially for extended statistics, which had to have weird
workarounds.

+1 to all three patches

Not sure about 0001 vs PG16, it'd require catversion bump.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tomas Vondra (#2)
Re: several attstattarget-related improvements

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

On 6/28/23 16:52, Peter Eisentraut wrote:

- 0001: Change type of pg_statistic_ext.stxstattarget, to match
attstattarget.  Maybe this should go into PG16, for consistency?

Not sure about 0001 vs PG16, it'd require catversion bump.

Yeah, past beta1 I think we should be conservative about bumping
catversion. Suggest you hold this for now, and if we find some
more-compelling reason for a catversion bump in v16, we can sneak
it in at that time. Otherwise, I won't cry if it waits for v17.

regards, tom lane

#4Peter Eisentraut
peter@eisentraut.org
In reply to: Tomas Vondra (#2)
Re: several attstattarget-related improvements

On 28.06.23 23:30, Tomas Vondra wrote:

On 6/28/23 16:52, Peter Eisentraut wrote:

Here are a few patches related to attstattarget:

- 0001: Change type of pg_statistic_ext.stxstattarget, to match
attstattarget.  Maybe this should go into PG16, for consistency?

- 0002: Add macro for maximum statistics target, instead of hardcoding
it everywhere.

- 0003: Take pg_attribute out of VacAttrStats.  This simplifies some
code, especially for extended statistics, which had to have weird
workarounds.

+1 to all three patches

Not sure about 0001 vs PG16, it'd require catversion bump.

committed (to PG17 for now)