clarify equalTupleDescs()
In a recent patch thread it was discussed[0]/messages/by-id/202401101316.k4s3fomwjx52@alvherre.pgsql which fields should be
compared by equalTupleDescs() and whether it is ok to remove a field
from tuple descriptors and how that should affect their equality
(attstattarget in that case).
After analyzing all the callers, I have noticed that there are two
classes of callers of equalTupleDescs():
The first want to compare what I call row-type equality, which means
they want to check specifically for equal number of attributes, and the
same attribute names, types, and typmods for each attribute. Most
callers actually want that behavior. The remaining callers just want to
compare the tuple descriptors as they are, they don't care why the
fields are in there, they just want to compare all of them.
In the attached patch, I add a new function equalRowTypes() that is
effectively a subset of equalTupleDescs() and switch most callers to that.
The purpose of this patch is to make the semantics less uncertain.
Questions like the one in [0]/messages/by-id/202401101316.k4s3fomwjx52@alvherre.pgsql about attstattarget now have a clear
answer for both functions. I think this would be useful to have, as we
are thinking about more changes in pg_attribute and tuple descriptors.
[0]: /messages/by-id/202401101316.k4s3fomwjx52@alvherre.pgsql
/messages/by-id/202401101316.k4s3fomwjx52@alvherre.pgsql
Attachments:
v0-0001-Separate-equalRowTypes-from-equalTupleDescs.patchtext/plain; charset=UTF-8; name=v0-0001-Separate-equalRowTypes-from-equalTupleDescs.patchDownload
From 7212a249b56eb5ac339dcd029526e7fb380932fe Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 6 Feb 2024 13:02:41 +0100
Subject: [PATCH v0] Separate equalRowTypes() from equalTupleDescs()
This introduces a new function equalRowTypes() that is effectively a
subset of equalTupleDescs() but only compares the number of attributes
and attribute name, type, and typmod. This is enough for most
existing uses of equalTupleDescs(), which are changed to use the new
function. The only remaining callers of equalTupleDescs() are those
that really want to check the tuple descriptor as such, without
concern about record to row or record type semantics.
The existing function hashTupleDesc() is renamed to hashRowType(),
because it now corresponds more to equalRowTypes().
The purpose of this change is to be clearer about the semantics of
equality asked for by each caller. (At least one caller had a comment
that questioned whether equalTupleDescs() was too restrictive.) For
example, 4f622503d6d removed attstattarget from the tuple descriptor
structure. It was not fully clear at the time how this should affect
equalTupleDescs(). Now the answer is clear: By their own definitions,
equalRowTypes() does not care, and equalTupleDescs() just compares
whatever is in the tuple descriptor but does not care why it is in
there.
---
src/backend/access/common/tupdesc.c | 58 +++++++++++++++++++++++------
src/backend/catalog/pg_proc.c | 2 +-
src/backend/commands/analyze.c | 4 +-
src/backend/commands/view.c | 13 +++----
src/backend/utils/cache/plancache.c | 5 +--
src/backend/utils/cache/typcache.c | 10 ++---
src/include/access/tupdesc.h | 4 +-
7 files changed, 64 insertions(+), 32 deletions(-)
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index bc80caee117..22d6708708e 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -414,11 +414,6 @@ DecrTupleDescRefCount(TupleDesc tupdesc)
/*
* Compare two TupleDesc structures for logical equality
- *
- * Note: we deliberately do not check the attrelid and tdtypmod fields.
- * This allows typcache.c to use this routine to see if a cached record type
- * matches a requested type, and is harmless for relcache.c's uses.
- * We don't compare tdrefcount, either.
*/
bool
equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
@@ -431,6 +426,8 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
return false;
+ /* tdtypmod and tdrefcount are not checked */
+
for (i = 0; i < tupdesc1->natts; i++)
{
Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
@@ -561,17 +558,54 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
}
/*
- * hashTupleDesc
- * Compute a hash value for a tuple descriptor.
+ * equalRowTypes
*
- * If two tuple descriptors would be considered equal by equalTupleDescs()
- * then their hash value will be equal according to this function.
+ * This determines whether two tuple descriptors have equal row types. This
+ * means specifically:
+ *
+ * - same number of attributes
+ * - same composite type ID (but could both be zero)
+ * - corresponding attributes (in order) have the same name, type, and typmod
*
- * Note that currently contents of constraint are not hashed - it'd be a bit
- * painful to do so, and conflicts just due to constraints are unlikely.
+ * This is used to check whether two record types are compatible, whether
+ * function return row types are the same, and other similar situations.
+ *
+ * Note: We deliberately do not check the tdtypmod field. This allows
+ * typcache.c to use this routine to see if a cached record type matches a
+ * requested type.
+ */
+bool
+equalRowTypes(TupleDesc tupdesc1, TupleDesc tupdesc2)
+{
+ if (tupdesc1->natts != tupdesc2->natts)
+ return false;
+ if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
+ return false;
+
+ for (int i = 0; i < tupdesc1->natts; i++)
+ {
+ Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
+ Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);
+
+ if (strcmp(NameStr(attr1->attname), NameStr(attr2->attname)) != 0)
+ return false;
+ if (attr1->atttypid != attr2->atttypid)
+ return false;
+ if (attr1->atttypmod != attr2->atttypmod)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * hashRowType
+ *
+ * If two tuple descriptors would be considered equal by equalRowTypes()
+ * then their hash value will be equal according to this function.
*/
uint32
-hashTupleDesc(TupleDesc desc)
+hashRowType(TupleDesc desc)
{
uint32 s;
int i;
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index b581d334d3a..542160f8dc3 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -439,7 +439,7 @@ ProcedureCreate(const char *procedureName,
if (olddesc == NULL && newdesc == NULL)
/* ok, both are runtime-defined RECORDs */ ;
else if (olddesc == NULL || newdesc == NULL ||
- !equalTupleDescs(olddesc, newdesc))
+ !equalRowTypes(olddesc, newdesc))
ereport(ERROR,
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
errmsg("cannot change return type of existing function"),
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a03495d6c95..b9cbe26c585 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1591,8 +1591,8 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
/* We may need to convert from child's rowtype to parent's */
if (childrows > 0 &&
- !equalTupleDescs(RelationGetDescr(childrel),
- RelationGetDescr(onerel)))
+ !equalRowTypes(RelationGetDescr(childrel),
+ RelationGetDescr(onerel)))
{
TupleConversionMap *map;
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index dce898c751c..ac7b31f9473 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -35,7 +35,7 @@
#include "utils/rel.h"
#include "utils/syscache.h"
-static void checkViewTupleDesc(TupleDesc newdesc, TupleDesc olddesc);
+static void checkViewColumns(TupleDesc newdesc, TupleDesc olddesc);
/*---------------------------------------------------------------------
* DefineVirtualRelation
@@ -135,7 +135,7 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
* column list.
*/
descriptor = BuildDescForRelation(attrList);
- checkViewTupleDesc(descriptor, rel->rd_att);
+ checkViewColumns(descriptor, rel->rd_att);
/*
* If new attributes have been added, we must add pg_attribute entries
@@ -263,13 +263,12 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
}
/*
- * Verify that tupledesc associated with proposed new view definition
- * matches tupledesc of old view. This is basically a cut-down version
- * of equalTupleDescs(), with code added to generate specific complaints.
- * Also, we allow the new tupledesc to have more columns than the old.
+ * Verify that the columns associated with proposed new view definition match
+ * the columns of the old view. But we allow the new view to have more
+ * columns than the old.
*/
static void
-checkViewTupleDesc(TupleDesc newdesc, TupleDesc olddesc)
+checkViewColumns(TupleDesc newdesc, TupleDesc olddesc)
{
int i;
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5194cbf2cc8..3b8ef571934 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -728,8 +728,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
PopActiveSnapshot();
/*
- * Check or update the result tupdesc. XXX should we use a weaker
- * condition than equalTupleDescs() here?
+ * Check or update the result tupdesc.
*
* We assume the parameter types didn't change from the first time, so no
* need to update that.
@@ -740,7 +739,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
/* OK, doesn't return tuples */
}
else if (resultDesc == NULL || plansource->resultDesc == NULL ||
- !equalTupleDescs(resultDesc, plansource->resultDesc))
+ !equalRowTypes(resultDesc, plansource->resultDesc))
{
/* can we give a better error message? */
if (plansource->fixed_result)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 84fc83cb11f..91948c9621a 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -147,7 +147,7 @@ typedef struct TypeCacheEnumData
* We use a separate table for storing the definitions of non-anonymous
* record types. Once defined, a record type will be remembered for the
* life of the backend. Subsequent uses of the "same" record type (where
- * sameness means equalTupleDescs) will refer to the existing table entry.
+ * sameness means equalRowTypes) will refer to the existing table entry.
*
* Stored record types are remembered in a linear array of TupleDescs,
* which can be indexed quickly with the assigned typmod. There is also
@@ -231,7 +231,7 @@ shared_record_table_compare(const void *a, const void *b, size_t size,
else
t2 = k2->u.local_tupdesc;
- return equalTupleDescs(t1, t2) ? 0 : 1;
+ return equalRowTypes(t1, t2) ? 0 : 1;
}
/*
@@ -249,7 +249,7 @@ shared_record_table_hash(const void *a, size_t size, void *arg)
else
t = k->u.local_tupdesc;
- return hashTupleDesc(t);
+ return hashRowType(t);
}
/* Parameters for SharedRecordTypmodRegistry's TupleDesc table. */
@@ -1925,7 +1925,7 @@ record_type_typmod_hash(const void *data, size_t size)
{
RecordCacheEntry *entry = (RecordCacheEntry *) data;
- return hashTupleDesc(entry->tupdesc);
+ return hashRowType(entry->tupdesc);
}
/*
@@ -1937,7 +1937,7 @@ record_type_typmod_compare(const void *a, const void *b, size_t size)
RecordCacheEntry *left = (RecordCacheEntry *) a;
RecordCacheEntry *right = (RecordCacheEntry *) b;
- return equalTupleDescs(left->tupdesc, right->tupdesc) ? 0 : 1;
+ return equalRowTypes(left->tupdesc, right->tupdesc) ? 0 : 1;
}
/*
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 3d767dde65d..8930a28d660 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -126,8 +126,8 @@ extern void DecrTupleDescRefCount(TupleDesc tupdesc);
} while (0)
extern bool equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2);
-
-extern uint32 hashTupleDesc(TupleDesc desc);
+extern bool equalRowTypes(TupleDesc tupdesc1, TupleDesc tupdesc2);
+extern uint32 hashRowType(TupleDesc desc);
extern void TupleDescInitEntry(TupleDesc desc,
AttrNumber attributeNumber,
base-commit: b83033c3cff556d520281aaec399e47b4f11edbe
--
2.43.0
Peter Eisentraut <peter@eisentraut.org> writes:
The first want to compare what I call row-type equality, which means
they want to check specifically for equal number of attributes, and the
same attribute names, types, and typmods for each attribute. Most
callers actually want that behavior.
Should compare attcollation too, no?
+1 for the general idea, but it seems like "row type equality"
might still be a slightly fuzzy concept.
regards, tom lane
On Tue, Feb 6, 2024 at 8:59 PM Peter Eisentraut <peter@eisentraut.org> wrote:
In a recent patch thread it was discussed[0] which fields should be
compared by equalTupleDescs() and whether it is ok to remove a field
from tuple descriptors and how that should affect their equality
(attstattarget in that case).After analyzing all the callers, I have noticed that there are two
classes of callers of equalTupleDescs():The first want to compare what I call row-type equality, which means
they want to check specifically for equal number of attributes, and the
same attribute names, types, and typmods for each attribute. Most
callers actually want that behavior. The remaining callers just want to
compare the tuple descriptors as they are, they don't care why the
fields are in there, they just want to compare all of them.In the attached patch, I add a new function equalRowTypes() that is
effectively a subset of equalTupleDescs() and switch most callers to that.The purpose of this patch is to make the semantics less uncertain.
Questions like the one in [0] about attstattarget now have a clear
answer for both functions. I think this would be useful to have, as we
are thinking about more changes in pg_attribute and tuple descriptors.[0]:
/messages/by-id/202401101316.k4s3fomwjx52@alvherre.pgsql
function name record_type_typmod_hash imply that
hashRowType should also hash atttypmod field?
also:
bool
equalRowTypes(TupleDesc tupdesc1, TupleDesc tupdesc2)
{
if (tupdesc1->natts != tupdesc2->natts)
return false;
if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
return false;
for (int i = 0; i < tupdesc1->natts; i++)
{
Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);
if (strcmp(NameStr(attr1->attname), NameStr(attr2->attname)) != 0)
return false;
if (attr1->atttypid != attr2->atttypid)
return false;
if (attr1->atttypmod != attr2->atttypmod)
return false;
}
return true;
}
/*
* hashRowType
*
* If two tuple descriptors would be considered equal by equalRowTypes()
* then their hash value will be equal according to this function.
*/
uint32
hashRowType(TupleDesc desc)
{
uint32 s;
int i;
s = hash_combine(0, hash_uint32(desc->natts));
s = hash_combine(s, hash_uint32(desc->tdtypeid));
for (i = 0; i < desc->natts; ++i)
s = hash_combine(s, hash_uint32(TupleDescAttr(desc, i)->atttypid));
return s;
}
from the hashRowType comment, should we also hash attname and atttypmod?
On 07.02.24 04:06, jian he wrote:
/*
* hashRowType
*
* If two tuple descriptors would be considered equal by equalRowTypes()
* then their hash value will be equal according to this function.
*/
uint32
hashRowType(TupleDesc desc)
{
uint32 s;
int i;s = hash_combine(0, hash_uint32(desc->natts));
s = hash_combine(s, hash_uint32(desc->tdtypeid));
for (i = 0; i < desc->natts; ++i)
s = hash_combine(s, hash_uint32(TupleDescAttr(desc, i)->atttypid));return s;
}from the hashRowType comment, should we also hash attname and atttypmod?
In principle, hashRowType() could process all the fields that
equalRowTypes() does. But since it's only a hash function, it doesn't
have to be perfect. (This is also the case for the current
hashTupleDesc().) I'm not sure where the best tradeoff is.
On 06.02.24 16:14, Tom Lane wrote:
Peter Eisentraut <peter@eisentraut.org> writes:
The first want to compare what I call row-type equality, which means
they want to check specifically for equal number of attributes, and the
same attribute names, types, and typmods for each attribute. Most
callers actually want that behavior.Should compare attcollation too, no?
+1 for the general idea, but it seems like "row type equality"
might still be a slightly fuzzy concept.
I did another pass across the callers to check what pg_attribute fields
might be relevant.
Collation definitely needs to be added, certainly for plancache.c, maybe
for typcache.c, the other callers don't care.
Record types can have attisdropped fields, so it's probably good to
check those.
I'm suspicious about attndims. Maybe one could create a test case where
record types differ only in that. Support for attndims throughout the
system is weak, but maybe there is something to check there.
On a conceptual level, I figured pg_attribute rows can be divided up
into three categories:
1. "row type" stuff: attname, atttypid, atttypmod, attndims,
attisdropped, attcollation
2. physical layout stuff: attlen, attcacheoff, attbyval, attalign
3. table metadata stuff (everything else)
It's not perfect, and sometimes it's not clear whether these categories
inform the implementation or the other way around, but I think it helps
conceptualize it.
Attachments:
v1-0001-Separate-equalRowTypes-from-equalTupleDescs.patchtext/plain; charset=UTF-8; name=v1-0001-Separate-equalRowTypes-from-equalTupleDescs.patchDownload
From 7798fb0c9b57e94f8edc9c8c808395097e541954 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 12 Feb 2024 12:42:11 +0100
Subject: [PATCH v1] Separate equalRowTypes() from equalTupleDescs()
This introduces a new function equalRowTypes() that is effectively a
subset of equalTupleDescs() but only compares the number of attributes
and attribute name, type, and typmod. This is enough for most
existing uses of equalTupleDescs(), which are changed to use the new
function. The only remaining callers of equalTupleDescs() are those
that really want to check the full tuple descriptor as such, without
concern about record or row or record type semantics.
The existing function hashTupleDesc() is renamed to hashRowType(),
because it now corresponds more to equalRowTypes().
The purpose of this change is to be clearer about the semantics of
equality asked for by each caller. (At least one caller had a comment
that questioned whether equalTupleDescs() was too restrictive.) For
example, 4f622503d6d removed attstattarget from the tuple descriptor
structure. It was not fully clear at the time how this should affect
equalTupleDescs(). Now the answer is clear: By their own definitions,
equalRowTypes() does not care, and equalTupleDescs() just compares
whatever is in the tuple descriptor but does not care why it is in
there.
Discussion: https://www.postgresql.org/message-id/flat/f656d6d9-6660-4518-a006-2f65cafbebd1%40eisentraut.org
---
src/backend/access/common/tupdesc.c | 71 ++++++++++++++++++++++++-----
src/backend/catalog/pg_proc.c | 2 +-
src/backend/commands/analyze.c | 4 +-
src/backend/commands/view.c | 13 +++---
src/backend/utils/cache/plancache.c | 5 +-
src/backend/utils/cache/typcache.c | 10 ++--
src/include/access/tupdesc.h | 4 +-
7 files changed, 77 insertions(+), 32 deletions(-)
diff --git a/src/backend/access/common/tupdesc.c b/src/backend/access/common/tupdesc.c
index bc80caee117..64761f050f5 100644
--- a/src/backend/access/common/tupdesc.c
+++ b/src/backend/access/common/tupdesc.c
@@ -414,11 +414,6 @@ DecrTupleDescRefCount(TupleDesc tupdesc)
/*
* Compare two TupleDesc structures for logical equality
- *
- * Note: we deliberately do not check the attrelid and tdtypmod fields.
- * This allows typcache.c to use this routine to see if a cached record type
- * matches a requested type, and is harmless for relcache.c's uses.
- * We don't compare tdrefcount, either.
*/
bool
equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
@@ -431,6 +426,8 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
return false;
+ /* tdtypmod and tdrefcount are not checked */
+
for (i = 0; i < tupdesc1->natts; i++)
{
Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
@@ -561,17 +558,67 @@ equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2)
}
/*
- * hashTupleDesc
- * Compute a hash value for a tuple descriptor.
+ * equalRowTypes
*
- * If two tuple descriptors would be considered equal by equalTupleDescs()
- * then their hash value will be equal according to this function.
+ * This determines whether two tuple descriptors have equal row types. This
+ * only checks those fields in pg_attribute that are applicable for row types,
+ * while ignoring those fields that define the physical row storage or those
+ * that define table column metadata.
+ *
+ * Specifically, this checks:
+ *
+ * - same number of attributes
+ * - same composite type ID (but could both be zero)
+ * - corresponding attributes (in order) have same the name, type, typmod,
+ * collation
+ *
+ * This is used to check whether two record types are compatible, whether
+ * function return row types are the same, and other similar situations.
+ *
+ * Note: We deliberately do not check the tdtypmod field. This allows
+ * typcache.c to use this routine to see if a cached record type matches a
+ * requested type.
+ */
+bool
+equalRowTypes(TupleDesc tupdesc1, TupleDesc tupdesc2)
+{
+ if (tupdesc1->natts != tupdesc2->natts)
+ return false;
+ if (tupdesc1->tdtypeid != tupdesc2->tdtypeid)
+ return false;
+
+ for (int i = 0; i < tupdesc1->natts; i++)
+ {
+ Form_pg_attribute attr1 = TupleDescAttr(tupdesc1, i);
+ Form_pg_attribute attr2 = TupleDescAttr(tupdesc2, i);
+
+ if (strcmp(NameStr(attr1->attname), NameStr(attr2->attname)) != 0)
+ return false;
+ if (attr1->atttypid != attr2->atttypid)
+ return false;
+ if (attr1->atttypmod != attr2->atttypmod)
+ return false;
+ if (attr1->attcollation != attr2->attcollation)
+ return false;
+
+ /* TODO: check attndims??? */
+
+ /* Record types derived from tables could have dropped fields. */
+ if (attr1->attisdropped != attr2->attisdropped)
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * hashRowType
*
- * Note that currently contents of constraint are not hashed - it'd be a bit
- * painful to do so, and conflicts just due to constraints are unlikely.
+ * If two tuple descriptors would be considered equal by equalRowTypes()
+ * then their hash value will be equal according to this function.
*/
uint32
-hashTupleDesc(TupleDesc desc)
+hashRowType(TupleDesc desc)
{
uint32 s;
int i;
diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index b581d334d3a..542160f8dc3 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -439,7 +439,7 @@ ProcedureCreate(const char *procedureName,
if (olddesc == NULL && newdesc == NULL)
/* ok, both are runtime-defined RECORDs */ ;
else if (olddesc == NULL || newdesc == NULL ||
- !equalTupleDescs(olddesc, newdesc))
+ !equalRowTypes(olddesc, newdesc))
ereport(ERROR,
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
errmsg("cannot change return type of existing function"),
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index a03495d6c95..b9cbe26c585 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -1591,8 +1591,8 @@ acquire_inherited_sample_rows(Relation onerel, int elevel,
/* We may need to convert from child's rowtype to parent's */
if (childrows > 0 &&
- !equalTupleDescs(RelationGetDescr(childrel),
- RelationGetDescr(onerel)))
+ !equalRowTypes(RelationGetDescr(childrel),
+ RelationGetDescr(onerel)))
{
TupleConversionMap *map;
diff --git a/src/backend/commands/view.c b/src/backend/commands/view.c
index dce898c751c..ac7b31f9473 100644
--- a/src/backend/commands/view.c
+++ b/src/backend/commands/view.c
@@ -35,7 +35,7 @@
#include "utils/rel.h"
#include "utils/syscache.h"
-static void checkViewTupleDesc(TupleDesc newdesc, TupleDesc olddesc);
+static void checkViewColumns(TupleDesc newdesc, TupleDesc olddesc);
/*---------------------------------------------------------------------
* DefineVirtualRelation
@@ -135,7 +135,7 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
* column list.
*/
descriptor = BuildDescForRelation(attrList);
- checkViewTupleDesc(descriptor, rel->rd_att);
+ checkViewColumns(descriptor, rel->rd_att);
/*
* If new attributes have been added, we must add pg_attribute entries
@@ -263,13 +263,12 @@ DefineVirtualRelation(RangeVar *relation, List *tlist, bool replace,
}
/*
- * Verify that tupledesc associated with proposed new view definition
- * matches tupledesc of old view. This is basically a cut-down version
- * of equalTupleDescs(), with code added to generate specific complaints.
- * Also, we allow the new tupledesc to have more columns than the old.
+ * Verify that the columns associated with proposed new view definition match
+ * the columns of the old view. But we allow the new view to have more
+ * columns than the old.
*/
static void
-checkViewTupleDesc(TupleDesc newdesc, TupleDesc olddesc)
+checkViewColumns(TupleDesc newdesc, TupleDesc olddesc)
{
int i;
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5194cbf2cc8..3b8ef571934 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -728,8 +728,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
PopActiveSnapshot();
/*
- * Check or update the result tupdesc. XXX should we use a weaker
- * condition than equalTupleDescs() here?
+ * Check or update the result tupdesc.
*
* We assume the parameter types didn't change from the first time, so no
* need to update that.
@@ -740,7 +739,7 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
/* OK, doesn't return tuples */
}
else if (resultDesc == NULL || plansource->resultDesc == NULL ||
- !equalTupleDescs(resultDesc, plansource->resultDesc))
+ !equalRowTypes(resultDesc, plansource->resultDesc))
{
/* can we give a better error message? */
if (plansource->fixed_result)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 84fc83cb11f..91948c9621a 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -147,7 +147,7 @@ typedef struct TypeCacheEnumData
* We use a separate table for storing the definitions of non-anonymous
* record types. Once defined, a record type will be remembered for the
* life of the backend. Subsequent uses of the "same" record type (where
- * sameness means equalTupleDescs) will refer to the existing table entry.
+ * sameness means equalRowTypes) will refer to the existing table entry.
*
* Stored record types are remembered in a linear array of TupleDescs,
* which can be indexed quickly with the assigned typmod. There is also
@@ -231,7 +231,7 @@ shared_record_table_compare(const void *a, const void *b, size_t size,
else
t2 = k2->u.local_tupdesc;
- return equalTupleDescs(t1, t2) ? 0 : 1;
+ return equalRowTypes(t1, t2) ? 0 : 1;
}
/*
@@ -249,7 +249,7 @@ shared_record_table_hash(const void *a, size_t size, void *arg)
else
t = k->u.local_tupdesc;
- return hashTupleDesc(t);
+ return hashRowType(t);
}
/* Parameters for SharedRecordTypmodRegistry's TupleDesc table. */
@@ -1925,7 +1925,7 @@ record_type_typmod_hash(const void *data, size_t size)
{
RecordCacheEntry *entry = (RecordCacheEntry *) data;
- return hashTupleDesc(entry->tupdesc);
+ return hashRowType(entry->tupdesc);
}
/*
@@ -1937,7 +1937,7 @@ record_type_typmod_compare(const void *a, const void *b, size_t size)
RecordCacheEntry *left = (RecordCacheEntry *) a;
RecordCacheEntry *right = (RecordCacheEntry *) b;
- return equalTupleDescs(left->tupdesc, right->tupdesc) ? 0 : 1;
+ return equalRowTypes(left->tupdesc, right->tupdesc) ? 0 : 1;
}
/*
diff --git a/src/include/access/tupdesc.h b/src/include/access/tupdesc.h
index 3d767dde65d..8930a28d660 100644
--- a/src/include/access/tupdesc.h
+++ b/src/include/access/tupdesc.h
@@ -126,8 +126,8 @@ extern void DecrTupleDescRefCount(TupleDesc tupdesc);
} while (0)
extern bool equalTupleDescs(TupleDesc tupdesc1, TupleDesc tupdesc2);
-
-extern uint32 hashTupleDesc(TupleDesc desc);
+extern bool equalRowTypes(TupleDesc tupdesc1, TupleDesc tupdesc2);
+extern uint32 hashRowType(TupleDesc desc);
extern void TupleDescInitEntry(TupleDesc desc,
AttrNumber attributeNumber,
base-commit: 5fce30e77fe133e1404167c13caaf7e0d1369295
--
2.43.1
Peter Eisentraut <peter@eisentraut.org> writes:
On 06.02.24 16:14, Tom Lane wrote:
+1 for the general idea, but it seems like "row type equality"
might still be a slightly fuzzy concept.
I did another pass across the callers to check what pg_attribute fields
might be relevant.
Collation definitely needs to be added, certainly for plancache.c, maybe
for typcache.c, the other callers don't care.
+1
Record types can have attisdropped fields, so it's probably good to
check those.
Yeah, good idea. (In most cases the attname comparison would catch
that, but we shouldn't rely on it.) In a perfect world maybe a
dropped column should be invisible to this comparison, but we're
a very long way from being able to treat it that way.
I'm suspicious about attndims. Maybe one could create a test case where
record types differ only in that. Support for attndims throughout the
system is weak, but maybe there is something to check there.
There was a discussion last year[1]/messages/by-id/ZD+14YZ4IUue8Rhi@gendo.asyd.net about removing attndims
altogether, which still seems to me like possibly a good idea.
So I doubt we want to consider it as a core semantic field.
On a conceptual level, I figured pg_attribute rows can be divided up
into three categories:
1. "row type" stuff: attname, atttypid, atttypmod, attndims,
attisdropped, attcollation
2. physical layout stuff: attlen, attcacheoff, attbyval, attalign
I recall some discussion about taking attcacheoff out of this data
structure too ...
3. table metadata stuff (everything else)
It's not perfect, and sometimes it's not clear whether these categories
inform the implementation or the other way around, but I think it helps
conceptualize it.
Sure.
regards, tom lane
On Mon, Feb 12, 2024 at 7:47 PM Peter Eisentraut <peter@eisentraut.org> wrote:
In principle, hashRowType() could process all the fields that
equalRowTypes() does. But since it's only a hash function, it doesn't
have to be perfect. (This is also the case for the current
hashTupleDesc().) I'm not sure where the best tradeoff is.
That's where my confusion comes from.
hashRowType is used in record_type_typmod_hash.
record_type_typmod_hash is within assign_record_type_typmod.
in assign_record_type_typmod:
------------------------------------------------
if (RecordCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
ctl.keysize = sizeof(TupleDesc); /* just the pointer */
ctl.entrysize = sizeof(RecordCacheEntry);
ctl.hash = record_type_typmod_hash;
ctl.match = record_type_typmod_compare;
RecordCacheHash = hash_create("Record information cache", 64,
&ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE);
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
}
/*
* Find a hashtable entry for this tuple descriptor. We don't use
* HASH_ENTER yet, because if it's missing, we need to make sure that all
* the allocations succeed before we create the new entry.
*/
recentry = (RecordCacheEntry *) hash_search(RecordCacheHash,
&tupDesc,
HASH_FIND, &found);
------------------------------------------------
based on the comments in hash_create. The above hash_search function
would first use
record_type_typmod_hash to find out candidate entries in a hash table
then use record_type_typmod_compare to compare the given tupDesc with
candidate entries.
Is this how the hash_search in assign_record_type_typmod works?
equalRowTypes processed more fields than hashRowType,
hashRowType comments mentioned equalRowTypes,
maybe we should have some comments in hashRowType explaining why only
hashing natts, tdtypeid, atttypid will be fine.
Hi,
I looked at this patch today. I went through all the calls switched to
equalRowTypes, and AFAIK all of them are correct - all the places
switched to equalRowTypes() only need the weaker checks.
There's only two places still calling equalTupleDescs() - relcache
certainly needs that, and so does the assert in execReplication().
As for attndims, I agree equalRowTypes() should not check that. We're
not really checking that anywhere, it'd be quite weird to start with it
here. Especially if the plan is to remove it entirely.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2/27/24 12:13, jian he wrote:
On Mon, Feb 12, 2024 at 7:47 PM Peter Eisentraut <peter@eisentraut.org> wrote:
In principle, hashRowType() could process all the fields that
equalRowTypes() does. But since it's only a hash function, it doesn't
have to be perfect. (This is also the case for the current
hashTupleDesc().) I'm not sure where the best tradeoff is.That's where my confusion comes from.
hashRowType is used in record_type_typmod_hash.
record_type_typmod_hash is within assign_record_type_typmod.in assign_record_type_typmod:
------------------------------------------------
if (RecordCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
ctl.keysize = sizeof(TupleDesc); /* just the pointer */
ctl.entrysize = sizeof(RecordCacheEntry);
ctl.hash = record_type_typmod_hash;
ctl.match = record_type_typmod_compare;
RecordCacheHash = hash_create("Record information cache", 64,
&ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE);
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
}
/*
* Find a hashtable entry for this tuple descriptor. We don't use
* HASH_ENTER yet, because if it's missing, we need to make sure that all
* the allocations succeed before we create the new entry.
*/
recentry = (RecordCacheEntry *) hash_search(RecordCacheHash,
&tupDesc,
HASH_FIND, &found);
------------------------------------------------
based on the comments in hash_create. The above hash_search function
would first use
record_type_typmod_hash to find out candidate entries in a hash table
then use record_type_typmod_compare to compare the given tupDesc with
candidate entries.Is this how the hash_search in assign_record_type_typmod works?
Yes.
equalRowTypes processed more fields than hashRowType,
hashRowType comments mentioned equalRowTypes,
maybe we should have some comments in hashRowType explaining why only
hashing natts, tdtypeid, atttypid will be fine.
Not sure I understand what the confusion is - omitting fields with
little entropy is not uncommon, and collisions are inherent to hash
tables, and need to be handled anyway.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 13.03.24 19:43, Tomas Vondra wrote:
I looked at this patch today. I went through all the calls switched to
equalRowTypes, and AFAIK all of them are correct - all the places
switched to equalRowTypes() only need the weaker checks.There's only two places still calling equalTupleDescs() - relcache
certainly needs that, and so does the assert in execReplication().As for attndims, I agree equalRowTypes() should not check that. We're
not really checking that anywhere, it'd be quite weird to start with it
here. Especially if the plan is to remove it entirely.
Thanks for checking this again. I have committed the patch as it was
presented then.