type cache cleanup improvements
Hi!
I'd like to suggest two independent patches to improve the performance of type
cache cleanup. I found a case where type cache cleanup was the reason for low
performance. In short, a customer creates 100 thousand temporary tables in one
transaction.
1) mapRelType.patch
It just adds a local map between a relation and its type, as suggested in the
comment above TypeCacheRelCallback(). Unfortunately, using the syscache here was
impossible because this callback can be called outside a transaction, which
makes catalog lookups impossible.
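In essence, the patch keeps a second hashtable keyed by the relation's OID; a
condensed sketch of the relevant pieces (the full code is in the attached
mapRelType.patch):
```c
/* The map from relation's oid to its type oid */
typedef struct mapRelTypeEntry
{
    Oid     typrelid;
    Oid     type_id;
} mapRelTypeEntry;

static HTAB *mapRelType = NULL;

/* In the relcache inval callback: resolve relid -> typcache entry in O(1) */
mapRelTypeEntry *relentry = (mapRelTypeEntry *) hash_search(mapRelType,
                                                            &relid,
                                                            HASH_FIND, NULL);

if (relentry != NULL)
    typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
                                              &relentry->type_id,
                                              HASH_FIND, NULL);
```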
2) hash_seq_init_with_hash_value.patch
TypeCacheTypCallback() loops over the type hash to find the entry with a given
hash value. There are two problems here: 1) dynahash has no interface for
searching an entry by its hash value, and 2) the hash calculation algorithm
differs from the one used by the system cache, even though the incoming
hashvalue comes from the system cache. The patch addresses both issues. It adds
a hash_seq_init_with_hash_value() call, which initializes a sequential scan
over the single bucket that could contain entries with the given hash value, so
that hash_seq_search() iterates only over such entries. The patch also changes
the hash algorithm to match the syscache. In addition, the patch does a small
refactoring of dynahash: it introduces a common function hash_do_lookup()
which performs the initial bucket lookup.
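With both changes, the syscache callback boils down to this (condensed from
the patched TypeCacheTypCallback() in the attached patch):
```c
HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;

/* hashvalue == 0 means a total cache flush, so scan the whole table */
if (hashvalue == 0)
    hash_seq_init(&status, TypeCacheHash);
else
    hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);

while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
    /* invalidate data obtained directly from pg_type */
    typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
                         TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
```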
An artificial performance test is attached; the command to run it is 'time psql
< custom_types_and_array.sql'. Here I show only the last rollback time and the
total execution time:
1) master 92d2ab7554f92b841ea71bcc72eaa8ab11aae662
Time: 33353,288 ms (00:33,353)
psql < custom_types_and_array.sql 0,82s user 0,71s system 1% cpu 1:28,36 total
2) mapRelType.patch
Time: 7455,581 ms (00:07,456)
psql < custom_types_and_array.sql 1,39s user 1,19s system 6% cpu 41,220 total
3) hash_seq_init_with_hash_value.patch
Time: 24975,886 ms (00:24,976)
psql < custom_types_and_array.sql 1,33s user 1,25s system 3% cpu 1:19,77 total
4) both
Time: 89,446 ms
psql < custom_types_and_array.sql 0,72s user 0,52s system 10% cpu 12,137 total
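custom_types_and_array.sql itself is attached rather than inlined; a
hypothetical minimal sketch of the kind of workload it exercises (the real
script, per its name, also involves custom composite types and arrays) could
look like:
```sql
BEGIN;
DO $$
BEGIN
    FOR i IN 1..100000 LOOP
        -- the attached script creates tables using custom types and arrays
        EXECUTE format('CREATE TEMP TABLE t%s (x int, y text)', i);
    END LOOP;
END $$;
ROLLBACK;  -- the "last rollback time" shown above measures this statement
```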
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
hash_seq_init_with_hash_value.patch (text/x-patch)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index f411e33b8e7..49e27ca8ba4 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -331,6 +331,14 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -356,8 +364,16 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * The hash function must be compatible with the system cache's to make
+ * hash_seq_init_with_hash_value() work. The hash value stored in a
+ * TypeCacheEntry is taken from the system cache, and using the same
+ * (compatible) hash function lets us search by hash value instead of
+ * scanning the whole hash table.
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
@@ -408,8 +424,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -2359,20 +2374,21 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5d..0180e096f2d 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Helper function that performs the initial lookup of a bucket for the
+ * given hashvalue.
+ */
+static HASHBUCKET*
+hash_do_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above, but initializes a sequential scan that returns, one by
+ * one, all elements with the given hashvalue.
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_do_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket, because only this bucket can
+ * contain entries with the given hashvalue.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
mapRelType.patch (text/x-patch)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index f411e33b8e7..72c309b84c6 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -78,6 +78,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -359,6 +368,11 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_BLOBS);
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -471,6 +485,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother even if the
+ * type becomes disconnected from the relation; that seems impossible,
+ * but storing stale data is safe anyway: in the worst case we will just
+ * do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2265,6 +2297,37 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount, and free the tupdesc if none remain.
+ * (Can't use DecrTupleDescRefCount because this reference is
+ * not logged in current resource owner.)
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anything watching
+ * that will realize that the tupdesc has possibly changed.
+ * (Alternatively, we could specify that to detect possible
+ * tupdesc change, one must check for tupDesc != NULL as well
+ * as tupDesc_identifier being the same as what was previously
+ * seen. That seems error-prone.)
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2274,63 +2337,42 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We are not able to use the syscache to find the corresponding
+ * type because we could be called outside of a transaction. So we track
+ * a separate map.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /* mapRelType/TypeCacheHash must exist, else this callback wouldn't be registered */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2342,6 +2384,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is 0, so we need to reset all composite types in the cache. We
+ * also need to reset flags for domain types, and since we loop over all
+ * entries in the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
Hi Teodor,
These changes look very promising. Unfortunately the proposed patches
conflict with each other regardless of the order of applying:
```
error: patch failed: src/backend/utils/cache/typcache.c:356
error: src/backend/utils/cache/typcache.c: patch does not apply
```
So it's difficult to confirm case 4, not to mention the fact that we
are unable to test the patches on cfbot.
Could you please rebase the patches against the recent master branch
(in any order) and submit the result of `git format-patch`?
--
Best regards,
Aleksander Alekseev
Hi!
Thank you for your interest in it!
These changes look very promising. Unfortunately the proposed patches
conflict with each other regardless of the order of applying:
```
error: patch failed: src/backend/utils/cache/typcache.c:356
error: src/backend/utils/cache/typcache.c: patch does not apply
```
Try increasing the -F option of patch.
Anyway, a union of both patches is attached.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
mapRelType_AND_hash_seq_init_with_hash_value.patch (text/x-patch)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0d4d0b0a154..ccf798c440c 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -330,6 +339,14 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -355,8 +372,21 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * The hash function must be compatible with the system cache's to make
+ * hash_seq_init_with_hash_value() work. The hash value stored in a
+ * TypeCacheEntry is taken from the system cache, and using the same
+ * (compatible) hash function lets us search by hash value instead of
+ * scanning the whole hash table.
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
@@ -407,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -470,6 +499,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother even if the
+ * type becomes disconnected from the relation; that seems impossible,
+ * but storing stale data is safe anyway: in the worst case we will just
+ * do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2311,37 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount, and free the tupdesc if none remain.
+ * (Can't use DecrTupleDescRefCount because this reference is
+ * not logged in current resource owner.)
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anything watching
+ * that will realize that the tupdesc has possibly changed.
+ * (Alternatively, we could specify that to detect possible
+ * tupdesc change, one must check for tupDesc != NULL as well
+ * as tupDesc_identifier being the same as what was previously
+ * seen. That seems error-prone.)
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,63 +2351,42 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We are not able to use the syscache to find the corresponding
+ * type because we could be called outside of a transaction. So we track
+ * a separate map.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /* mapRelType/TypeCacheHash must exist, else this callback wouldn't be registered */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2341,6 +2398,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is 0, so we need to reset all composite types in the cache. We
+ * also need to reset flags for domain types, and since we loop over all
+ * entries in the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2358,20 +2444,21 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5d..0180e096f2d 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Helper function that performs the initial lookup of a bucket for the
+ * given hashvalue.
+ */
+static HASHBUCKET*
+hash_do_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_do_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above, but initializes a sequential scan that returns, one by
+ * one, all elements with the given hashvalue.
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_do_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket, because only this bucket can
+ * contain entries with the given hashvalue.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
Hi,
Thanks for the quick update.
I tested the patch on an Intel MacBook. A release build was used with
my typical configuration, TWIMC see single-install-meson.sh [1]. The
speedup I got on the provided benchmark is about 150 times. cfbot
seems to be happy with the patch.
I would like to tweak the patch a little bit - change some comments,
add some Asserts, etc. Do you mind?
[1]: https://github.com/afiskon/pgscripts/
--
Best regards,
Aleksander Alekseev
I would like to tweak the patch a little bit - change some comments,
add some Asserts, etc. Do you mind?
You are welcome!
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Hi,
I would like to tweak the patch a little bit - change some comments,
add some Asserts, etc. Do you mind?
You are welcome!
Thanks. PFA the updated patch with some tweaks by me. I added the
commit message as well.
One thing that I couldn't immediately figure out is why a hash value of 0
is treated as a magic invalid value in TypeCacheTypCallback():
```
- hash_seq_init(&status, TypeCacheHash);
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash,
hashvalue);
```
Is there anything that prevents the actual hash value from being zero?
I don't think so, but maybe I missed something.
If zero is indeed an invalid hash value, I would like to reference the
corresponding code. If zero is a valid hash value, we should either
change this by adding something like `if(!hash) hash++` or use an
additional boolean argument here.
--
Best regards,
Aleksander Alekseev
Attachments:
v3-0001-Improve-performance-of-type-cache-cleanup.patch (application/octet-stream)
From 86989b9be129ba561d436b026268e67ec9167993 Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <aleksander@timescale.com>
Date: Tue, 5 Mar 2024 14:13:40 +0300
Subject: [PATCH v3] Improve performance of type cache cleanup
This patch significantly improves performance in cases when a user creates
several thousand temporary tables.
Firstly, the patch adds a local map between a relation and its type as it was
previously suggested in the comments for TypeCacheRelCallback(). Unfortunately,
syscache can't be used here because TypeCacheRelCallback() can be called
outside of a transaction.
Secondly, it modifies TypeCacheTypCallback() so that it finds an entry in the
hash table in O(1). A linear scan was used previously because the hash algorithm
differed from the one used by syscache; additionally, dynahash had no interface
for finding entries with a given hash value.
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1@sigaev.ru
---
src/backend/utils/cache/typcache.c | 211 +++++++++++++++++++++--------
src/backend/utils/hash/dynahash.c | 106 +++++++++------
src/include/utils/hsearch.h | 4 +
3 files changed, 221 insertions(+), 100 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0d4d0b0a15..7525d1cee4 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -330,6 +339,15 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ Assert(keysize == sizeof(Oid));
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -355,8 +373,20 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * A TypeCacheEntry takes its hash value from the system cache. For
+ * TypeCacheHash we use the same hash in order to speed up searches by
+ * hash value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
@@ -407,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -470,6 +499,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother if the type
+ * later becomes disconnected from the relation. Although that seems
+ * impossible, storing stale data is safe in any case: in the worst case
+ * scenario we will just do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2311,33 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching
+ * it will realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,63 +2347,46 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus
+ * we use a dedicated relid->type map, mapRelType.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * mapRelType and TypeCacheHash should exist, otherwise this callback
+ * wouldn't be registered
+ */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2341,6 +2398,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is 0, so we need to reset all composite types in the cache. We
+ * also need to reset flags for domain types; since we loop over all hash
+ * entries anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2357,21 +2443,28 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
+ /*
+ XXX TODO FIXME
+ Why hash == 0 below is a special value?
+ - a.alekseev
+ */
+
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5..3fc8b7ff5e 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Do initial lookup of a bucket by the given hash value.
+ */
+static HASHBUCKET*
+hash_initial_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_initial_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket can
+ * contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6d..c99d74625f 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
--
2.43.2
Aleksander Alekseev <aleksander@timescale.com> writes:
One thing that I couldn't immediately figure out is why 0 hash value
is treated as a magic invalid value in TypeCacheTypCallback():
I've not read this patch, but IIRC in some places we have a convention
that hash value zero is passed for an sinval reset event (that is,
"flush all cache entries").
regards, tom lane
Yep, exactly. One time out of 2^32 we reset the whole cache instead of the one
(or several) entries with hash value = 0.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Hi,
Yep, exactly. One time out of 2^32 we reset the whole cache instead of the one
(or several) entries with hash value = 0.
Got it. Here is an updated patch where I added a corresponding comment.
Now the patch LGTM. I'm going to change its status to RfC unless
anyone wants to review it too.
--
Best regards,
Aleksander Alekseev
Attachments:
v4-0001-Improve-performance-of-type-cache-cleanup.patch (application/octet-stream)
From 0d6171e0ad039af624be31f44eb60c3788e275b7 Mon Sep 17 00:00:00 2001
From: Aleksander Alekseev <aleksander@timescale.com>
Date: Tue, 5 Mar 2024 14:13:40 +0300
Subject: [PATCH v4] Improve performance of type cache cleanup
This patch significantly improves performance in cases when a user creates
several thousand temporary tables.
Firstly, the patch adds a local map between a relation and its type as it was
previously suggested in the comments for TypeCacheRelCallback(). Unfortunately,
syscache can't be used here because TypeCacheRelCallback() can be called
outside of a transaction.
Secondly, it modifies TypeCacheTypCallback() so that it finds an entry in the
hash table in O(1). A linear scan was used previously because the hash algorithm
differed from the one used by syscache; additionally, dynahash had no interface
for finding entries with a given hash value.
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1@sigaev.ru
---
src/backend/utils/cache/typcache.c | 211 +++++++++++++++++++++--------
src/backend/utils/hash/dynahash.c | 106 +++++++++------
src/include/utils/hsearch.h | 4 +
3 files changed, 221 insertions(+), 100 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0d4d0b0a15..9145088f44 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -330,6 +339,15 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ Assert(keysize == sizeof(Oid));
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -355,8 +373,20 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * A TypeCacheEntry takes its hash value from the system cache. For TypeCacheHash
+ * we use the same hash in order to speed up search by hash value. This
+ * is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
@@ -407,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -470,6 +499,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother removing it if
+ * the type later becomes disconnected from the relation. Although that seems
+ * impossible, keeping stale data is safe in any case: in the worst case we
+ * will just do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2311,33 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching
+ * it will realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,63 +2347,46 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * a dedicated relid->type map, mapRelType.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * mapRelType and TypeCacheHash should exist, otherwise this callback
+ * wouldn't be registered
+ */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2341,6 +2398,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid = 0, so we need to reset all composite types in the cache. We also
+ * need to reset flags for domain types; since we loop over all entries in
+ * the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2358,20 +2444,27 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5..3fc8b7ff5e 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Do initial lookup of a bucket by the given hash value.
+ */
+static HASHBUCKET*
+hash_initial_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_initial_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket can
+ * contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6d..c99d74625f 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
--
2.44.0
Got it. Here is an updated patch where I added a corresponding comment.
Thank you!
Playing around, I found one more place which could easily be modified to use a
hash_seq_init_with_hash_value() call.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
attoptcache-v1.patch
diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c
index af978ccd4b1..28980620662 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -44,12 +44,10 @@ typedef struct
/*
* InvalidateAttoptCacheCallback
- * Flush all cache entries when pg_attribute is updated.
+ * Flush the cache entry (or entries) when pg_attribute is updated.
*
* When pg_attribute is updated, we must flush the cache entry at least
- * for that attribute. Currently, we just flush them all. Since attribute
- * options are not currently used in performance-critical paths (such as
- * query execution), this seems OK.
+ * for that attribute.
*/
static void
InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
@@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
HASH_SEQ_STATUS status;
AttoptCacheEntry *attopt;
- hash_seq_init(&status, AttoptCacheHash);
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, AttoptCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue);
+
while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL)
{
if (attopt->opts)
@@ -70,6 +77,17 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
+/*
+ * Hash function compatible with two-arg system cache hash function.
+ */
+static uint32
+relatt_cache_syshash(const void *key, Size keysize)
+{
+ const AttoptCacheKey* ckey = key;
+
+ return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum);
+}
+
/*
* InitializeAttoptCache
* Initialize the attribute options cache.
@@ -82,9 +100,17 @@ InitializeAttoptCache(void)
/* Initialize the hash table. */
ctl.keysize = sizeof(AttoptCacheKey);
ctl.entrysize = sizeof(AttoptCacheEntry);
+
+ /*
+ * AttoptCacheEntry takes hash value from the system cache. For
+ * AttoptCacheHash we use the same hash in order to speed up search by hash
+ * value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = relatt_cache_syshash;
+
AttoptCacheHash =
hash_create("Attopt cache", 256, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ HASH_ELEM | HASH_FUNCTION);
/* Make sure we've initialized CacheMemoryContext. */
if (!CacheMemoryContext)
On Tue, Mar 12, 2024 at 06:55:41PM +0300, Teodor Sigaev wrote:
Playing around, I found one more place which could easily be modified to use a
hash_seq_init_with_hash_value() call.
I think that this patch should be split for clarity, as there are a
few things that are independently useful. I guess something like
that:
- Introduction of hash_initial_lookup(), that simplifies 3 places of
dynahash.c where the same code is used. The routine should be
inlined.
- The split in hash_seq_search to force a different type of search is
weird, complicating the dynahash interface by hiding what seems like a
search mode. Rather than hasHashvalue that's hidden in the middle of
HASH_SEQ_STATUS, could it be better to have an entirely different API
for the search? That should be a patch on its own, as well.
- The typcache changes.
--
Michael
I think that this patch should be split for clarity, as there are a
few things that are independently useful. I guess something like
that:
Done, all patches should be applied sequentially.
- The typcache changes.
01-map_rel_to_type.v5.patch adds a map from a relation to its type.
- Introduction of hash_initial_lookup(), that simplifies 3 places of
dynahash.c where the same code is used. The routine should be
inlined.
- The split in hash_seq_search to force a different type of search is
weird, complicating the dynahash interface by hiding what seems like a
search mode. Rather than hasHashvalue that's hidden in the middle of
HASH_SEQ_STATUS, could it be better to have an entirely different API
for the search? That should be a patch on its own, as well.
02-hash_seq_init_with_hash_value.v5.patch introduces a
hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as
inline, though I suppose modern compilers are smart enough to inline it automatically.
Using a separate interface for scanning the hash by hash value would make the
scan code uglier in cases where a special hash value has to be handled, as is
done in the cache scans. Look, instead of this simplified code:
if (hashvalue == 0)
hash_seq_init(&status, TypeCacheHash);
else
hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
while ((typentry = hash_seq_search(&status)) != NULL) {
...
}
we would need to write something like this:
if (hashvalue == 0)
{
hash_seq_init(&status, TypeCacheHash);
while ((typentry = hash_seq_search(&status)) != NULL) {
...
}
}
else
{
hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
while ((typentry = hash_seq_search_with_hash_value(&status)) != NULL) {
...
}
}
Or perhaps I misunderstood you.
I also thought about integrating the check inside the existing loop in hash_seq_search():
+ rerun:
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
status->curEntry = curElem->link;
if (status->curEntry == NULL) /* end of this bucket */
+ {
+ if (status->hasHashvalue)
+ hash_seq_term(status);
++status->curBucket;
+ }
+ else if (status->hasHashvalue && status->hashvalue !=
+ curElem->hashvalue)
+ goto rerun;
return (void *) ELEMENTKEY(curElem);
}
But to me it looks weird and adds some checks that will take some CPU time.
03-att_with_hash_value.v5.patch makes use of the previous patch.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
03-att_with_hash_value.v5.patch
diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c
index af978ccd4b1..3a18b2e9a77 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -44,12 +44,10 @@ typedef struct
/*
* InvalidateAttoptCacheCallback
- * Flush all cache entries when pg_attribute is updated.
+ * Flush the cache entry (or entries) when pg_attribute is updated.
*
* When pg_attribute is updated, we must flush the cache entry at least
- * for that attribute. Currently, we just flush them all. Since attribute
- * options are not currently used in performance-critical paths (such as
- * query execution), this seems OK.
+ * for that attribute.
*/
static void
InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
@@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
HASH_SEQ_STATUS status;
AttoptCacheEntry *attopt;
- hash_seq_init(&status, AttoptCacheHash);
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, AttoptCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue);
+
while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL)
{
if (attopt->opts)
@@ -70,6 +77,18 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
+/*
+ * Hash function compatible with two-arg system cache hash function.
+ */
+static uint32
+relatt_cache_syshash(const void *key, Size keysize)
+{
+ const AttoptCacheKey* ckey = key;
+
+ Assert(keysize == sizeof(*ckey));
+ return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum);
+}
+
/*
* InitializeAttoptCache
* Initialize the attribute options cache.
@@ -82,9 +101,17 @@ InitializeAttoptCache(void)
/* Initialize the hash table. */
ctl.keysize = sizeof(AttoptCacheKey);
ctl.entrysize = sizeof(AttoptCacheEntry);
+
+ /*
+ * AttoptCacheEntry takes hash value from the system cache. For
+ * AttoptCacheHash we use the same hash in order to speed up search by hash
+ * value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = relatt_cache_syshash;
+
AttoptCacheHash =
hash_create("Attopt cache", 256, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ HASH_ELEM | HASH_FUNCTION);
/* Make sure we've initialized CacheMemoryContext. */
if (!CacheMemoryContext)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 7936c3b46d0..9145088f44d 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -339,6 +339,15 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ Assert(keysize == sizeof(Oid));
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -364,8 +373,15 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * A TypeCacheEntry takes its hash value from the system cache. For TypeCacheHash
+ * we use the same hash in order to speed up search by hash value. This
+ * is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(mapRelTypeEntry);
@@ -421,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -2429,20 +2444,27 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5d..dfea7a904e2 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Do initial lookup of a bucket by the given hash value.
+ */
+static inline HASHBUCKET*
+hash_initial_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_initial_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket can
+ * contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
02-hash_seq_init_with_hash_value.v5.patch
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5d..dfea7a904e2 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -962,6 +962,33 @@ hash_search(HTAB *hashp,
foundPtr);
}
+/*
+ * Do initial lookup of a bucket by the given hash value.
+ */
+static inline HASHBUCKET*
+hash_initial_lookup(HTAB *hashp, uint32 hashvalue, uint32 *bucket)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+
+ /*
+ * Do the initial lookup
+ */
+ *bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = *bucket >> hashp->sshift;
+ segment_ndx = MOD(*bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ return &segp[segment_ndx];
+}
+
void *
hash_search_with_hash_value(HTAB *hashp,
const void *keyPtr,
@@ -973,9 +1000,6 @@ hash_search_with_hash_value(HTAB *hashp,
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1032,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, hashvalue, &bucket);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1173,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1197,7 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, existingElement->hashvalue, &bucket);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1219,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ prevBucketPtr = hash_initial_lookup(hashp, newhashvalue, &newbucket);
currBucket = *prevBucketPtr;
/*
@@ -1423,10 +1412,27 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curEntry = *hash_initial_lookup(hashp, hashvalue, &status->curBucket);
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1440,6 +1446,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket can
+ * contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
01-map_rel_to_type.v5.patch
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0d4d0b0a154..7936c3b46d0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -358,6 +367,11 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_BLOBS);
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -470,6 +484,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother removing it if
+ * the type later becomes disconnected from the relation. Although that seems
+ * impossible, keeping stale data is safe in any case: in the worst case we
+ * will just do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2296,33 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching
+ * it will realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,63 +2332,46 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * a dedicated relid->type map, mapRelType.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * mapRelType and TypeCacheHash should exist, otherwise this callback
+ * wouldn't be registered
+ */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2341,6 +2383,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid = 0, so we need to reset all composite types in the cache. We also
+ * need to reset flags for domain types; since we loop over all entries in
+ * the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
On Wed, Mar 13, 2024 at 04:40:38PM +0300, Teodor Sigaev wrote:
Done, all patches should be applied sequentially.
One thing that first pops out to me is that we can do the refactor of
hash_initial_lookup() as an independent piece, without the extra paths
introduced. But rather than returning the bucket hash and have the
bucket number as an in/out argument of hash_initial_lookup(), there is
an argument for reversing them: hash_search_with_hash_value() does not
care about the bucket number.
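For clarity, the two alternative signatures under discussion look like this
(the attached patch implements the second one):

/* v5 patches: return the bucket pointer; bucket number is an out argument */
static inline HASHBUCKET *hash_initial_lookup(HTAB *hashp, uint32 hashvalue,
                                              uint32 *bucket);

/* reversed: return the bucket number; bucket pointer is an out argument */
static inline uint32 hash_initial_lookup(HTAB *hashp, uint32 hashvalue,
                                         HASHBUCKET **bucketptr);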
02-hash_seq_init_with_hash_value.v5.patch introduces a
hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as
inline, though I suppose modern compilers are smart enough to inline it
automatically.
Likely so, though that does not hurt to show the intention to the
reader.
So I would like to suggest the attached patch for this first piece.
What do you think?
It may also be an idea to use `git format-patch` when generating a
series of patches. That makes for easier reviews.
--
Michael
Attachments:
0001-Refactor-initial-hash-lookup-in-dynahash.c.patch
From 6b1fe126b9f72ff27aca08128948f4e617ba70dd Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Thu, 14 Mar 2024 16:35:40 +0900
Subject: [PATCH] Refactor initial hash lookup in dynahash.c
---
src/backend/utils/hash/dynahash.c | 75 ++++++++++++++-----------------
1 file changed, 33 insertions(+), 42 deletions(-)
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index a4152080b5..e1bd92a01c 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -273,6 +273,8 @@ static void hdefault(HTAB *hashp);
static int choose_nelem_alloc(Size entrysize);
static bool init_htab(HTAB *hashp, long nelem);
static void hash_corrupted(HTAB *hashp);
+static uint32 hash_initial_lookup(HTAB *hashp, uint32 hashvalue,
+ HASHBUCKET **bucketptr);
static long next_pow2_long(long num);
static int next_pow2_int(long num);
static void register_seq_scan(HTAB *hashp);
@@ -972,10 +974,6 @@ hash_search_with_hash_value(HTAB *hashp,
HASHHDR *hctl = hashp->hctl;
int freelist_idx = FREELIST_IDX(hctl, hashvalue);
Size keysize;
- uint32 bucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HashCompareFunc match;
@@ -1008,17 +1006,7 @@ hash_search_with_hash_value(HTAB *hashp,
/*
* Do the initial lookup
*/
- bucket = calc_bucket(hctl, hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ (void) hash_initial_lookup(hashp, hashvalue, &prevBucketPtr);
currBucket = *prevBucketPtr;
/*
@@ -1159,14 +1147,10 @@ hash_update_hash_key(HTAB *hashp,
const void *newKeyPtr)
{
HASHELEMENT *existingElement = ELEMENT_FROM_KEY(existingEntry);
- HASHHDR *hctl = hashp->hctl;
uint32 newhashvalue;
Size keysize;
uint32 bucket;
uint32 newbucket;
- long segment_num;
- long segment_ndx;
- HASHSEGMENT segp;
HASHBUCKET currBucket;
HASHBUCKET *prevBucketPtr;
HASHBUCKET *oldPrevPtr;
@@ -1187,17 +1171,8 @@ hash_update_hash_key(HTAB *hashp,
* this to be able to unlink it from its hash chain, but as a side benefit
* we can verify the validity of the passed existingEntry pointer.
*/
- bucket = calc_bucket(hctl, existingElement->hashvalue);
-
- segment_num = bucket >> hashp->sshift;
- segment_ndx = MOD(bucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ bucket = hash_initial_lookup(hashp, existingElement->hashvalue,
+ &prevBucketPtr);
currBucket = *prevBucketPtr;
while (currBucket != NULL)
@@ -1219,18 +1194,7 @@ hash_update_hash_key(HTAB *hashp,
* chain we want to put the entry into.
*/
newhashvalue = hashp->hash(newKeyPtr, hashp->keysize);
-
- newbucket = calc_bucket(hctl, newhashvalue);
-
- segment_num = newbucket >> hashp->sshift;
- segment_ndx = MOD(newbucket, hashp->ssize);
-
- segp = hashp->dir[segment_num];
-
- if (segp == NULL)
- hash_corrupted(hashp);
-
- prevBucketPtr = &segp[segment_ndx];
+ newbucket = hash_initial_lookup(hashp, newhashvalue, &prevBucketPtr);
currBucket = *prevBucketPtr;
/*
@@ -1741,6 +1705,33 @@ element_alloc(HTAB *hashp, int nelem, int freelist_idx)
return true;
}
+/*
+ * Do initial lookup of a bucket for the given hash value, retrieving its
+ * bucket number and its hash bucket.
+ */
+static inline uint32
+hash_initial_lookup(HTAB *hashp, uint32 hashvalue, HASHBUCKET **bucketptr)
+{
+ HASHHDR *hctl = hashp->hctl;
+ HASHSEGMENT segp;
+ long segment_num;
+ long segment_ndx;
+ uint32 bucket;
+
+ bucket = calc_bucket(hctl, hashvalue);
+
+ segment_num = bucket >> hashp->sshift;
+ segment_ndx = MOD(bucket, hashp->ssize);
+
+ segp = hashp->dir[segment_num];
+
+ if (segp == NULL)
+ hash_corrupted(hashp);
+
+ *bucketptr = &segp[segment_ndx];
+ return bucket;
+}
+
/* complain when we have detected a corrupted hashtable */
static void
hash_corrupted(HTAB *hashp)
--
2.43.0
One thing that first pops out to me is that we can do the refactor of
hash_initial_lookup() as an independent piece, without the extra paths
introduced. But rather than returning the bucket pointer and having the
bucket number as an in/out argument of hash_initial_lookup(), there is
an argument for reversing them: hash_search_with_hash_value() does not
care about the bucket number.
Ok, no problem
02-hash_seq_init_with_hash_value.v5.patch introduces a
hash_seq_init_with_hash_value() method. hash_initial_lookup() is marked as
inline, though I suppose modern compilers are smart enough to inline it
automatically.
Likely so, though that does not hurt to show the intention to the
reader.
Agree
So I would like to suggest the attached patch for this first piece.
What do you think?
I have no objections.
It may also be an idea to use `git format-patch` when generating a
series of patches. That makes for easier reviews.
Thanks, will try
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
On Thu, Mar 14, 2024 at 04:27:43PM +0300, Teodor Sigaev wrote:
So I would like to suggest the attached patch for this first piece.
What do you think?
I have no objections
Okay, I've applied this piece for now. Not sure I'll have much room
to look at the rest.
--
Michael
Okay, I've applied this piece for now. Not sure I'll have much room
to look at the rest.
Thank you very much!
Rest of patches, rebased.
--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/
Attachments:
0003-usage-of-hash_search_with_hash_value.patch
From b30af080d7768c2fdb6198e2e40ef93419f60732 Mon Sep 17 00:00:00 2001
From: Teodor Sigaev <teodor@sigaev.ru>
Date: Fri, 15 Mar 2024 13:55:10 +0300
Subject: [PATCH 3/3] usage of hash_search_with_hash_value
---
src/backend/utils/cache/attoptcache.c | 39 ++++++++++++++++----
src/backend/utils/cache/typcache.c | 52 +++++++++++++++++++--------
2 files changed, 70 insertions(+), 21 deletions(-)
diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c
index af978ccd4b1..3a18b2e9a77 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -44,12 +44,10 @@ typedef struct
/*
* InvalidateAttoptCacheCallback
- * Flush all cache entries when pg_attribute is updated.
+ * Flush the cache entry (or entries) when pg_attribute is updated.
*
* When pg_attribute is updated, we must flush the cache entry at least
- * for that attribute. Currently, we just flush them all. Since attribute
- * options are not currently used in performance-critical paths (such as
- * query execution), this seems OK.
+ * for that attribute.
*/
static void
InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
@@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
HASH_SEQ_STATUS status;
AttoptCacheEntry *attopt;
- hash_seq_init(&status, AttoptCacheHash);
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, AttoptCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue);
+
while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL)
{
if (attopt->opts)
@@ -70,6 +77,18 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
+/*
+ * Hash function compatible with two-arg system cache hash function.
+ */
+static uint32
+relatt_cache_syshash(const void *key, Size keysize)
+{
+ const AttoptCacheKey* ckey = key;
+
+ Assert(keysize == sizeof(*ckey));
+ return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum);
+}
+
/*
* InitializeAttoptCache
* Initialize the attribute options cache.
@@ -82,9 +101,17 @@ InitializeAttoptCache(void)
/* Initialize the hash table. */
ctl.keysize = sizeof(AttoptCacheKey);
ctl.entrysize = sizeof(AttoptCacheEntry);
+
+ /*
+ * AttoptCacheEntry takes hash value from the system cache. For
+ * AttoptCacheHash we use the same hash in order to speed up search by hash
+ * value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = relatt_cache_syshash;
+
AttoptCacheHash =
hash_create("Attopt cache", 256, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ HASH_ELEM | HASH_FUNCTION);
/* Make sure we've initialized CacheMemoryContext. */
if (!CacheMemoryContext)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 7936c3b46d0..9145088f44d 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -339,6 +339,15 @@ static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ Assert(keysize == sizeof(Oid));
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid*)key));
+}
/*
* lookup_type_cache
@@ -364,8 +373,15 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+ /*
+ * A TypeCacheEntry takes its hash value from the system cache. For TypeCacheHash
+ * we use the same hash in order to speed up search by hash value. This
+ * is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(mapRelTypeEntry);
@@ -421,8 +437,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -2429,20 +2444,27 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign
+ * that it's time to invalidate the cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also,
+ * if it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
--
2.43.2
0002-hash_search_with_hash_value.patchtext/x-patch; charset=UTF-8; name=0002-hash_search_with_hash_value.patchDownload
From 917d70a81ad37d366d73a4e3a9ed92212b4698ad Mon Sep 17 00:00:00 2001
From: Teodor Sigaev <teodor@sigaev.ru>
Date: Fri, 15 Mar 2024 13:52:50 +0300
Subject: [PATCH 2/3] hash_search_with_hash_value
---
src/backend/utils/hash/dynahash.c | 38 +++++++++++++++++++++++++++++++
src/include/utils/hsearch.h | 4 ++++
2 files changed, 42 insertions(+)
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 4080833df0f..e981298ea47 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -1387,10 +1387,30 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ HASHBUCKET *bucketPtr;
+
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curBucket = hash_initial_lookup(hashp, hashvalue, &bucketPtr);
+ status->curEntry = *bucketPtr;
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1404,6 +1424,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket can
+ * contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
--
2.43.2
0001-type-cache.patchtext/x-patch; charset=UTF-8; name=0001-type-cache.patchDownload
From 93d3ff32c7c09ec96e7649f831d508c2921b5b5b Mon Sep 17 00:00:00 2001
From: Teodor Sigaev <teodor@sigaev.ru>
Date: Fri, 15 Mar 2024 13:52:34 +0300
Subject: [PATCH 1/3] type cache
---
src/backend/utils/cache/typcache.c | 159 +++++++++++++++++++++--------
1 file changed, 115 insertions(+), 44 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0d4d0b0a154..7936c3b46d0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -358,6 +367,11 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_BLOBS);
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -470,6 +484,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother removing it if
+ * the type later becomes disconnected from the relation. Although that
+ * seems impossible, keeping a stale record is safe in any case: in the
+ * worst case we will just do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2296,33 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching
+ * it will realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,63 +2332,46 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * a dedicated relid->type map, mapRelType.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * mapRelType and TypeCacheHash should exist, otherwise this callback
+ * wouldn't be registered
+ */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ invalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2341,6 +2383,35 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid, so we need to reset all composite types in the
+ * cache. We also need to reset flags for domain types; since we loop
+ * over all entries in the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
--
2.43.2
Rest of patches, rebased.
Hello,
I read and tested only the first patch so far. Creation of temp
tables and rollback in your script work 3-4 times faster with
0001-type-cache.patch on my Windows laptop.
In the patch I found a copy of the comment "If it's domain over
composite, reset flags...". Can we move the reset flags operation
and its comment into the invalidateCompositeTypeCacheEntry()
function? This simplify the TypeCacheRelCallback() func, but
adds two more IF statements when we need to clean up a cache
entry for a specific relation. (diff attached).
--
Roman Zharkov
Attachments:
mapRelType-v2.patchapplication/octet-streamDownload
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index aa4720cb598..0b97eea9136 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,15 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/* The map from relation's oid to its type oid */
+typedef struct mapRelTypeEntry
+{
+ Oid typrelid;
+ Oid type_id;
+} mapRelTypeEntry;
+
+static HTAB *mapRelType = NULL;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -358,6 +367,11 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_BLOBS);
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(mapRelTypeEntry);
+ mapRelType = hash_create("Map reloid to typeoid", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -470,6 +484,24 @@ lookup_type_cache(Oid type_id, int flags)
ReleaseSysCache(tp);
}
+ /*
+ * Add a record to the relation->type map. We don't bother removing it if
+ * the type later becomes disconnected from the relation. Although that
+ * seems impossible, keeping a stale record is safe in any case: in the
+ * worst case we will just do an extra cleanup of a cache entry.
+ */
+ if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry*) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
+
+ relentry->typrelid = typentry->typrelid;
+ relentry->type_id = typentry->type_id;
+ }
+
/*
* Look up opclasses if we haven't already and any dependent info is
* requested.
@@ -2264,6 +2296,47 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+static void
+invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching
+ * it will realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't bother
+ * trying to determine whether the specific base type needs a
+ * reset.) Note that if we haven't determined whether the base
+ * type is composite, we don't need to reset anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2273,72 +2346,63 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * a dedicated relid->type map, mapRelType.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * mapRelType and TypeCacheHash should exist, otherwise this callback
+ * wouldn't be registered
+ */
+
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ mapRelTypeEntry *relentry;
+
+ relentry = (mapRelTypeEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->type_id,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
-
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
+
+ invalidateCompositeTypeCacheEntry(typentry);
}
+ }
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
+ {
+ invalidateCompositeTypeCacheEntry(typentry);
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+ }
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid, so we need to reset all composite types in the
+ * cache. We also need to reset flags for domain types; since we loop
+ * over all entries in the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /*
- * If it's domain over composite, reset flags. (We don't bother
- * trying to determine whether the specific base type needs a
- * reset.) Note that if we haven't determined whether the base
- * type is composite, we don't need to reset anything.
- */
- if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ invalidateCompositeTypeCacheEntry(typentry);
}
}
}
On 3/15/24 17:57, Teodor Sigaev wrote:
Okay, I've applied this piece for now. Not sure I'll have much room
to look at the rest.
Thank you very much!
I have spent some time reviewing this feature. I think we can discuss
and apply it step-by-step. So, the 0001-* patch is at this moment.
The feature addresses the issue of the typcache being bloated by intensive
usage of non-standard types and domains. That bloat adds significant
overhead during relcache invalidation, which thoroughly scans this hash
table. IMO, this feature will be handy soon, as we already see some patches
where the typcache is intensively used for storing composite types—for
example, look into the solutions proposed in [1].
One of my main concerns with this feature is the possibility of lost
entries, which could be mistakenly used by relations with the same oid
in the future. This seems particularly possible in cases with multiple
temporary tables. The author has attempted to address this by replacing
the typrelid and type_id fields in the mapRelType on each call of
lookup_type_cache. However, I believe we could further improve this by
removing the entry from mapRelType on invalidation, thus avoiding this
potential issue.
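To illustrate the idea, here is a minimal, hypothetical sketch (not part of
the posted patches) of what such cleanup could look like inside
TypeCacheRelCallback(), right after the typcache entry has been
invalidated:

    /* Hypothetical: drop the relid => typid map item once the typcache
     * entry has been reset, so a relation that later reuses this oid
     * cannot match a stale mapping. */
    if (relentry != NULL)
    {
        bool        found;

        (void) hash_search(mapRelType, &relid, HASH_REMOVE, &found);
        Assert(found);
    }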
While reviewing the patch, I made some minor changes (see attachment)
that you're free to adopt or reject. However, it's crucial that the
patch includes a detailed explanation, not just a single sentence, to
ensure everyone understands the changes.
Upon closer inspection, I noticed that the current implementation only
invalidates the cache entry. While this is acceptable for standard
types, it may not be sufficient to maintain numerous custom types (as in
the example in the initial letter) or in cases where whole-row vars are
heavily used. In such scenarios, removing the entry and reducing the
hash table's size might be more efficient.
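A rough sketch of that direction (hypothetical, and only safe under the
assumption that no caller still holds a pointer into the entry, which
callers of lookup_type_cache() currently do):

    /* Hypothetical: remove the whole typcache entry instead of merely
     * resetting its flags. Unsafe as-is, because callers keep pointers
     * to TypeCacheEntry across invalidations. */
    (void) hash_search(TypeCacheHash, &typentry->type_id,
                       HASH_REMOVE, NULL);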
In toto, the 0001-* patch looks good, and I would be glad to see it in
the core.
[1]: /messages/by-id/CAKcux6ktu-8tefLWtQuuZBYFaZA83vUzuRd7c1YHC-yEWyYFpg@mail.gmail.com
--
regards,
Andrei Lepikhov
Postgres Professional
Attachments:
0001-minor_improvements.difftext/x-patch; charset=UTF-8; name=0001-minor_improvements.diffDownload
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index e3c32c7848..ed321603d5 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -74,16 +74,17 @@
#include "utils/typcache.h"
-/* The main type cache hashtable searched by lookup_type_cache */
-static HTAB *TypeCacheHash = NULL;
-
/* The map from relation's oid to its type oid */
-typedef struct mapRelTypeEntry
+typedef struct RelTypeMapEntry
{
Oid typrelid;
Oid type_id;
-} mapRelTypeEntry;
+} RelTypeMapEntry;
+
+/* The main type cache hashtable searched by lookup_type_cache */
+static HTAB *TypeCacheHash = NULL;
+/* Utility hash table to speed up processing of relcache invalidation events. */
static HTAB *mapRelType = NULL;
/* List of type cache entries for domain types */
@@ -368,7 +369,7 @@ lookup_type_cache(Oid type_id, int flags)
&ctl, HASH_ELEM | HASH_BLOBS);
ctl.keysize = sizeof(Oid);
- ctl.entrysize = sizeof(mapRelTypeEntry);
+ ctl.entrysize = sizeof(RelTypeMapEntry);
mapRelType = hash_create("Map reloid to typeoid", 64,
&ctl, HASH_ELEM | HASH_BLOBS);
@@ -492,11 +493,11 @@ lookup_type_cache(Oid type_id, int flags)
*/
if (OidIsValid(typentry->typrelid) && typentry->typtype == TYPTYPE_COMPOSITE)
{
- mapRelTypeEntry *relentry;
+ RelTypeMapEntry *relentry;
- relentry = (mapRelTypeEntry*) hash_search(mapRelType,
- &typentry->typrelid,
- HASH_ENTER, NULL);
+ relentry = (RelTypeMapEntry *) hash_search(mapRelType,
+ &typentry->typrelid,
+ HASH_ENTER, NULL);
relentry->typrelid = typentry->typrelid;
relentry->type_id = typentry->type_id;
@@ -2297,7 +2298,7 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
}
static void
-invalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+invalidateTypeCacheEntry(TypeCacheEntry *typentry)
{
/* Delete tupdesc if we have it */
if (typentry->tupDesc != NULL)
@@ -2348,11 +2349,11 @@ TypeCacheRelCallback(Datum arg, Oid relid)
if (OidIsValid(relid))
{
- mapRelTypeEntry *relentry;
+ RelTypeMapEntry *relentry;
- relentry = (mapRelTypeEntry *) hash_search(mapRelType,
- &relid,
- HASH_FIND, NULL);
+ relentry = (RelTypeMapEntry *) hash_search(mapRelType,
+ &relid,
+ HASH_FIND, NULL);
if (relentry != NULL)
{
@@ -2365,7 +2366,7 @@ TypeCacheRelCallback(Datum arg, Oid relid)
Assert(typentry->typtype == TYPTYPE_COMPOSITE);
Assert(relid == typentry->typrelid);
- invalidateCompositeTypeCacheEntry(typentry);
+ invalidateTypeCacheEntry(typentry);
}
}
@@ -2397,7 +2398,7 @@ TypeCacheRelCallback(Datum arg, Oid relid)
{
if (typentry->typtype == TYPTYPE_COMPOSITE)
{
- invalidateCompositeTypeCacheEntry(typentry);
+ invalidateTypeCacheEntry(typentry);
}
else if (typentry->typtype == TYPTYPE_DOMAIN)
{
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index cfa9d5aaea..8f24690306 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2347,6 +2347,7 @@ RelocationBufferInfo
RelptrFreePageBtree
RelptrFreePageManager
RelptrFreePageSpanLeader
+RelTypeMapEntry
RemoteSlot
RenameStmt
ReopenPtrType
Hi!
I've revised the patchset. First of all, I've re-ordered the patches.
0001-0002 (former 0002-0003)
Comprises the hash_search_with_hash_value() function and its application
to avoid full hash iteration in InvalidateAttoptCacheCallback() and
TypeCacheTypCallback(). I think this is a quite straightforward
optimization without negative side effects. I've revised the comments and
the commit message and did some code beautification. I'm going to push
this if there are no objections.
0003 (former 0001)
I've revised this patch. I think the main concern expressed in the
thread about this patch is that we don't have an invalidation mechanism
for the relid => typid map. Finally, due to OID wraparound the same
relids could get reused. That could leave stale entries in the map for
existing relids and typeids. This is rather messy, but I don't think
it could cause a material bug. The map items are used only for
cache invalidation, and an extra invalidation doesn't cause a bug. If a
type with the same relid gets cached, the corresponding map item will be
overwritten, so there is no missing invalidation. However, I see the
following reasons for keeping the relid => typid map in a consistent state.
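As a concrete illustration of the overwrite behavior mentioned above, this
mirrors what lookup_type_cache() does in the earlier mapRelType version of
the patch when it caches a composite type:

    /* Re-caching a type whose relid was reused simply overwrites the map
     * item, so the worst case is an extra, harmless invalidation. */
    relentry = (mapRelTypeEntry *) hash_search(mapRelType,
                                               &typentry->typrelid,
                                               HASH_ENTER, NULL);
    relentry->typrelid = typentry->typrelid;
    relentry->type_id = typentry->type_id;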
1) As the main use-case for this optimization is a flood of temporary
tables, it would be nice not to let the relid => typid map bloat in this
case. I see that TypeCacheHash would still get bloated, because its
entries are never deleted. However, I would prefer not to make this
situation even worse.
2) In the future we may find more use-cases for the relid => typid map
besides cache invalidation. Keeping it in a consistent state could be an
advantage then.
In the attached patch, I'm keeping a relid => typid map item while the
corresponding typentry has either TCFLAGS_HAVE_PG_TYPE_DATA, or
TCFLAGS_OPERATOR_FLAGS, or a tupdesc. Thus, when a temporary table gets
deleted, we invalidate the map item.
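For clarity, that rule boils down to the following condition (a sketch
mirroring check_delete_rel_type_cache() in the attached patch):

    /* Keep the RelIdToTypeIdCacheHash item only while the typentry still
     * has something to clean on invalidation. */
    if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
        (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
        typentry->tupDesc != NULL)
    {
        /* the map item must stay */
    }
    else
    {
        (void) hash_search(RelIdToTypeIdCacheHash, &typentry->typrelid,
                           HASH_REMOVE, NULL);
    }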
It would also be nice to get rid of the iteration over all the cached
domain types in TypeCacheRelCallback(). However, this typically
shouldn't be a problem since domain types are less prone to bloat:
they are created manually, unlike composite types, which are
automatically created for every temporary table. We will probably
need to optimize this in the future, but I don't feel it is necessary
in the present patch.
I think the revised 0003 requires review.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v7-0001-Introduce-hash_search_with_hash_value-function.patchapplication/octet-stream; name=v7-0001-Introduce-hash_search_with_hash_value-function.patchDownload
From 54ead85554d3bb571b74f2ac85aea2cd330ddc2f Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:34:08 +0300
Subject: [PATCH v7 1/3] Introduce hash_search_with_hash_value() function
This new function iterates over hash entries with the given hash value. It
is designed to avoid a full sequential hash search in the syscache
invalidation callbacks.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/hash/dynahash.c | 38 +++++++++++++++++++++++++++++++
src/include/utils/hsearch.h | 4 ++++
2 files changed, 42 insertions(+)
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 145e058fe67..8040416a13c 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -1387,10 +1387,30 @@ hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp)
status->hashp = hashp;
status->curBucket = 0;
status->curEntry = NULL;
+ status->hasHashvalue = false;
if (!hashp->frozen)
register_seq_scan(hashp);
}
+/*
+ * Same as above but scan by the given hash value.
+ * See also hash_seq_search().
+ */
+void
+hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue)
+{
+ HASHBUCKET *bucketPtr;
+
+ hash_seq_init(status, hashp);
+
+ status->hasHashvalue = true;
+ status->hashvalue = hashvalue;
+
+ status->curBucket = hash_initial_lookup(hashp, hashvalue, &bucketPtr);
+ status->curEntry = *bucketPtr;
+}
+
void *
hash_seq_search(HASH_SEQ_STATUS *status)
{
@@ -1404,6 +1424,24 @@ hash_seq_search(HASH_SEQ_STATUS *status)
uint32 curBucket;
HASHELEMENT *curElem;
+ if (status->hasHashvalue)
+ {
+ /*
+ * Scan entries only in the current bucket because only this bucket
+ * can contain entries with the given hash value.
+ */
+ while ((curElem = status->curEntry) != NULL)
+ {
+ status->curEntry = curElem->link;
+ if (status->hashvalue != curElem->hashvalue)
+ continue;
+ return (void *) ELEMENTKEY(curElem);
+ }
+
+ hash_seq_term(status);
+ return NULL;
+ }
+
if ((curElem = status->curEntry) != NULL)
{
/* Continuing scan of curBucket... */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index da26941f6db..c99d74625f7 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -122,6 +122,8 @@ typedef struct
HTAB *hashp;
uint32 curBucket; /* index of current bucket */
HASHELEMENT *curEntry; /* current entry in bucket */
+ bool hasHashvalue; /* true if hashvalue was provided */
+ uint32 hashvalue; /* hashvalue to start seqscan over hash */
} HASH_SEQ_STATUS;
/*
@@ -141,6 +143,8 @@ extern bool hash_update_hash_key(HTAB *hashp, void *existingEntry,
const void *newKeyPtr);
extern long hash_get_num_entries(HTAB *hashp);
extern void hash_seq_init(HASH_SEQ_STATUS *status, HTAB *hashp);
+extern void hash_seq_init_with_hash_value(HASH_SEQ_STATUS *status, HTAB *hashp,
+ uint32 hashvalue);
extern void *hash_seq_search(HASH_SEQ_STATUS *status);
extern void hash_seq_term(HASH_SEQ_STATUS *status);
extern void hash_freeze(HTAB *hashp);
--
2.39.3 (Apple Git-145)
v7-0002-Optimize-InvalidateAttoptCacheCallback-and-TypeCa.patchapplication/octet-stream; name=v7-0002-Optimize-InvalidateAttoptCacheCallback-and-TypeCa.patchDownload
From b7db567611f6ac7f9a6e550f2e7863bbef5bc361 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:44:49 +0300
Subject: [PATCH v7 2/3] Optimize InvalidateAttoptCacheCallback() and
TypeCacheTypCallback()
These callbacks receive hash values as arguments, which doesn't allow
direct lookups for AttoptCacheHash and TypeCacheHash. This is why the
subject callbacks currently use a full iteration over the corresponding
hashes.
This commit avoids the full hash iteration in InvalidateAttoptCacheCallback()
and TypeCacheTypCallback(). First, we switch AttoptCacheHash and
TypeCacheHash to use the same hash function as the syscache. Second, we use
hash_seq_init_with_hash_value() to iterate only over hash entries with a
matching hash value.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/cache/attoptcache.c | 39 ++++++++++++++++---
src/backend/utils/cache/typcache.c | 55 +++++++++++++++++++--------
2 files changed, 73 insertions(+), 21 deletions(-)
diff --git a/src/backend/utils/cache/attoptcache.c b/src/backend/utils/cache/attoptcache.c
index af978ccd4b1..bd2e07080bd 100644
--- a/src/backend/utils/cache/attoptcache.c
+++ b/src/backend/utils/cache/attoptcache.c
@@ -44,12 +44,10 @@ typedef struct
/*
* InvalidateAttoptCacheCallback
- * Flush all cache entries when pg_attribute is updated.
+ * Flush cache entry (or entries) when pg_attribute is updated.
*
* When pg_attribute is updated, we must flush the cache entry at least
- * for that attribute. Currently, we just flush them all. Since attribute
- * options are not currently used in performance-critical paths (such as
- * query execution), this seems OK.
+ * for that attribute.
*/
static void
InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
@@ -57,7 +55,16 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
HASH_SEQ_STATUS status;
AttoptCacheEntry *attopt;
- hash_seq_init(&status, AttoptCacheHash);
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign that
+ * it's time to invalidate the whole cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, AttoptCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, AttoptCacheHash, hashvalue);
+
while ((attopt = (AttoptCacheEntry *) hash_seq_search(&status)) != NULL)
{
if (attopt->opts)
@@ -70,6 +77,18 @@ InvalidateAttoptCacheCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
+/*
+ * Hash function compatible with two-arg system cache hash function.
+ */
+static uint32
+relatt_cache_syshash(const void *key, Size keysize)
+{
+ const AttoptCacheKey *ckey = key;
+
+ Assert(keysize == sizeof(*ckey));
+ return GetSysCacheHashValue2(ATTNUM, ckey->attrelid, ckey->attnum);
+}
+
/*
* InitializeAttoptCache
* Initialize the attribute options cache.
@@ -82,9 +101,17 @@ InitializeAttoptCache(void)
/* Initialize the hash table. */
ctl.keysize = sizeof(AttoptCacheKey);
ctl.entrysize = sizeof(AttoptCacheEntry);
+
+ /*
+ * AttoptCacheEntry takes its hash value from the system cache. For
+ * AttoptCacheHash we use the same hash function in order to speed up
+ * searches by hash value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = relatt_cache_syshash;
+
AttoptCacheHash =
hash_create("Attopt cache", 256, &ctl,
- HASH_ELEM | HASH_BLOBS);
+ HASH_ELEM | HASH_FUNCTION);
/* Make sure we've initialized CacheMemoryContext. */
if (!CacheMemoryContext)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index aa4720cb598..a6d9ce0c513 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -331,6 +331,16 @@ static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+/*
+ * Hash function compatible with one-arg system cache hash function.
+ */
+static uint32
+type_cache_syshash(const void *key, Size keysize)
+{
+ Assert(keysize == sizeof(Oid));
+ return GetSysCacheHashValue1(TYPEOID, ObjectIdGetDatum(*(const Oid *) key));
+}
+
/*
* lookup_type_cache
*
@@ -355,8 +365,16 @@ lookup_type_cache(Oid type_id, int flags)
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
+
+ /*
+ * TypeCacheEntry takes its hash value from the system cache. For
+ * TypeCacheHash we use the same hash function in order to speed up
+ * searches by hash value. This is used by hash_seq_init_with_hash_value().
+ */
+ ctl.hash = type_cache_syshash;
+
TypeCacheHash = hash_create("Type information cache", 64,
- &ctl, HASH_ELEM | HASH_BLOBS);
+ &ctl, HASH_ELEM | HASH_FUNCTION);
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
@@ -407,8 +425,7 @@ lookup_type_cache(Oid type_id, int flags)
/* These fields can never change, by definition */
typentry->type_id = type_id;
- typentry->type_id_hash = GetSysCacheHashValue1(TYPEOID,
- ObjectIdGetDatum(type_id));
+ typentry->type_id_hash = get_hash_value(TypeCacheHash, &type_id);
/* Keep this part in sync with the code below */
typentry->typlen = typtup->typlen;
@@ -2358,20 +2375,28 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TypeCacheEntry *typentry;
/* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
+
+ /*
+ * By convention, a zero hash value is passed to the callback as a sign that
+ * it's time to invalidate the whole cache. See sinval.c, inval.c and
+ * InvalidateSystemCachesExtended().
+ */
+ if (hashvalue == 0)
+ hash_seq_init(&status, TypeCacheHash);
+ else
+ hash_seq_init_with_hash_value(&status, TypeCacheHash, hashvalue);
+
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
- /* Is this the targeted type row (or it's a total cache flush)? */
- if (hashvalue == 0 || typentry->type_id_hash == hashvalue)
- {
- /*
- * Mark the data obtained directly from pg_type as invalid. Also,
- * if it's a domain, typnotnull might've changed, so we'll need to
- * recalculate its constraints.
- */
- typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
- TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
- }
+ Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
+
+ /*
+ * Mark the data obtained directly from pg_type as invalid. Also, if
+ * it's a domain, typnotnull might've changed, so we'll need to
+ * recalculate its constraints.
+ */
+ typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
+ TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
}
}
--
2.39.3 (Apple Git-145)
v7-0003-Avoid-looping-over-all-type-cache-entries-in-Type.patchapplication/octet-stream; name=v7-0003-Avoid-looping-over-all-type-cache-entries-in-Type.patchDownload
From 6a97ad5f4d9822edafe3f6f3aec939d7bdc2ef4b Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:20:24 +0300
Subject: [PATCH v7 3/3] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, and that makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation oid to
its composite type oid.
We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/cache/typcache.c | 295 ++++++++++++++++++++++++-----
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 251 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index a6d9ce0c513..8a93e060ac2 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The map from relation's oid to the corresponding composite type oid. We're
+ * keeping the map entry while the corresponding typentry has either
+ * TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or a tupdesc. That
+ * is, we keep the map entry as long as the typentry has something to clear.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -329,7 +343,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
-
+static void check_insert_rel_type_cache(TypeCacheEntry *typentry);
+static void check_delete_rel_type_cache(TypeCacheEntry *typentry);
/*
* Hash function compatible with one-arg system cache hash function.
@@ -376,6 +391,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to oid of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -387,6 +409,8 @@ lookup_type_cache(Oid type_id, int flags)
CreateCacheMemoryContext();
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -438,6 +462,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ check_insert_rel_type_cache(typentry);
/* If it's a domain, immediately thread it into the domain cache list */
if (typentry->typtype == TYPTYPE_DOMAIN)
@@ -483,6 +508,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ check_insert_rel_type_cache(typentry);
ReleaseSysCache(tp);
}
@@ -2281,6 +2307,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call check_delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ check_delete_rel_type_cache(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2290,63 +2363,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we
+ * use the RelIdToTypeIdCacheHash map to locate the appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't be a
+ * problem since domain types are less prone to bloat. Domain types
+ * are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2358,6 +2423,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid, so we need to reset all composite types in the
+ * cache. We also need to reset flags for domain types; since we loop
+ * over all entries in the hash anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2388,6 +2483,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2397,6 +2494,10 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /* Call check_delete_rel_type_cache() if we cleared the TCFLAGS_HAVE_PG_TYPE_DATA flag. */
+ if (hadPgTypeData)
+ check_delete_rel_type_cache(typentry);
}
}
@@ -2905,3 +3006,107 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert a RelIdToTypeIdCacheHash entry if needed after setting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag.
+ */
+static void
+check_insert_rel_type_cache(TypeCacheEntry *typentry)
+{
+ Assert(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating there should be such an entry already.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ Assert(!found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds, otherwise check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ if ((typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS bits,
+ * or the tupDesc.
+ */
+static void
+check_delete_rel_type_cache(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found);
+ }
+
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds, otherwise check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 75fc05093cc..2ac7199a6ce 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2371,6 +2371,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.3 (Apple Git-145)
The rebased remaining patch is attached.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v8-0001-Avoid-looping-over-all-type-cache-entries-in-Type.patchapplication/octet-stream; name=v8-0001-Avoid-looping-over-all-type-cache-entries-in-Type.patchDownload
From c2be3e3d32cd784841e3ff4355176bd7572220d1 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:20:24 +0300
Subject: [PATCH v8] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, and that makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation oid to
its composite type oid.
We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/cache/typcache.c | 295 ++++++++++++++++++++++++-----
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 251 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0b9e60845b2..472df866512 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The map from relation's oid to the corresponding composite type oid. We're
+ * keeping the map entry when corresponding typentry have either
+ * TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is
+ * we're keeping map entry if the entry has something to clear.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -329,7 +343,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
-
+static void check_insert_rel_type_cache(TypeCacheEntry *typentry);
+static void check_delete_rel_type_cache(TypeCacheEntry *typentry);
/*
* Hash function compatible with one-arg system cache hash function.
@@ -376,6 +391,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to oid of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -387,6 +409,8 @@ lookup_type_cache(Oid type_id, int flags)
CreateCacheMemoryContext();
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -438,6 +462,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ check_insert_rel_type_cache(typentry);
/* If it's a domain, immediately thread it into the domain cache list */
if (typentry->typtype == TYPTYPE_DOMAIN)
@@ -483,6 +508,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ check_insert_rel_type_cache(typentry);
ReleaseSysCache(tp);
}
@@ -2281,6 +2307,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call check_delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ check_delete_rel_type_cache(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2290,63 +2363,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as
+ * the corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically shouldn't be a
+ * problem since domain types are less tended to bloat. Domain types
+ * are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2358,6 +2423,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid = 0, so we need to reset all composite types in cache. Also,
+ * we should reset flags for domain types, and we loop over all
+ * entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2388,6 +2483,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2397,6 +2494,10 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /* Call check_delete_rel_type_cache() if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag. */
+ if (hadPgTypeData)
+ check_delete_rel_type_cache(typentry);
}
}
@@ -2905,3 +3006,107 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry after setting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag if needed.
+ */
+static void
+check_insert_rel_type_cache(TypeCacheEntry *typentry)
+{
+ Assert(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating there should be such an entry already.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ Assert(!found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+}
+
+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc if needed.
+ */
+static void
+check_delete_rel_type_cache(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating entry should be still there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found);
+ }
+
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 547d14b3e7c..09ad8f69223 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2375,6 +2375,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.3 (Apple Git-146)
Hi, Alexander!
I've looked at patch v8.
1.
In function check_insert_rel_type_cache() the block:
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
As I understand, it does HASH_FIND for the same value just inserted by
HASH_ENTER above, under the same if condition:
if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
    typentry->tupDesc == NULL)
Why do we need this re-check after HASH_ENTER? Also I see "otherwise" in
the comment of the quoted block, but the if condition is the same.
2.
For function check_delete_rel_type_cache():
I'd modify the block:
+#ifdef USE_ASSERT_CHECKING
+
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
as:

+ /*
+  * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+  * entry if it should exist.
+  */
+ else
+ {
+ #ifdef USE_ASSERT_CHECKING
+     bool found;
+
+     (void) hash_search(RelIdToTypeIdCacheHash,
+                        &typentry->typrelid,
+                        HASH_FIND, &found);
+     Assert(found);
+ #endif
+ }
3. I think check_delete_rel_type_cache and check_insert_rel_type_cache
would be better renamed to be clearer, though I don't have exact proposals
yet.
4. I haven't looked into comments, though I'd recommend oid -> OID
replacement in the comments.
Thank you for working on this patchset!
Regards,
Pavel Borisov
Supabase
Hi, Pavel!
On Wed, Aug 21, 2024 at 4:28 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
I've looked at patch v8.
1.
In function check_insert_rel_type_cache(), the #ifdef USE_ASSERT_CHECKING
block (quoted in full above) does HASH_FIND for the same value just
inserted by HASH_ENTER, under the same if condition:
if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
    typentry->tupDesc == NULL)
Why do we need this re-check after HASH_ENTER? Also I see "otherwise" in
the comment of the quoted block, but the if condition is the same.
Yep, these are leftovers from one of my previous attempts. There is no
sense in checking with HASH_FIND right after HASH_ENTER. Removed.
2. For function check_delete_rel_type_cache(): I'd modify the #ifdef
USE_ASSERT_CHECKING block into an else branch with the assertion check
inside (quoted in full above).
Changed in the way you proposed, except I put the comment inside the
#ifdef. I think it's easier to understand this way.
3. I think check_delete_rel_type_cache and check_insert_rel_type_cache
would be better renamed to be clearer, though I don't have exact proposals
yet.
Renamed to delete_rel_type_cache_if_needed and
insert_rel_type_cache_if_needed.
4. I haven't looked into comments, though I'd recommend oid -> OID replacement in the comments.
I've changed oid -> OID in the comments and in the commit message.
Thank you for working on this patchset!
Thank you for the review!
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v9-0001-Avoid-looping-over-all-type-cache-entries-in-Type.patch
From dc58a31f59a9c5b7a6752e5633f252a650343e2b Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:20:24 +0300
Subject: [PATCH v9] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map relation OID to
its composite type OID.
We keep the RelIdToTypeIdCacheHash entry while the corresponding type
cache entry has something to clean. Therefore, RelIdToTypeIdCacheHash
shouldn't get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/cache/typcache.c | 273 ++++++++++++++++++++++++-----
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 229 insertions(+), 45 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0b9e60845b2..c883851e66c 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The map from relation's OID to the corresponding composite type OID. We're
+ * keeping the map entry when corresponding typentry have either
+ * TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is
+ * we're keeping map entry if the entry has something to clear.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -329,7 +343,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
-
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
* Hash function compatible with one-arg system cache hash function.
@@ -376,6 +391,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -387,6 +409,8 @@ lookup_type_cache(Oid type_id, int flags)
CreateCacheMemoryContext();
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -438,6 +462,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ insert_rel_type_cache_if_needed(typentry);
/* If it's a domain, immediately thread it into the domain cache list */
if (typentry->typtype == TYPTYPE_DOMAIN)
@@ -483,6 +508,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ insert_rel_type_cache_if_needed(typentry);
ReleaseSysCache(tp);
}
@@ -2281,6 +2307,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2290,63 +2363,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as
+ * the corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically shouldn't be a
+ * problem since domain types are less tended to bloat. Domain types
+ * are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2358,6 +2423,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid = 0, so we need to reset all composite types in cache. Also,
+ * we should reset flags for domain types, and we loop over all
+ * entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2388,6 +2483,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2397,6 +2494,10 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /* Call delete_rel_type_cache_if_needed() if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag. */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2905,3 +3006,85 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry after setting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ Assert(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating there should be such an entry already.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ Assert(!found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc if needed.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating entry should be still there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+#endif
+ }
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 6d424c89186..0cdb33801da 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2376,6 +2376,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.3 (Apple Git-146)
Hi, Alexander!
Looked at v9:
Patch looks good to me. I'd only suggest some comment changes:
"The map from relation's OID to the corresponding composite type OID" ->
"The mapping of relation's OID to the corresponding composite type OID"
"We're keeping the map entry when corresponding typentry have either
TCFLAGS_HAVE_PG_TYPE_DATA, or TCFLAGS_OPERATOR_FLAGS, or tupdesc. That is
we're keeping map entry if the entry has something to clear." -> "We're
keeping the map entry when the corresponding typentry has something to
clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
TCFLAGS_OPERATOR_FLAGS, or tupdesc."
"Invalidate particular TypeCacheEntry on Relcache inval callback" - remove
extra tabs before. Maybe also add empty line above.
"Typically shouldn't be a problem" -> "Typically this shouldn't affect
performance"
"Relid = 0, so we need" -> "Relid is invalid. By convention we need"
"if cleaned TCFLAGS_HAVE_PG_TYPE_DATA flag" -> "if we cleaned
TCFLAGS_HAVE_PG_TYPE_DATA flag previously"
"+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc if needed." - remove one "if needed"
Regards,
Pavel Borisov
Supabase
On 21/8/2024 17:28, Alexander Korotkov wrote:
I've changed oid -> OID in the comments and in the commit message.
I went through the patch again: no objections, and +1 to the comment
changes proposed by Pavel.
--
regards, Andrei Lepikhov
Hi!
On Thu, Aug 22, 2024 at 1:02 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Looked at v9:
Patch looks good to me. I'd only suggest some comment changes.
Thank you for your feedback. I've integrated all your edits except
the formatting change to the InvalidateCompositeTypeCacheEntry() header
comment. I think the functions below have the same header comment
formatting, so it's not necessary to change it.
If no objections, I'm planning to push this after reverting PARTITION
SPLIT/MERGE.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v10-0001-Avoid-looping-over-all-type-cache-entries-in-Typ.patch
From d5ecbae3588f4c4ecec02ab3fd9251553d1cf0eb Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 5 Aug 2024 00:20:24 +0300
Subject: [PATCH v10] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map relation OID to
its composite type OID.
We keep the RelIdToTypeIdCacheHash entry while the corresponding type
cache entry has something to clean. Therefore, RelIdToTypeIdCacheHash
shouldn't get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov
---
src/backend/utils/cache/typcache.c | 275 ++++++++++++++++++++++++-----
src/tools/pgindent/typedefs.list | 1 +
2 files changed, 232 insertions(+), 44 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 0b9e60845b2..494e5a41d79 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of relation's OID to the corresponding composite type OID.
+ * We're keeping the map entry when the corresponding typentry has something
+ * to clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -329,6 +343,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -376,6 +392,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -387,6 +410,8 @@ lookup_type_cache(Oid type_id, int flags)
CreateCacheMemoryContext();
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -438,6 +463,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ insert_rel_type_cache_if_needed(typentry);
/* If it's a domain, immediately thread it into the domain cache list */
if (typentry->typtype == TYPTYPE_DOMAIN)
@@ -483,6 +509,7 @@ lookup_type_cache(Oid type_id, int flags)
typentry->typelem = typtup->typelem;
typentry->typcollation = typtup->typcollation;
typentry->flags |= TCFLAGS_HAVE_PG_TYPE_DATA;
+ insert_rel_type_cache_if_needed(typentry);
ReleaseSysCache(tp);
}
@@ -2281,6 +2308,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2290,63 +2364,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as
+ * the corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less tended to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2358,6 +2424,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in cache. Also, we should reset flags for domain types, and
+ * we loop over all entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2388,6 +2484,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2397,6 +2495,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2905,3 +3010,85 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry after setting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ Assert(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating there should be such an entry already.
+ */
+ if (!(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ Assert(!found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating entry should be still there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+#endif
+ }
+}
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3f3a8f2634b..91203a7225f 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2377,6 +2377,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.3 (Apple Git-146)
Hello Alexander,
22.08.2024 19:52, Alexander Korotkov wrote:
If no objections, I'm planning to push this after reverting PARTITION
SPLIT/MERGE.
Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build.
trilobite failed it:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=trilobite&dt=2024-08-25%2005%3A22%3A07
and I'm observing the same locally:
...
#5 0x00005636d37555f8 in ExceptionalCondition (conditionName=0x5636d39b1940 "found",
fileName=0x5636d39b1308 "typcache.c", lineNumber=3077) at assert.c:66
#6 0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
#7 0x00005636d3754063 in InvalidateCompositeTypeCacheEntry (typentry=0x5636d41d5d10) at typcache.c:2355
#8 0x00005636d37541d3 in TypeCacheRelCallback (arg=0, relid=0) at typcache.c:2441
...
(gdb) f 6
#6 0x00005636d37554a4 in delete_rel_type_cache_if_needed (typentry=0x5636d41d5d10) at typcache.c:3077
3077 Assert(found);
(gdb) p found
$1 = false
(This Assert is introduced by c14d4acb8.)
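For reference, one way to get such a build (a sketch, assuming an in-tree autoconf build; defining CLOBBER_CACHE_ALWAYS is the compile-time equivalent of running with debug_discard_caches = 1):

./configure CPPFLAGS='-DCLOBBER_CACHE_ALWAYS' --enable-cassert --enable-debug
make && make check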
Best regards,
Alexander
On Sun, Aug 25, 2024 at 10:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
Please try to perform `make check` against a CLOBBER_CACHE_ALWAYS build.
(This Assert is introduced by c14d4acb8.)
Thank you for noticing. I'm checking this.
------
Regards,
Alexander Korotkov
Supabase
On Sun, Aug 25, 2024 at 10:21 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
Thank you for noticing. I'm checking this.
I didn't take into account that a TypeCacheEntry could be invalidated
while lookup_type_cache() does syscache lookups. When I realized that,
I was curious how it currently works. It appears that type cache
invalidation mostly only clears the flags, while the values remain in
place and are still available to the lookup_type_cache() caller.
TypeCacheEntry.tupDesc is invalidated directly, and it is guaranteed to
survive only because we don't do any syscache lookups for composite
data types later in lookup_type_cache(). I'm becoming less of a fan of
how this works... I think these aspects need to be documented in detail
at the very least.
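To make this concrete, here is a minimal standalone C model of that
behavior (names and values are illustrative only, not the actual
typcache structures): invalidation clears just the validity flag, and a
caller already holding a pointer to the entry still reads the old value.
/* model: invalidation clears the flag, the value stays in place */
#include <stdio.h>
#define FLAG_CHECKED_EQ_OPR 0x01    /* "this field has been validated" */
typedef struct ModelTypEntry
{
    unsigned flags;                 /* validity flags */
    int      eq_opr;                /* survives invalidation */
} ModelTypEntry;
static void
model_invalidate(ModelTypEntry *e)
{
    /* mimic the inval callback: clear the flag, keep the value */
    e->flags &= ~FLAG_CHECKED_EQ_OPR;
}
int
main(void)
{
    ModelTypEntry e = {FLAG_CHECKED_EQ_OPR, 96};   /* 96: int4 "=", say */
    model_invalidate(&e);
    /* the value is still readable; only the validity flag is gone */
    printf("flags=0x%x eq_opr=%d\n", e.flags, e.eq_opr);
    return 0;
}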
Regarding c14d4acb8, it appears to require redesign. I'm going to revert it.
------
Regards,
Alexander Korotkov
Supabase
On 25/8/2024 23:22, Alexander Korotkov wrote:
I didn't take into account that a TypeCacheEntry could be invalidated
while lookup_type_cache() does syscache lookups. [...] I think these
aspects need to be documented in detail at the very least.
Regarding c14d4acb8, it appears to require redesign. I'm going to revert it.
Sorry, but I don't understand your point.
Let's refocus on the problem at hand. The issue arose when
TypeCacheTypCallback and TypeCacheRelCallback were executed in
sequence within InvalidateSystemCachesExtended.
The first callback cleared the flags TCFLAGS_HAVE_PG_TYPE_DATA and
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS. The second callback then
checked typentry->tupDesc and, because it wasn't NULL, attempted to
remove this record a second time.
I think there is no case for redesign; we just have a mess in the
insertion/deletion conditions.
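A rough standalone sketch of the hazard (not the actual typcache code):
once two consecutive callbacks can each decide that the mapping entry
has to go, an unconditional Assert(found) on the second removal fires.
#include <stdbool.h>
#include <stdio.h>
static bool map_entry_present = true;   /* stands in for a RelIdToTypeIdCacheHash entry */
static bool
remove_map_entry(void)
{
    bool found = map_entry_present;
    map_entry_present = false;
    return found;
}
int
main(void)
{
    /* first callback: type flags cleared, mapping entry dropped */
    printf("first removal:  found=%d\n", remove_map_entry());
    /*
     * second callback: tupDesc still looks set, so it also decides the
     * entry must be removed -- but the entry is already gone, and an
     * unconditional Assert(found) would fire here
     */
    printf("second removal: found=%d\n", remove_map_entry());
    return 0;
}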
--
regards, Andrei Lepikhov
On Mon, Aug 26, 2024 at 9:37 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
The issue arose when TypeCacheTypCallback and TypeCacheRelCallback were
executed in sequence within InvalidateSystemCachesExtended. [...]
I think there is no case for redesign; we just have a mess in the
insertion/deletion conditions.
Yes, it's possible to repair the current approach. But we need to do
this correctly, not just make it "not fail with current usages". Then we
would need to call insert_rel_type_cache_if_needed() not just when we
set the TCFLAGS_HAVE_PG_TYPE_DATA flag, but every time we set any of
the TCFLAGS_OPERATOR_FLAGS or tupDesc. That's a lot of places, not as
simple and elegant as planned. This is why I wonder if there is a
better approach.
Secondly, I'm not terribly happy with the current state of the type cache.
The caller of lookup_type_cache() might get already-invalidated data.
This is probably OK, because the caller probably holds locks on dependent
objects that guarantee that the relevant properties of the type actually
persist. At the very least this should be documented, but it doesn't
seem to be. The setting of tupdesc is sensitive to its order of execution.
That feels quite fragile to me, and it is not documented either. I think
this area needs improvements before we push additional functionality
there.
------
Regards,
Alexander Korotkov
Supabase
On Mon, Aug 26, 2024 at 11:26 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
Secondly, I'm not terribly happy with the current state of the type cache.
[...] I think this area needs improvements before we push additional
functionality there.
I see that fdd965d074 added proper handling of concurrent invalidation
for the relation cache. If a concurrent invalidation occurs, we retry
building the relation descriptor. Thus, we end up returning a valid
relation descriptor to the caller. I wonder if we can take the same
approach for the type cache. That would make the whole type cache more
consistent and less fragile. It would also make this patch simpler.
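For reference, a standalone sketch of that retry pattern, under the
assumption that some invalidation counter is bumped whenever messages
are delivered (all names here are made up for illustration):
/* retry-on-invalidation: rebuild if the counter moved mid-build */
#include <stdio.h>
static unsigned inval_counter = 0;  /* bumped when invalidations arrive */
typedef struct ModelEntry
{
    int built_on_attempt;           /* payload computed during the build */
} ModelEntry;
static void
build_entry(ModelEntry *e, int attempt)
{
    e->built_on_attempt = attempt;
    /* catalog lookups during the build may deliver invalidations ... */
    if (attempt == 0)
        inval_counter++;            /* simulate one concurrent invalidation */
}
int
main(void)
{
    ModelEntry  e;
    int         attempt = 0;
    for (;;)
    {
        unsigned    before = inval_counter;
        build_entry(&e, attempt++);
        if (inval_counter == before)
            break;                  /* nothing arrived mid-build: consistent */
        /* otherwise loop and rebuild from scratch */
    }
    printf("entry built on attempt %d\n", e.built_on_attempt);
    return 0;
}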
------
Regards,
Alexander Korotkov
Supabase
On 29/8/2024 11:01, Alexander Korotkov wrote:
I see that fdd965d074 added proper handling of concurrent invalidation
for the relation cache. If a concurrent invalidation occurs, we retry
building the relation descriptor. [...] I wonder if we can take the same
approach for the type cache.
I think I understand the solution from commit fdd965d074.
Just for the record, you mentioned invalidation inside
lookup_type_cache above. Going through the code, I found only one
place for such a case - the call of GetDefaultOpClass, which
triggers the opening of the relation pg_opclass and can thereby cause an
AcceptInvalidationMessages call. Did you mean this case, or does a wider
range of cases exist here?
--
regards, Andrei Lepikhov
On Sat, Aug 31, 2024 at 10:33 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
I think I understand the solution from commit fdd965d074.
Just for the record, you mentioned invalidation inside
lookup_type_cache above. Going through the code, I found only one
place for such a case - the call of GetDefaultOpClass, which
triggers the opening of the relation pg_opclass and can thereby cause an
AcceptInvalidationMessages call. Did you mean this case, or does a wider
range of cases exist here?
I've tried to implement handling of concurrent invalidation similar to
commit fdd965d074. However, that appears to be more difficult than I
thought, because for some datatypes like arrays, ranges, etc., we might
need to fill the element type and reference it. So, I decided to
continue with the current approach while borrowing some ideas from
fdd965d074. The revised patchset is attached.
0001 - adds a comment about concurrent invalidation handling
0002 - revised c14d4acb8. Now we track the type OIDs whose
TypeCacheEntry filling is in progress. The entry is added to
RelIdToTypeIdCacheHash at the end of lookup_type_cache() or on
transaction abort. During invalidation we don't assert the
RelIdToTypeIdCacheHash entry to be present if the TypeCacheEntry is
in progress.
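The intended life cycle of that in-progress list can be sketched as a
standalone model (illustrative names; in the patch the real code lives
in lookup_type_cache() and the AtEOXact_TypeCache()/AtEOSubXact_TypeCache()
hooks):
/* push before filling, pop on success, sweep leftovers at EOXact */
#include <stdio.h>
#define MAXPROG 8
static unsigned in_progress[MAXPROG];
static int in_progress_len = 0;
static void
finalize(unsigned typeoid)
{
    /* stands in for insert_rel_type_cache_if_needed() */
    printf("finalize type %u\n", typeoid);
}
static void
lookup(unsigned typeoid, int fail_midway)
{
    int off = in_progress_len++;    /* push before filling the entry */
    in_progress[off] = typeoid;
    if (fail_midway)
        return;                     /* simulate an error before the pop */
    in_progress_len--;              /* normal exit: pop our own slot */
    finalize(typeoid);
}
static void
at_eoxact(void)
{
    int i;
    /* abort path: finalize whatever lookups never completed */
    for (i = 0; i < in_progress_len; i++)
        finalize(in_progress[i]);
    in_progress_len = 0;
}
int
main(void)
{
    lookup(16385, 0);   /* completes normally, finalized inline */
    lookup(16390, 1);   /* "errors out" mid-fill */
    at_eoxact();        /* end-of-transaction cleanup catches 16390 */
    return 0;
}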
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v11-0001-Update-header-comment-for-lookup_type_cache.patch (application/x-patch)
From 0ed4be04b42acab74efc59ea35772fd51f1b954c Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v11 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in the last for
+ * TYPTYPE_COMPOSITE. So, it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.3 (Apple Git-146)
v11-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch (application/x-patch)
From 9735f1266f3dad2955758a28814c4b8a593d834b Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v11 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
appropriate typentry to invalidate. Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction
and this makes impossible catalog lookups. This is why present commit
introduces RelIdToTypeIdCacheHash to map relation OID to its composite type
OID.
We are keeping RelIdToTypeIdCacheHash entry while corresponding type cache
entry have something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloat in the case of temporary tables flood.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 350 +++++++++++++++++++++++++----
src/include/utils/typcache.h | 4 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 321 insertions(+), 44 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..dcbf50e865b 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of relation's OID to the corresponding composite type OID.
+ * We're keeping the map entry when the corresponding typentry has something
+ * to clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +222,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +347,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +386,22 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
+
+ /*
+ * reserve enough in_progress_list slots for many cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +416,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -396,6 +434,21 @@ lookup_type_cache(Oid type_id, int flags)
CreateCacheMemoryContext();
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +949,11 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2348,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call check_delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2404,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find an RelIdToTypeIdCacheHash entry, which should exist as soon as
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less tended to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2464,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in cache. Also, we should reset flags for domain types, and
+ * we loop over all entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2524,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2535,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call check_delete_rel_type_cache() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3050,129 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry have any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating entry should be still there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+static void
+cleanup_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e9ebddde24d..38b60a5944b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.3 (Apple Git-146)
On 13/9/2024 01:38, Alexander Korotkov wrote:
I've tried to implement handling of concurrent invalidation similar to
commit fdd965d074. However, that appears to be more difficult than I
thought, because for some datatypes like arrays, ranges, etc., we might
need to fill the element type and reference it. So, I decided to
continue with the current approach while borrowing some ideas from
fdd965d074. The revised patchset is attached.
Let me rephrase the issue in more straightforward terms to ensure we are
all clear on the problem:
The critical problem of the typcache lookup on not-yet-locked data is
that it can lead to an inconsistent state of the TypeCacheEntry,
potentially causing disruptions in the DBMS's operations, correct?
Let's exemplify this statement. By filling a typentry's lt_opr, eq_opr,
and gt_opr fields, we access the AMOPSTRATEGY cache. One operation can
successfully fetch data from the cache, but another can miss and touch
the catalogue table, causing invalidations. In this case, we can end up
with an inconsistent set of operators. Do I understand the problem
statement correctly?
If this view is correct, your derived approach should work fine if all
the necessary callbacks are registered. I see that at least AMOPSTRATEGY
and PROCOID were missing at the moment of the typcache initialization.
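A standalone model of that interleaving (made-up values; the only point
is that the three fields can end up reflecting two catalog states):
#include <stdio.h>
static int catalog_version = 1;     /* stands in for pg_amop contents */
static int
fetch_operator(const char *which)
{
    /* pretend the looked-up operator OID depends on catalog contents */
    printf("fetch %s_opr at catalog version %d\n", which, catalog_version);
    return catalog_version * 100;
}
int
main(void)
{
    int lt = fetch_operator("lt");
    /* a cache miss between the fetches delivers invalidations */
    catalog_version = 2;
    int eq = fetch_operator("eq");
    int gt = fetch_operator("gt");
    printf("lt=%d eq=%d gt=%d%s\n", lt, eq, gt,
           (lt != eq) ? "  <- operators from two catalog states" : "");
    return 0;
}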
--
regards, Andrei Lepikhov
On Wed, Sep 18, 2024 at 5:10 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
Let's exemplify this statement. By filling a typentry's lt_opr, eq_opr,
and gt_opr fields, we access the AMOPSTRATEGY cache. One operation can
successfully fetch data from the cache, but another can miss and touch
the catalogue table, causing invalidations. In this case, we can end up
with an inconsistent set of operators. Do I understand the problem
statement correctly?
Actually, I didn't research much whether there is a material problem.
So, I didn't try to delete some operator class members concurrently
with lookup_type_cache(). There are probably some bugs, but they likely
have low impact in practice, given that type/opclass changes are very
rare.
Instead, I concentrated on why lookup_type_cache() returns a
TypeCacheEntry filled with whatever the caller asked for, given that
there could be concurrent invalidations.
So, my approach was to
1) Document how we currently handle concurrent invalidations.
2) Maintain RelIdToTypeIdCacheHash correctly with concurrent invalidations.
------
Regards,
Alexander Korotkov
Supabase
Hi all,
On Fri, 13 Sept 2024 at 01:38, Alexander Korotkov <aekorotkov@gmail.com> wrote:
0001 - adds a comment about concurrent invalidation handling
0002 - revised c14d4acb8. Now we track the type OIDs whose
TypeCacheEntry filling is in progress. The entry is added to
RelIdToTypeIdCacheHash at the end of lookup_type_cache() or on
transaction abort. During invalidation we don't assert the
RelIdToTypeIdCacheHash entry to be present if the TypeCacheEntry is
in progress.
Thank you Alexander for the patch. I reviewed and tested it.
I used Teodor's script to check the performance. On my laptop on
master ROLLBACK runs 11496.219 ms. With the patch ROLLBACK runs
378.990 ms.
It seems to me that there are a couple of possible issues in the patch:
In `lookup_type_cache()` `in_progress_list` is allocated using
`CacheMemoryContext`; on the other hand, it seems there might be a case
where `CacheMemoryContext` has not been created yet. It is created below
in the code if it doesn't exist:
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
It is probably a very rare case, but it might be better to allocate
`in_progress_list` after that line, or move creation of
`CacheMemoryContext` higher.
Within `insert_rel_type_cache_if_needed()` and
`delete_rel_type_cache_if_needed()` there is an if condition:
if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
(typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
typentry->tupDesc != NULL)
Based on the logic of the rest of the code, does it make sense to use
TCFLAGS_DOMAIN_BASE_IS_COMPOSITE instead of TCFLAGS_OPERATOR_FLAGS?
Otherwise the condition doesn't look logical.
--
Kind regards,
Artur
Hi, Arthur!
Thank you so much for your review!
On Thu, Oct 10, 2024 at 6:54 PM Artur Zakirov <zaartur@gmail.com> wrote:
Thank you Alexander for the patch. I reviewed and tested it.
I used Teodor's script to check the performance. On my laptop on
master ROLLBACK runs 11496.219 ms. With the patch ROLLBACK runs
378.990 ms.
It seems to me that there are a couple of possible issues in the patch:
In `lookup_type_cache()` `in_progress_list` is allocated using
`CacheMemoryContext`; on the other hand, it seems there might be a case
where `CacheMemoryContext` has not been created yet. It is created below
in the code if it doesn't exist:
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
It is probably a very rare case, but it might be better to allocate
`in_progress_list` after that line, or move the creation of
`CacheMemoryContext` higher.
Yes, it makes sense to initialize `in_progress_list` when
`CacheMemoryContext` is guaranteed to be initialized. Fixed in the
attached patch.
Within `insert_rel_type_cache_if_needed()` and
`delete_rel_type_cache_if_needed()` there is an if condition:
if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
(typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
typentry->tupDesc != NULL)
Based on the logic of the rest of the code, does it make sense to use
TCFLAGS_DOMAIN_BASE_IS_COMPOSITE instead of TCFLAGS_OPERATOR_FLAGS?
Otherwise the condition doesn't look logical.
I'm not sure I get the point. This check ensures that the type entry has
something to be cleared. In this case we need to keep the
RelIdToTypeIdCacheHash entry so that we can find this item on an
invalidation message. I'm not sure how TCFLAGS_DOMAIN_BASE_IS_COMPOSITE
is relevant here, because it's valid only for TYPTYPE_DOMAIN while this
patch deals with TYPTYPE_COMPOSITE.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v12-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch (application/octet-stream)
From ace04f1edd1f71300175738dd60f85570b1c646f Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v12 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
appropriate typentry to invalidate. Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction
and this makes impossible catalog lookups. This is why present commit
introduces RelIdToTypeIdCacheHash to map relation OID to its composite type
OID.
We are keeping RelIdToTypeIdCacheHash entry while corresponding type cache
entry have something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloat in the case of temporary tables flood.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 350 +++++++++++++++++++++++++----
src/include/utils/typcache.h | 4 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 321 insertions(+), 44 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..fe37ff442b9 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of relation's OID to the corresponding composite type OID.
+ * We're keeping the map entry when the corresponding typentry has something
+ * to clear i.e it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +222,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +347,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +386,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +407,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +423,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * reserve enough in_progress_list slots for many cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +949,11 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2348,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call check_delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2404,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find an RelIdToTypeIdCacheHash entry, which should exist as soon as
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less tended to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2464,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in cache. Also, we should reset flags for domain types, and
+ * we loop over all entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2524,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2535,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call check_delete_rel_type_cache() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3050,129 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry have any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete entry RelIdToTypeIdCacheHash if needed after resetting of the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, or any of TCFLAGS_OPERATOR_FLAGS flags,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating entry should be still there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * In assert-enabled builds otherwise check for RelIdToTypeIdCacheHash
+ * entry if it should exist.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+static void
+cleanup_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a65e1c07c5d..80c25099e67 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
v12-0001-Update-header-comment-for-lookup_type_cache.patch (application/octet-stream)
From 92bbee11fa999ce9ad89349ab174f9e48ba2c753 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v12 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in last for
+ * TYPTYPE_COMPOSITE, so it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
On Sun, Oct 13, 2024 at 8:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Hi, Alexander.
I don't fully understand all of it, but I did some tests anyway.
static void
cleanup_in_progress_typentries(void)
{
int i;
if (in_progress_list_len > 1)
elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
for (i = 0; i < in_progress_list_len; i++)
{
TypeCacheEntry *typentry;
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&in_progress_list[i],
HASH_FIND, NULL);
insert_rel_type_cache_if_needed(typentry);
}
in_progress_list_len = 0;
}
The regression tests still passed.
I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
So we lack tests for larger in_progress_list_len values or i missed something?
/* Call check_delete_rel_type_cache() if we actually cleared something */
if (hadTupDescOrOpclass)
delete_rel_type_cache_if_needed(typentry);
/*
* Call check_delete_rel_type_cache() if we cleaned
* TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
*/
if (hadPgTypeData)
delete_rel_type_cache_if_needed(typentry);
check_delete_rel_type_cache() doesn't exist, so these comments are wrong?
Hi, Jian!
Thank you for your review.
On Tue, Oct 15, 2024 at 10:34 AM jian he <jian.universality@gmail.com> wrote:
I don't fully understand all of it, but I did some tests anyway.
static void
cleanup_in_progress_typentries(void)
{
int i;
if (in_progress_list_len > 1)
elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
for (i = 0; i < in_progress_list_len; i++)
{
TypeCacheEntry *typentry;
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&in_progress_list[i],
HASH_FIND, NULL);
insert_rel_type_cache_if_needed(typentry);
}
in_progress_list_len = 0;
}
The regression tests still passed.
I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
So we lack tests for larger in_progress_list_len values or i missed something?
Try to run the test suite with -DCLOBBER_CACHE_ALWAYS.
/* Call check_delete_rel_type_cache() if we actually cleared something */
if (hadTupDescOrOpclass)
delete_rel_type_cache_if_needed(typentry);
/*
* Call check_delete_rel_type_cache() if we cleaned
* TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
*/
if (hadPgTypeData)
delete_rel_type_cache_if_needed(typentry);
check_delete_rel_type_cache() doesn't exist, so these comments are wrong?
Yep, they didn't get updated. Fixed in the attached patchset.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v13-0001-Update-header-comment-for-lookup_type_cache.patch (application/octet-stream)
From 8e736ebc3f69fec323351bd466d178309b734e27 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v13 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in last for
+ * TYPTYPE_COMPOSITE, so it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v13-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch (application/octet-stream)
From d2fe600b042ea6b21f4a2460b4754b47b3775e8e Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v13 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction,
and this makes catalog lookups impossible. This is why the present commit
introduces RelIdToTypeIdCacheHash to map a relation OID to its composite type
OID.
We keep the RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 350 +++++++++++++++++++++++++----
src/include/utils/typcache.h | 4 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 321 insertions(+), 44 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..f54e7d531a8 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of relation's OID to the corresponding composite type OID.
+ * We're keeping the map entry when the corresponding typentry has something
+ * to clear, i.e., it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +222,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +347,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +386,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +407,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +423,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots for most cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +949,11 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2348,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2404,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate the appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2464,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in the cache. Also, we should reset flags for domain types; since
+ * we loop over all entries in the hash anyway, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2524,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2535,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we cleared the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3050,129 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS bits,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+static void
+cleanup_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ cleanup_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
On Tue, 15 Oct 2024 at 10:09, Alexander Korotkov <aekorotkov@gmail.com> wrote:
/* Call check_delete_rel_type_cache() if we actually cleared something */
if (hadTupDescOrOpclass)
delete_rel_type_cache_if_needed(typentry);
/*
* Call check_delete_rel_type_cache() if we cleaned
* TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
*/
if (hadPgTypeData)
delete_rel_type_cache_if_needed(typentry);
check_delete_rel_type_cache() doesn't exist, so these comments are wrong?
Yep, they didn't get updated. Fixed in the attached patchset.
Thank you, Alexander, for the fixes. The last version of the patch looks
good to me.
I'm not sure I get the point. This check ensures that the type entry has
something to be cleared. In this case we need to keep the
RelIdToTypeIdCacheHash entry to find this item on an invalidation
message. I'm not sure how TCFLAGS_DOMAIN_BASE_IS_COMPOSITE is
relevant here, because it's valid only for TYPTYPE_DOMAIN while this
patch deals with TYPTYPE_COMPOSITE.
Regarding this discussion earlier, I assumed that TYPTYPE_DOMAIN also
needs to be handled by `insert_rel_type_cache_if_needed()`. And it
seems that the handling of TYPTYPE_DOMAIN remains the same as before.
--
Kind regards,
Artur
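To restate the rule discussed above: the patch keeps a RelIdToTypeIdCacheHash
entry exactly while the corresponding typentry still carries state that a
relcache invalidation would have to clean. Here is a condensed sketch of that
predicate, as it might look inside typcache.c where the TCFLAGS_* bits are
defined (the helper name is hypothetical; it is not code from the patch):

/*
 * Hypothetical helper condensing the keep-vs-delete rule: the
 * RelIdToTypeIdCacheHash entry must exist exactly while the typentry
 * still has something that TypeCacheRelCallback() would need to clean.
 */
static inline bool
rel_map_entry_needed(const TypeCacheEntry *typentry)
{
	return (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) != 0 ||
		(typentry->flags & TCFLAGS_OPERATOR_FLAGS) != 0 ||
		typentry->tupDesc != NULL;
}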
On Tue, Oct 15, 2024 at 4:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Hi, Jian!
Thank you for your review.
On Tue, Oct 15, 2024 at 10:34 AM jian he <jian.universality@gmail.com> wrote:
I don't fully understand all of it, but I did some tests anyway.
static void
cleanup_in_progress_typentries(void)
{
int i;
if (in_progress_list_len > 1)
elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
for (i = 0; i < in_progress_list_len; i++)
{
TypeCacheEntry *typentry;
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&in_progress_list[i],
HASH_FIND, NULL);
insert_rel_type_cache_if_needed(typentry);
}
in_progress_list_len = 0;
}
The regression tests still passed.
I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
So we lack tests for larger in_progress_list_len values or i missed something?Try to run test suite with -DCLOBBER_CACHE_ALWAYS.
Building from source with -DCLOBBER_CACHE_ALWAYS takes a very long time,
so I gave up.
In lookup_type_cache, we unconditionally do
in_progress_list_len++;
in_progress_list_len--;
"static int in_progress_list_len;"
means changes to the in_progress_list_len value are confined to
src/backend/utils/cache/typcache.c.
Based on the above information, I am still confused about
cleanup_in_progress_typentries and in_progress_list_len.
Is there any simple SQL example to demonstrate
cleanup_in_progress_typentries with in_progress_list_len > 0?
On Tue, 15 Oct 2024 at 11:50, jian he <jian.universality@gmail.com> wrote:
Based on the above information, I am still confused about
cleanup_in_progress_typentries and in_progress_list_len.
Is there any simple SQL example to demonstrate
cleanup_in_progress_typentries with in_progress_list_len > 0?
AFAIK, to reproduce cases where `in_progress_list_len > 0`,
`lookup_type_cache()` should fail during its execution.
To do so, you can call `lookup_type_cache()` with a non-existent type_id
from a C function.
--
Kind regards,
Artur
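For illustration, a minimal sketch of such a C helper, under stated
assumptions: the function name and the extension module it would live in are
made up here; the only point is that lookup_type_cache() errors out for an
unknown OID after the OID has already been pushed onto in_progress_list:

#include "postgres.h"
#include "fmgr.h"
#include "utils/typcache.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(break_type_cache_lookup);

/*
 * Hypothetical SQL-callable test helper: passing a non-existent type OID
 * makes lookup_type_cache() raise an error after the OID was pushed onto
 * in_progress_list, so AtEOXact_TypeCache() has work to do at abort.
 */
Datum
break_type_cache_lookup(PG_FUNCTION_ARGS)
{
	Oid			typid = PG_GETARG_OID(0);

	(void) lookup_type_cache(typid, TYPECACHE_TUPDESC);

	PG_RETURN_VOID();
}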
On 10/15/24 15:08, Alexander Korotkov wrote:
Yep, they didn't get updated. Fixed in the attached patchset.
Let me wear Alexander Lakhin's mask for a moment and say that the code
may cause a segfault:
#0 0x000055e0da186000 in insert_rel_type_cache_if_needed (typentry=0x0)
at typcache.c:3066
3066 if (typentry->typtype != TYPTYPE_COMPOSITE)
(gdb) bt 20
#0 0x000055e0da186000 in insert_rel_type_cache_if_needed (typentry=0x0)
at typcache.c:3066
#1 0x000055e0da18844f in cleanup_in_progress_typentries () at
typcache.c:3172
#2 0x000055e0da1883f9 in AtEOXact_TypeCache () at typcache.c:3181
#3 0x000055e0d9a22e59 in AbortTransaction () at xact.c:2961
#4 0x000055e0d9a1f75c in AbortCurrentTransactionInternal () at xact.c:3491
#5 0x000055e0d9a1f6be in AbortCurrentTransaction () at xact.c:3445
#6 0x000055e0d9f55f28 in PostgresMain (dbname=0x55e1057fb838
"regression", username=0x55e1057fb818 "danolivo") at postgres.c:4508
#7 0x000055e0d9f4e4a3 in BackendMain (startup_data=0x7ffeaf051310 "",
startup_data_len=4) at backend_startup.c:107
#8 0x000055e0d9e4bfee in postmaster_child_launch (child_type=B_BACKEND,
startup_data=0x7ffeaf051310 "", startup_data_len=4,
client_sock=0x7ffeaf051358) at launch_backend.c:274
#9 0x000055e0d9e522a3 in BackendStartup (client_sock=0x7ffeaf051358) at
postmaster.c:3420
#10 0x000055e0d9e502f9 in ServerLoop () at postmaster.c:1653
#11 0x000055e0d9e4f4fe in PostmasterMain (argc=3, argv=0x55e1057bb520)
at postmaster.c:1351
#12 0x000055e0d9cf4b2d in main (argc=3, argv=0x55e1057bb520) at main.c:197
It can happen if something triggers an error in the middle of
lookup_type_cache when in_progress_list[i] is already filled, but the
typentry wasn't created.
I think it can be easily shielded (see attached). Also, the name
cleanup_in_progress_typentries invites a lot of pondering; I guess it
would be better to rename it to something like finalize_in_progress_typentries.
I also added trivial comments to make it easier to understand what the function does.
I think the first patch could already be committed, and this little burden
could be avoided in future versions.
--
regards, Andrei Lepikhov
Attachments:
minor-fix.diff (text/x-patch; charset=UTF-8)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index f54e7d531a..45aed74019 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -3058,7 +3058,7 @@ static void
insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
{
/* Immediately quit for non-composite types */
- if (typentry->typtype != TYPTYPE_COMPOSITE)
+ if (!typentry || typentry->typtype != TYPTYPE_COMPOSITE)
return;
/* typrelid should be given for composite types */
@@ -3147,8 +3147,13 @@ delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
}
}
+/*
+ * Add to the accessory hash table any entries that were added to TypeCacheHash
+ * but not to the RelIdToTypeId matching hash table.
+ * This can happen if an error is raised during the lookup_type_cache() call.
+ */
static void
-cleanup_in_progress_typentries(void)
+finalize_in_progress_typentries(void)
{
int i;
@@ -3168,11 +3173,11 @@ cleanup_in_progress_typentries(void)
void
AtEOXact_TypeCache(void)
{
- cleanup_in_progress_typentries();
+ finalize_in_progress_typentries();
}
void
AtEOSubXact_TypeCache(void)
{
- cleanup_in_progress_typentries();
+ finalize_in_progress_typentries();
}
On Thu, Oct 17, 2024 at 12:41 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 10/15/24 15:08, Alexander Korotkov wrote:
Yep, they didn't get updated. Fixed in the attached patchset.
Let me wear Alexander Lakhin's mask for a moment and say that the code
may cause a segfault:
#0 0x000055e0da186000 in insert_rel_type_cache_if_needed (typentry=0x0)
at typcache.c:3066
3066 if (typentry->typtype != TYPTYPE_COMPOSITE)
(gdb) bt 20
#0 0x000055e0da186000 in insert_rel_type_cache_if_needed (typentry=0x0)
at typcache.c:3066
#1 0x000055e0da18844f in cleanup_in_progress_typentries () at
typcache.c:3172
#2 0x000055e0da1883f9 in AtEOXact_TypeCache () at typcache.c:3181
#3 0x000055e0d9a22e59 in AbortTransaction () at xact.c:2961
#4 0x000055e0d9a1f75c in AbortCurrentTransactionInternal () at xact.c:3491
#5 0x000055e0d9a1f6be in AbortCurrentTransaction () at xact.c:3445
#6 0x000055e0d9f55f28 in PostgresMain (dbname=0x55e1057fb838
"regression", username=0x55e1057fb818 "danolivo") at postgres.c:4508
#7 0x000055e0d9f4e4a3 in BackendMain (startup_data=0x7ffeaf051310 "",
startup_data_len=4) at backend_startup.c:107
#8 0x000055e0d9e4bfee in postmaster_child_launch (child_type=B_BACKEND,
startup_data=0x7ffeaf051310 "", startup_data_len=4,
client_sock=0x7ffeaf051358) at launch_backend.c:274
#9 0x000055e0d9e522a3 in BackendStartup (client_sock=0x7ffeaf051358) at
postmaster.c:3420
#10 0x000055e0d9e502f9 in ServerLoop () at postmaster.c:1653
#11 0x000055e0d9e4f4fe in PostmasterMain (argc=3, argv=0x55e1057bb520)
at postmaster.c:1351
#12 0x000055e0d9cf4b2d in main (argc=3, argv=0x55e1057bb520) at main.c:197It can happen if something triggers an error in the middle of
lookup_type_cache when in_progress_list[i] is already filled, but the
typentry wasn't created.
I think it can be easily shielded (see attached). Also, the name
cleanup_in_progress_typentries invites a lot of pondering; I guess it
would be better to rename it to something like finalize_in_progress_typentries.
I also added trivial comments to make it easier to understand what the function does.
I think the first patch could already be committed, and this little burden
could be avoided in future versions.
Thank you!
I've integrated your patch. But I think
finalize_in_progress_typentries() is a more appropriate place to check
the typentry for NULL. Also, I've revised the
finalize_in_progress_typentries() header comment. I'm going to push
these patches if there are no objections.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v14-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch (application/octet-stream)
From a4bbadce7df1777eea8b98fb72f1c9163ac8ce11 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v14 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache here
is impossible, because this callback could be called outside a transaction,
and this makes catalog lookups impossible. This is why the present commit
introduces RelIdToTypeIdCacheHash to map a relation OID to its composite type
OID.
We keep the RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 356 +++++++++++++++++++++++++----
src/include/utils/typcache.h | 4 +
src/tools/pgindent/typedefs.list | 1 +
4 files changed, 327 insertions(+), 44 deletions(-)
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..5df965443ae 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -77,6 +77,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of relation's OID to the corresponding composite type OID.
+ * We're keeping the map entry when the corresponding typentry has something
+ * to clear, i.e., it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +222,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +347,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +386,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +407,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +423,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots for most cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +949,11 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2348,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2404,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate the appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2464,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in the cache. Also, we should reset flags for domain types; since
+ * we loop over all entries in the hash anyway, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2524,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2535,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we cleared the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3050,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS bits,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHash
+ * entries marked as in-progress by lookup_type_cache(). This can happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
v14-0001-Update-header-comment-for-lookup_type_cache.patch (application/octet-stream)
From d624b14c9a4bd93426a5e475d4503589927ef2ae Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v14 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in last for
+ * TYPTYPE_COMPOSITE, so it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
On Tue, Oct 15, 2024 at 12:50 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Oct 15, 2024 at 4:09 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
Hi, Jian!
Thank you for your review.
On Tue, Oct 15, 2024 at 10:34 AM jian he <jian.universality@gmail.com> wrote:
I don't fully understand all of it, but I did some tests anyway.
static void
cleanup_in_progress_typentries(void)
{
int i;
if (in_progress_list_len > 1)
elog(INFO, "%s:%d in_progress_list_len > 1", __FILE_NAME__, __LINE__);
for (i = 0; i < in_progress_list_len; i++)
{
TypeCacheEntry *typentry;
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&in_progress_list[i],
HASH_FIND, NULL);
insert_rel_type_cache_if_needed(typentry);
}
in_progress_list_len = 0;
}
The regression tests still passed.
I assume "elog(INFO, " won't interfere in cleanup_in_progress_typentries.
So do we lack tests for larger in_progress_list_len values, or did I miss something?
Try to run the test suite with -DCLOBBER_CACHE_ALWAYS.
Building from source with -DCLOBBER_CACHE_ALWAYS takes a very long time,
so I gave up.
In lookup_type_cache, we unconditionally do
in_progress_list_len++;
in_progress_list_len--;
Yes, this should work OK when there are no errors. On error or interruption,
finalize_in_progress_typentries() will clean things up.
"static int in_progress_list_len;"
means changes to the in_progress_list_len value are confined to
src/backend/utils/cache/typcache.c.
Yep.
Based on the above information, I am still confused about
cleanup_in_progress_typentries and in_progress_list_len.
Is there any simple SQL example to demonstrate
cleanup_in_progress_typentries with in_progress_list_len > 0?
I don't think there is simple SQL to reliably reproduce that. In
order to hit it, we must process invalidation messages during some
(short) window of time within lookup_type_cache(). You can reproduce
that by setting a breakpoint in lookup_type_cache() and in parallel doing
something to invalidate the type cache entry (for instance, ALTER
TABLE ... ADD COLUMN ... would invalidate the composite type). In
principle, we could reproduce that using injection points. However, I don't
intend to do that as long as we have buildfarm members with
-DCLOBBER_CACHE_ALWAYS. FWIW, I will for sure run the tests with
-DCLOBBER_CACHE_ALWAYS before committing this.
------
Regards,
Alexander Korotkov
Supabase
On Sun, Oct 20, 2024 at 8:47 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Oct 15, 2024 at 12:50 PM jian he <jian.universality@gmail.com> wrote:
Building from source with -DCLOBBER_CACHE_ALWAYS takes a very long time,
so I gave up.
In lookup_type_cache, we unconditionally do
in_progress_list_len++;
in_progress_list_len--;
Yes, this should work OK when there are no errors. On error or interruption,
finalize_in_progress_typentries() will clean things up.
"static int in_progress_list_len;"
means changes to the in_progress_list_len value are confined to
src/backend/utils/cache/typcache.c.
Yep.
Based on the above information, I am still confused about
cleanup_in_progress_typentries and in_progress_list_len.
Is there any simple SQL example to demonstrate
cleanup_in_progress_typentries with in_progress_list_len > 0?
I don't think there is simple SQL to reliably reproduce that. In
order to hit it, we must process invalidation messages during some
(short) window of time within lookup_type_cache(). You can reproduce
that by setting a breakpoint in lookup_type_cache() and in parallel doing
something to invalidate the type cache entry (for instance, ALTER
TABLE ... ADD COLUMN ... would invalidate the composite type).
Oops, a concurrent invalidation message is not enough here. So,
-DCLOBBER_CACHE_ALWAYS is also not enough to reproduce the situation.
An injection-point test is required. I'm going to add one.
------
Regards,
Alexander Korotkov
Supabase
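To make the shape of such a test concrete, here is a rough sketch under
stated assumptions: the injection point name and its exact placement are made
up here (the committed test may differ), while INJECTION_POINT() and the
injection_points test module are existing facilities:

#include "utils/injection_point.h"

	/*
	 * Hypothetical placement inside lookup_type_cache(), right after the
	 * type OID has been pushed onto in_progress_list.  A test can attach
	 * an 'error' action to this point, e.g.
	 *
	 *   SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
	 *
	 * so the lookup aborts inside the in-progress window and
	 * AtEOXact_TypeCache() has to finalize the entry afterwards.
	 */
	in_progress_list[in_progress_offset] = type_id;
	INJECTION_POINT("typecache-before-rel-type-cache-insert");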
On Sun, Oct 20, 2024 at 9:00 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Sun, Oct 20, 2024 at 8:47 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
On Tue, Oct 15, 2024 at 12:50 PM jian he <jian.universality@gmail.com> wrote:
Building from source with -DCLOBBER_CACHE_ALWAYS takes a very long time,
so I gave up.
In lookup_type_cache, we unconditionally do
in_progress_list_len++;
in_progress_list_len--;
Yes, this should work OK when there are no errors. On error or interruption,
finalize_in_progress_typentries() will clean things up.
"static int in_progress_list_len;"
means changes to the in_progress_list_len value are confined to
src/backend/utils/cache/typcache.c.
Yep.
Based on the above information, I am still confused about
cleanup_in_progress_typentries and in_progress_list_len.
Is there any simple SQL example to demonstrate
cleanup_in_progress_typentries with in_progress_list_len > 0?
I don't think there is simple SQL to reliably reproduce that. In
order to hit it, we must process invalidation messages during some
(short) window of time within lookup_type_cache(). You can reproduce
that by setting a breakpoint in lookup_type_cache() and in parallel doing
something to invalidate the type cache entry (for instance, ALTER
TABLE ... ADD COLUMN ... would invalidate the composite type).
Oops, a concurrent invalidation message is not enough here. So,
-DCLOBBER_CACHE_ALWAYS is also not enough to reproduce the situation.
An injection-point test is required. I'm going to add one.
Here you go. The injection-point test is implemented.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v15-0001-Update-header-comment-for-lookup_type_cache.patch (application/octet-stream)
From d624b14c9a4bd93426a5e475d4503589927ef2ae Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v15 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-dependent objects that the values are still relevant.
+ * It's also important that the tupdesc is filled last for
+ * TYPTYPE_COMPOSITE, so it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v15-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchapplication/octet-stream; name=v15-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchDownload
From 9412092fc39d2989167bb2d48606e0c27a755a5b Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v15 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.
We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
There are many places in lookup_type_cache() where a syscache invalidation,
user interruption, or even an error could occur. To handle this, we keep
an array of in-progress type cache entries. If lookup_type_cache() is
interrupted, this array is processed to keep RelIdToTypeIdCacheHash in a
consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 359 +++++++++++++++---
src/include/utils/typcache.h | 4 +
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/typcache/.gitignore | 4 +
src/test/modules/typcache/Makefile | 28 ++
.../expected/typcache_rel_type_cache.out | 34 ++
src/test/modules/typcache/meson.build | 16 +
.../typcache/sql/typcache_rel_type_cache.sql | 18 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 433 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/typcache/.gitignore
create mode 100644 src/test/modules/typcache/Makefile
create mode 100644 src/test/modules/typcache/expected/typcache_rel_type_cache.out
create mode 100644 src/test/modules/typcache/meson.build
create mode 100644 src/test/modules/typcache/sql/typcache_rel_type_cache.sql
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..6fa525d08f0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -66,6 +66,7 @@
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -77,6 +78,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of a relation's OID to the corresponding composite type OID.
+ * We keep the map entry while the corresponding typentry has something
+ * to clear, i.e. it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or a tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +223,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +348,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +387,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +408,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +424,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots for the common case
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +950,13 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ INJECTION_POINT("typecache-before-rel-type-cache-insert");
+
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2351,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2407,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we
+ * use the RelIdToTypeIdCacheHash map to locate the appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist whenever the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types, which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2467,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in the cache. We also need to reset flags for domain types;
+ * since we loop over all hash entries anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2527,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2538,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we cleared the
+ * previously set TCFLAGS_HAVE_PG_TYPE_DATA flag.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3053,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS, or
+ * the tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHas
+ * entries, marked as in-progress by lookup_type_cache(). It may happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 256799f520a..c0d3cf0e14b 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -43,9 +43,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin
+SUBDIRS += injection_points gin typcache
else
-ALWAYS_SUBDIRS += injection_points gin
+ALWAYS_SUBDIRS += injection_points gin typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index d8fe059d236..c829b619530 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -36,6 +36,7 @@ subdir('test_rls_hooks')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
+subdir('typcache')
subdir('unsafe_tests')
subdir('worker_spi')
subdir('xid_wraparound')
diff --git a/src/test/modules/typcache/.gitignore b/src/test/modules/typcache/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/typcache/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/typcache/Makefile b/src/test/modules/typcache/Makefile
new file mode 100644
index 00000000000..6ee46ec0891
--- /dev/null
+++ b/src/test/modules/typcache/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/typcache/Makefile
+
+EXTRA_INSTALL = src/test/modules/typcache_rel_type_cache.out.out
+
+REGRESS = typcache_rel_type_cache
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/typcache
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/typcache/expected/typcache_rel_type_cache.out b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
new file mode 100644
index 00000000000..b113e0bbd5d
--- /dev/null
+++ b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
@@ -0,0 +1,34 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+CREATE EXTENSION injection_points;
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT '(1)'::t;
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 1: SELECT '(1)'::t;
+ ^
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
+ t
+-------
+ (1,2)
+(1 row)
+
diff --git a/src/test/modules/typcache/meson.build b/src/test/modules/typcache/meson.build
new file mode 100644
index 00000000000..cb2e34c0d2b
--- /dev/null
+++ b/src/test/modules/typcache/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'typcache',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'typcache_rel_type_cache',
+ ],
+ },
+}
diff --git a/src/test/modules/typcache/sql/typcache_rel_type_cache.sql b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
new file mode 100644
index 00000000000..2c0a434d988
--- /dev/null
+++ b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
@@ -0,0 +1,18 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+
+CREATE EXTENSION injection_points;
+
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+SELECT '(1)'::t;
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
Alexander Korotkov <aekorotkov@gmail.com> writes:
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
Is there any particular reason not to use pg_list.h for this?
- ilmari
On 21/10/2024 00:36, Alexander Korotkov wrote:
On Thu, Oct 17, 2024 at 12:41 PM Andrei Lepikhov <lepihov@gmail.com> wrote:
I think the first patch may already be committed, and this little burden
may be avoided in future versions.
I've integrated your patch. But I think
finalize_in_progress_typentries() is a more appropriate place to check
the typentry for NULL. Also, I've revised the
finalize_in_progress_typentries() header comment. I'm going to push
these patches if there are no objections.
I agree with your idea. Also, I think it would be more conventional not
to check the type entry for a NULL value but to test the 'found' value
instead.
And thanks for the injection-point tests! This option must have
slipped my mind.
Now the patch set looks good to me.
--
regards, Andrei Lepikhov
On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
Is there any particular reason not to use pg_list.h for this?
Sure. The type cache lookup has to be as optimal as possible.
Using an array with plain sequential access to it, we avoid memory
allocations and deallocations 99.9% of the time. Also, quick access to
a single element (which is what we will have in real life almost all of
the time) is much faster than employing the list machinery.
--
regards, Andrei Lepikhov
thanks for the
INJECTION_POINT("typecache-before-rel-type-cache-insert");
Now I have a better understanding of the whole changes.
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHas
+ * entries, marked as in-progress by lookup_type_cache(). It may happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
comment typo. "TypeCacheHas" should be "TypeCacheHash"
On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
Is there any particular reason not to use pg_list.h for this?
Sure. The type cache lookup has to be as optimal as possible.
Using an array with plain sequential access to it, we avoid memory
allocations and deallocations 99.9% of the time. Also, quick access to
a single element (which is what we will have in real life almost all of
the time) is much faster than employing the list machinery.
+1. A List with zero elements has to be NIL, which means continuous
allocations and deallocations.
------
Regards,
Alexander Korotkov
Supabase
On Mon, Oct 21, 2024 at 10:51 AM jian he <jian.universality@gmail.com> wrote:
thanks for the
INJECTION_POINT("typecache-before-rel-type-cache-insert");
Now I have a better understanding of the whole changes.
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHas
+ * entries, marked as in-progress by lookup_type_cache(). It may happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
comment typo. "TypeCacheHas" should be "TypeCacheHash"
Thank you. This was also spotted by Alexander Lakhin (off-list).
It is fixed in the attached revision of the patchset.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v16-0001-Update-header-comment-for-lookup_type_cache.patchapplication/octet-stream; name=v16-0001-Update-header-comment-for-lookup_type_cache.patchDownload
From d624b14c9a4bd93426a5e475d4503589927ef2ae Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v16 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-dependent objects that the values are still relevant.
+ * It's also important that the tupdesc is filled last for
+ * TYPTYPE_COMPOSITE, so it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v16-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchapplication/octet-stream; name=v16-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchDownload
From e126267c3a6babbc5d37924947ba8f7ec9120cec Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v16 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.
We keep a RelIdToTypeIdCacheHash entry while the corresponding type cache
entry has something to clean. Therefore, RelIdToTypeIdCacheHash shouldn't
get bloated in the case of a flood of temporary tables.
There are many places in lookup_type_cache() where a syscache invalidation,
user interruption, or even an error could occur. To handle this, we keep
an array of in-progress type cache entries. If lookup_type_cache() is
interrupted, this array is processed to keep RelIdToTypeIdCacheHash in a
consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 359 +++++++++++++++---
src/include/utils/typcache.h | 4 +
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/typcache/.gitignore | 4 +
src/test/modules/typcache/Makefile | 28 ++
.../expected/typcache_rel_type_cache.out | 34 ++
src/test/modules/typcache/meson.build | 16 +
.../typcache/sql/typcache_rel_type_cache.sql | 18 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 433 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/typcache/.gitignore
create mode 100644 src/test/modules/typcache/Makefile
create mode 100644 src/test/modules/typcache/expected/typcache_rel_type_cache.out
create mode 100644 src/test/modules/typcache/meson.build
create mode 100644 src/test/modules/typcache/sql/typcache_rel_type_cache.sql
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..094d3ca00c1 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -66,6 +66,7 @@
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -77,6 +78,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of a relation's OID to the corresponding composite type OID.
+ * We keep the map entry while the corresponding typentry has something
+ * to clear, i.e. it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or a tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +223,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +348,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +387,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +408,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +424,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots for the common case
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +950,13 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ INJECTION_POINT("typecache-before-rel-type-cache-insert");
+
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2351,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2407,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of a transaction. Thus we
+ * use the RelIdToTypeIdCacheHash map to locate the appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist whenever the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types, which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2467,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in the cache. We also need to reset flags for domain types;
+ * since we loop over all hash entries anyway, do both in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2527,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2538,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we cleared the
+ * previously set TCFLAGS_HAVE_PG_TYPE_DATA flag.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3053,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS, or
+ * the tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHash
+ * entries, marked as in-progress by lookup_type_cache(). It may happen
+ * in case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 256799f520a..c0d3cf0e14b 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -43,9 +43,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin
+SUBDIRS += injection_points gin typcache
else
-ALWAYS_SUBDIRS += injection_points gin
+ALWAYS_SUBDIRS += injection_points gin typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index d8fe059d236..c829b619530 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -36,6 +36,7 @@ subdir('test_rls_hooks')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
+subdir('typcache')
subdir('unsafe_tests')
subdir('worker_spi')
subdir('xid_wraparound')
diff --git a/src/test/modules/typcache/.gitignore b/src/test/modules/typcache/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/typcache/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/typcache/Makefile b/src/test/modules/typcache/Makefile
new file mode 100644
index 00000000000..6ee46ec0891
--- /dev/null
+++ b/src/test/modules/typcache/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/typcache/Makefile
+
+EXTRA_INSTALL = src/test/modules/typcache_rel_type_cache.out.out
+
+REGRESS = typcache_rel_type_cache
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/typcache
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/typcache/expected/typcache_rel_type_cache.out b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
new file mode 100644
index 00000000000..b113e0bbd5d
--- /dev/null
+++ b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
@@ -0,0 +1,34 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+CREATE EXTENSION injection_points;
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT '(1)'::t;
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 1: SELECT '(1)'::t;
+ ^
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
+ t
+-------
+ (1,2)
+(1 row)
+
diff --git a/src/test/modules/typcache/meson.build b/src/test/modules/typcache/meson.build
new file mode 100644
index 00000000000..cb2e34c0d2b
--- /dev/null
+++ b/src/test/modules/typcache/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'typcache',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'typcache_rel_type_cache',
+ ],
+ },
+}
diff --git a/src/test/modules/typcache/sql/typcache_rel_type_cache.sql b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
new file mode 100644
index 00000000000..2c0a434d988
--- /dev/null
+++ b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
@@ -0,0 +1,18 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+
+CREATE EXTENSION injection_points;
+
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+SELECT '(1)'::t;
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
Alexander Korotkov <aekorotkov@gmail.com> writes:
On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
Is there any particular reason not to use pg_list.h for this?
Sure. The type cache lookup has to be as optimal as possible.
Using an array with plain sequential access to it, we avoid memory
allocations and deallocations 99.9% of the time. Also, quick access to
a single element (which is what we will have in real life almost all of
the time) is much faster than employing the list machinery.
Lists are actually dynamically resized arrays these days (see commit
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164), not linked lists, so
accessing arbitrary elements is O(1), not O(n). Just like in this patch,
the size is doubled (starting at 16) whenever the array is full.
+1. A List with zero elements has to be NIL, which means continuous
allocations and deallocations.
This however is a valid point (unless we keep a dummy zeroth element to
avoid it, which is even uglier than open-coding the array extension
logic), so objection withdrawn.
- ilmari
On Mon, Oct 21, 2024 at 1:16 PM Dagfinn Ilmari Mannsåker
<ilmari@ilmari.org> wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
On Mon, Oct 21, 2024 at 8:40 AM Andrei Lepikhov <lepihov@gmail.com> wrote:
On 21/10/2024 06:32, Dagfinn Ilmari Mannsåker wrote:
Alexander Korotkov <aekorotkov@gmail.com> writes:
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
Is there any particular reason not to use pg_list.h for this?
Sure. The type cache lookup has to be as optimal as possible.
Using an array with plain sequential access to it, we avoid memory
allocations and deallocations 99.9% of the time. Also, quick access to
a single element (which is what we will have in real life almost all of
the time) is much faster than employing the list machinery.
Lists are actually dynamically resized arrays these days (see commit
1cff1b95ab6ddae32faa3efe0d95a820dbfdc164), not linked lists, so
accessing arbitrary elements is O(1), not O(n). Just like in this patch,
the size is doubled (starting at 16) whenever the array is full.
+1. A List with zero elements has to be NIL, which means continuous
allocations and deallocations.
This however is a valid point (unless we keep a dummy zeroth element to
avoid it, which is even uglier than open-coding the array extension
logic), so objection withdrawn.
OK, thank you!
The attached revision fixes EXTRA_INSTALL in
src/test/modules/typcache/Makefile. Spotted off-list by Arthur
Zakirov.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v17-0001-Update-header-comment-for-lookup_type_cache.patch
From d624b14c9a4bd93426a5e475d4503589927ef2ae Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Fri, 13 Sep 2024 02:10:04 +0300
Subject: [PATCH v17 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in the last for
+ * TYPTYPE_COMPOSITE. So, it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v17-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch
From 41da564395bd147939d425657caacfab572a1fce Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 10 Sep 2024 23:25:04 +0300
Subject: [PATCH v17 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.
We keep a RelIdToTypeIdCacheHash entry only while the corresponding type
cache entry has something to clean. Therefore, RelIdToTypeIdCacheHash
shouldn't get bloated in the case of a flood of temporary tables.
There are many places in lookup_type_cache() where a syscache invalidation,
user interruption, or even an error could occur. In order to handle this,
we keep an array of in-progress type cache entries. If lookup_type_cache()
is interrupted, this array is processed to keep RelIdToTypeIdCacheHash in
a consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 359 +++++++++++++++---
src/include/utils/typcache.h | 4 +
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/typcache/.gitignore | 4 +
src/test/modules/typcache/Makefile | 28 ++
.../expected/typcache_rel_type_cache.out | 34 ++
src/test/modules/typcache/meson.build | 16 +
.../typcache/sql/typcache_rel_type_cache.sql | 18 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 433 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/typcache/.gitignore
create mode 100644 src/test/modules/typcache/Makefile
create mode 100644 src/test/modules/typcache/expected/typcache_rel_type_cache.out
create mode 100644 src/test/modules/typcache/meson.build
create mode 100644 src/test/modules/typcache/sql/typcache_rel_type_cache.sql
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..094d3ca00c1 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -66,6 +66,7 @@
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -77,6 +78,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of a relation's OID to the corresponding composite type OID.
+ * We keep the map entry while the corresponding typentry has something
+ * to clear, i.e. it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or a tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +223,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +348,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +387,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +408,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +424,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots to cover most cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +950,13 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ INJECTION_POINT("typecache-before-rel-type-cache-insert");
+
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2351,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2407,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus we use
+ * the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find the RelIdToTypeIdCacheHash entry, which should exist as long as
+ * the corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2467,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention we need to reset all composite
+ * types in cache. Also, we should reset flags for domain types, and
+ * we loop over all entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2527,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2538,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3053,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS bits,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeIdCacheHash entries for TypeCacheHash
+ * entries marked as in-progress by lookup_type_cache(). This may be needed
+ * after an error or interruption during a lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 256799f520a..c0d3cf0e14b 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -43,9 +43,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin
+SUBDIRS += injection_points gin typcache
else
-ALWAYS_SUBDIRS += injection_points gin
+ALWAYS_SUBDIRS += injection_points gin typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index d8fe059d236..c829b619530 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -36,6 +36,7 @@ subdir('test_rls_hooks')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
+subdir('typcache')
subdir('unsafe_tests')
subdir('worker_spi')
subdir('xid_wraparound')
diff --git a/src/test/modules/typcache/.gitignore b/src/test/modules/typcache/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/typcache/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/typcache/Makefile b/src/test/modules/typcache/Makefile
new file mode 100644
index 00000000000..1f03de83890
--- /dev/null
+++ b/src/test/modules/typcache/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/typcache/Makefile
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+REGRESS = typcache_rel_type_cache
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/typcache
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/typcache/expected/typcache_rel_type_cache.out b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
new file mode 100644
index 00000000000..b113e0bbd5d
--- /dev/null
+++ b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
@@ -0,0 +1,34 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+CREATE EXTENSION injection_points;
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT '(1)'::t;
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 1: SELECT '(1)'::t;
+ ^
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
+ t
+-------
+ (1,2)
+(1 row)
+
diff --git a/src/test/modules/typcache/meson.build b/src/test/modules/typcache/meson.build
new file mode 100644
index 00000000000..cb2e34c0d2b
--- /dev/null
+++ b/src/test/modules/typcache/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'typcache',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'typcache_rel_type_cache',
+ ],
+ },
+}
diff --git a/src/test/modules/typcache/sql/typcache_rel_type_cache.sql b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
new file mode 100644
index 00000000000..2c0a434d988
--- /dev/null
+++ b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
@@ -0,0 +1,18 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+
+CREATE EXTENSION injection_points;
+
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+SELECT '(1)'::t;
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
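To make the patch's invariant easier to follow, here is a condensed sketch
distilled from the diff above: a RelIdToTypeIdCacheHash entry exists exactly
while the composite type's typcache entry still carries something an
invalidation would have to clear. The sync_rel_type_cache() helper below is
hypothetical (the patch splits this logic into
insert_rel_type_cache_if_needed() and delete_rel_type_cache_if_needed()),
and the assertions are omitted:

/* Does this typcache entry still carry state an invalidation must clear? */
static bool
typentry_needs_map_entry(TypeCacheEntry *typentry)
{
	return (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
		(typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
		typentry->tupDesc != NULL;
}

/* Hypothetical helper keeping the map in sync with one typcache entry */
static void
sync_rel_type_cache(TypeCacheEntry *typentry)
{
	if (typentry->typtype != TYPTYPE_COMPOSITE)
		return;

	if (typentry_needs_map_entry(typentry))
	{
		RelIdToTypeIdCacheEntry *relentry;

		/* insert-or-find, then (re)fill the payload */
		relentry = (RelIdToTypeIdCacheEntry *)
			hash_search(RelIdToTypeIdCacheHash, &typentry->typrelid,
						HASH_ENTER, NULL);
		relentry->relid = typentry->typrelid;
		relentry->composite_typid = typentry->type_id;
	}
	else
		(void) hash_search(RelIdToTypeIdCacheHash, &typentry->typrelid,
						   HASH_REMOVE, NULL);
}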
On Mon, Oct 21, 2024 at 2:30 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
The attached revision fixes EXTRA_INSTALL in
src/test/modules/typcache/Makefile. Spotted off-list by Arthur
Zakirov.
I've re-checked that regression tests pass with
-DCLOBBER_CACHE_ALWAYS. Also did some grammar corrections for
comments and commit message. I'm going to push this if no objections.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v18-0001-Update-header-comment-for-lookup_type_cache.patch
From ab098661c407355c07aacb7821221bfcbf10637b Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 22 Oct 2024 10:30:40 +0300
Subject: [PATCH v18 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..11382547ec0 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-depending objects that the values are still relevant.
+ * It's also important that the tupdesc is filled in the last for
+ * TYPTYPE_COMPOSITE. So, it can't get invalidated during the
+ * lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v18-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patch
From aba5740d647bc16aa4bbe5cd82d59bd7b8df7ee3 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 22 Oct 2024 10:30:46 +0300
Subject: [PATCH v18 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.
We keep a RelIdToTypeIdCacheHash entry only while the corresponding type
cache entry has something to clean. Therefore, RelIdToTypeIdCacheHash
shouldn't get bloated in the case of a flood of temporary tables.
There are many places in lookup_type_cache() where a syscache invalidation,
user interruption, or even an error could occur. In order to handle this,
we keep an array of in-progress type cache entries. If lookup_type_cache()
is interrupted, this array is processed to keep RelIdToTypeIdCacheHash in
a consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 359 +++++++++++++++---
src/include/utils/typcache.h | 4 +
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/typcache/.gitignore | 4 +
src/test/modules/typcache/Makefile | 28 ++
.../expected/typcache_rel_type_cache.out | 34 ++
src/test/modules/typcache/meson.build | 16 +
.../typcache/sql/typcache_rel_type_cache.sql | 18 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 433 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/typcache/.gitignore
create mode 100644 src/test/modules/typcache/Makefile
create mode 100644 src/test/modules/typcache/expected/typcache_rel_type_cache.out
create mode 100644 src/test/modules/typcache/meson.build
create mode 100644 src/test/modules/typcache/sql/typcache_rel_type_cache.sql
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 11382547ec0..2037e2f1c16 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -66,6 +66,7 @@
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -77,6 +78,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of a relation's OID to the corresponding composite type OID.
+ * We keep the map entry while the corresponding typentry has something
+ * to clear, i.e. it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or a tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +223,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +348,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +387,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +408,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +424,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots to cover most cases
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +950,13 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ INJECTION_POINT("typecache-before-rel-type-cache-insert");
+
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2351,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2407,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use syscache to find a type corresponding to the given
+ * relation because the code can be called outside of transaction. Thus, we
+ * use the RelIdToTypeIdCacheHash map to locate appropriate typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find the RelIdToTypeIdCacheHash entry, which should exist as long as
+ * the corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically, this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2467,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention, we need to reset all composite
+ * types in cache. Also, we should reset flags for domain types, and
+ * we loop over all entries in hash, so, do it in a single scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2527,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2538,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3053,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of the TCFLAGS_OPERATOR_FLAGS bits,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeIdCacheHash entries for TypeCacheHash
+ * entries marked as in-progress by lookup_type_cache(). This may be needed
+ * after an error or interruption during a lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 256799f520a..c0d3cf0e14b 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -43,9 +43,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin
+SUBDIRS += injection_points gin typcache
else
-ALWAYS_SUBDIRS += injection_points gin
+ALWAYS_SUBDIRS += injection_points gin typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index d8fe059d236..c829b619530 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -36,6 +36,7 @@ subdir('test_rls_hooks')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
+subdir('typcache')
subdir('unsafe_tests')
subdir('worker_spi')
subdir('xid_wraparound')
diff --git a/src/test/modules/typcache/.gitignore b/src/test/modules/typcache/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/typcache/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/typcache/Makefile b/src/test/modules/typcache/Makefile
new file mode 100644
index 00000000000..1f03de83890
--- /dev/null
+++ b/src/test/modules/typcache/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/typcache/Makefile
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+REGRESS = typcache_rel_type_cache
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/typcache
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/typcache/expected/typcache_rel_type_cache.out b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
new file mode 100644
index 00000000000..b113e0bbd5d
--- /dev/null
+++ b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
@@ -0,0 +1,34 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+CREATE EXTENSION injection_points;
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT '(1)'::t;
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 1: SELECT '(1)'::t;
+ ^
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
+ t
+-------
+ (1,2)
+(1 row)
+
diff --git a/src/test/modules/typcache/meson.build b/src/test/modules/typcache/meson.build
new file mode 100644
index 00000000000..cb2e34c0d2b
--- /dev/null
+++ b/src/test/modules/typcache/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'typcache',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'typcache_rel_type_cache',
+ ],
+ },
+}
diff --git a/src/test/modules/typcache/sql/typcache_rel_type_cache.sql b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
new file mode 100644
index 00000000000..2c0a434d988
--- /dev/null
+++ b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
@@ -0,0 +1,18 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use the injection point to simulate an error but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to type cache
+-- entry should be created. This is validated by subsequent modification of
+-- the table schema, then type casts which use the new schema, implying
+-- successful type cache invalidation by relation oid.
+--
+
+CREATE EXTENSION injection_points;
+
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+SELECT '(1)'::t;
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
Hi, Alexander!
On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com>
wrote:
I've re-checked that regression tests pass with
-DCLOBBER_CACHE_ALWAYS. Also did some grammar corrections for
comments and commit message. I'm going to push this if no objections.
Thank you for working on this patch!
Looked through the patchset once more.
Patch 0001 (minor): "in the last" -> "after everything else" or "after
other TypeCacheEntry contents"
Patch 0002 looks ready to me.
Regards,
Pavel Borisov
Supabase
On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
Thank you for working on this patch!
Looked through the patchset once more.
Patch 0001 (minor): "in the last" -> "after everything else" or "after
other TypeCacheEntry contents"
Patch 0002 looks ready to me.
Thank you, Pavel! 0001 revised according to your suggestion.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v19-0001-Update-header-comment-for-lookup_type_cache.patchapplication/octet-stream; name=v19-0001-Update-header-comment-for-lookup_type_cache.patchDownload
From 5f9c300e039854949cbd1337200b227018426c05 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 22 Oct 2024 10:30:40 +0300
Subject: [PATCH v19 1/2] Update header comment for lookup_type_cache()
Describe the way we handle concurrent invalidation messages.
Discussion: https://postgr.es/m/CAPpHfdsQhwUrnB3of862j9RgHoJM--eRbifvBMvtQxpC57dxCA%40mail.gmail.com
Reviewed-by: Andrei Lepikhov, Artur Zakirov, Pavel Borisov
---
src/backend/utils/cache/typcache.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index 2ec136b7d30..f142624ad2e 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -351,6 +351,15 @@ type_cache_syshash(const void *key, Size keysize)
* invalid. Note however that we may fail to find one or more of the
* values requested by 'flags'; the caller needs to check whether the fields
* are InvalidOid or not.
+ *
+ * Note that while filling TypeCacheEntry we might process concurrent
+ * invalidation messages, causing our not-yet-filled TypeCacheEntry to be
+ * invalidated. In this case, we typically only clear flags while values are
+ * still available for the caller. It's expected that the caller holds
+ * enough locks on type-dependent objects that the values are still relevant.
+ * It's also important that the tupdesc is filled after all other
+ * TypeCacheEntry items for TYPTYPE_COMPOSITE. So, tupdesc can't get
+ * invalidated during the lookup_type_cache() call.
*/
TypeCacheEntry *
lookup_type_cache(Oid type_id, int flags)
--
2.39.5 (Apple Git-154)
v19-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchapplication/octet-stream; name=v19-0002-Avoid-looping-over-all-type-cache-entries-in-Typ.patchDownload
From af2a07a7cc52f3335c939e0ed84bcca70a50b885 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Tue, 22 Oct 2024 10:30:46 +0300
Subject: [PATCH v19 2/2] Avoid looping over all type cache entries in
TypeCacheRelCallback()
Currently, when a single relcache entry gets invalidated,
TypeCacheRelCallback() has to loop over all type cache entries to find
the appropriate typentry to invalidate. Unfortunately, using the syscache
here is impossible, because this callback could be called outside a
transaction, which makes catalog lookups impossible. This is why the
present commit introduces RelIdToTypeIdCacheHash to map a relation OID to
its composite type OID.
We keep a RelIdToTypeIdCacheHash entry only while the corresponding type
cache entry has something to clean. Therefore, RelIdToTypeIdCacheHash
shouldn't get bloated by a flood of temporary tables.
There are many places in lookup_type_cache() where syscache invalidation,
user interruption, or even an error could occur. To handle this, we keep
an array of in-progress type cache entries. If lookup_type_cache() is
interrupted, this array is processed to keep RelIdToTypeIdCacheHash in a
consistent state.
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1%40sigaev.ru
Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Tom Lane, Michael Paquier, Roman Zharkov
Reviewed-by: Andrei Lepikhov, Pavel Borisov, Jian He, Alexander Lakhin
Reviewed-by: Artur Zakirov
---
src/backend/access/transam/xact.c | 10 +
src/backend/utils/cache/typcache.c | 359 +++++++++++++++---
src/include/utils/typcache.h | 4 +
src/test/modules/Makefile | 4 +-
src/test/modules/meson.build | 1 +
src/test/modules/typcache/.gitignore | 4 +
src/test/modules/typcache/Makefile | 28 ++
.../expected/typcache_rel_type_cache.out | 34 ++
src/test/modules/typcache/meson.build | 16 +
.../typcache/sql/typcache_rel_type_cache.sql | 18 +
src/tools/pgindent/typedefs.list | 1 +
11 files changed, 433 insertions(+), 46 deletions(-)
create mode 100644 src/test/modules/typcache/.gitignore
create mode 100644 src/test/modules/typcache/Makefile
create mode 100644 src/test/modules/typcache/expected/typcache_rel_type_cache.out
create mode 100644 src/test/modules/typcache/meson.build
create mode 100644 src/test/modules/typcache/sql/typcache_rel_type_cache.sql
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 87700c7c5c7..b0b05e28790 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -70,6 +70,7 @@
#include "utils/snapmgr.h"
#include "utils/timeout.h"
#include "utils/timestamp.h"
+#include "utils/typcache.h"
/*
* User-tweakable parameters
@@ -2407,6 +2408,9 @@ CommitTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/*
* Make catalog changes visible to all backends. This has to happen after
* relcache references are dropped (see comments for
@@ -2709,6 +2713,9 @@ PrepareTransaction(void)
/* Clean up the relation cache */
AtEOXact_RelationCache(true);
+ /* Clean up the type cache */
+ AtEOXact_TypeCache();
+
/* notify doesn't need a postprepare call */
PostPrepare_PgStat();
@@ -2951,6 +2958,7 @@ AbortTransaction(void)
false, true);
AtEOXact_Buffers(false);
AtEOXact_RelationCache(false);
+ AtEOXact_TypeCache();
AtEOXact_Inval(false);
AtEOXact_MultiXact();
ResourceOwnerRelease(TopTransactionResourceOwner,
@@ -5153,6 +5161,7 @@ CommitSubTransaction(void)
true, false);
AtEOSubXact_RelationCache(true, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(true);
AtSubCommit_smgr();
@@ -5328,6 +5337,7 @@ AbortSubTransaction(void)
AtEOSubXact_RelationCache(false, s->subTransactionId,
s->parent->subTransactionId);
+ AtEOSubXact_TypeCache();
AtEOSubXact_Inval(false);
ResourceOwnerRelease(s->curTransactionOwner,
RESOURCE_RELEASE_LOCKS,
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index f142624ad2e..1972bd1944b 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -66,6 +66,7 @@
#include "utils/builtins.h"
#include "utils/catcache.h"
#include "utils/fmgroids.h"
+#include "utils/injection_point.h"
#include "utils/inval.h"
#include "utils/lsyscache.h"
#include "utils/memutils.h"
@@ -77,6 +78,20 @@
/* The main type cache hashtable searched by lookup_type_cache */
static HTAB *TypeCacheHash = NULL;
+/*
+ * The mapping of a relation's OID to the corresponding composite type OID.
+ * We keep the map entry while the corresponding typentry has something
+ * to clear, i.e., it has either TCFLAGS_HAVE_PG_TYPE_DATA, or
+ * TCFLAGS_OPERATOR_FLAGS, or tupdesc.
+ */
+static HTAB *RelIdToTypeIdCacheHash = NULL;
+
+typedef struct RelIdToTypeIdCacheEntry
+{
+ Oid relid; /* OID of the relation */
+ Oid composite_typid; /* OID of the relation's composite type */
+} RelIdToTypeIdCacheEntry;
+
/* List of type cache entries for domain types */
static TypeCacheEntry *firstDomainTypeEntry = NULL;
@@ -208,6 +223,10 @@ typedef struct SharedTypmodTableEntry
dsa_pointer shared_tupdesc;
} SharedTypmodTableEntry;
+static Oid *in_progress_list;
+static int in_progress_list_len;
+static int in_progress_list_maxlen;
+
/*
* A comparator function for SharedRecordTableKey.
*/
@@ -329,6 +348,8 @@ static void shared_record_typmod_registry_detach(dsm_segment *segment,
static TupleDesc find_or_make_matching_shared_tupledesc(TupleDesc tupdesc);
static dsa_pointer share_tupledesc(dsa_area *area, TupleDesc tupdesc,
uint32 typmod);
+static void insert_rel_type_cache_if_needed(TypeCacheEntry *typentry);
+static void delete_rel_type_cache_if_needed(TypeCacheEntry *typentry);
/*
@@ -366,11 +387,13 @@ lookup_type_cache(Oid type_id, int flags)
{
TypeCacheEntry *typentry;
bool found;
+ int in_progress_offset;
if (TypeCacheHash == NULL)
{
/* First time through: initialize the hash table */
HASHCTL ctl;
+ int allocsize;
ctl.keysize = sizeof(Oid);
ctl.entrysize = sizeof(TypeCacheEntry);
@@ -385,6 +408,13 @@ lookup_type_cache(Oid type_id, int flags)
TypeCacheHash = hash_create("Type information cache", 64,
&ctl, HASH_ELEM | HASH_FUNCTION);
+ Assert(RelIdToTypeIdCacheHash == NULL);
+
+ ctl.keysize = sizeof(Oid);
+ ctl.entrysize = sizeof(RelIdToTypeIdCacheEntry);
+ RelIdToTypeIdCacheHash = hash_create("Map from relid to OID of cached composite type", 64,
+ &ctl, HASH_ELEM | HASH_BLOBS);
+
/* Also set up callbacks for SI invalidations */
CacheRegisterRelcacheCallback(TypeCacheRelCallback, (Datum) 0);
CacheRegisterSyscacheCallback(TYPEOID, TypeCacheTypCallback, (Datum) 0);
@@ -394,8 +424,32 @@ lookup_type_cache(Oid type_id, int flags)
/* Also make sure CacheMemoryContext exists */
if (!CacheMemoryContext)
CreateCacheMemoryContext();
+
+ /*
+ * Reserve enough in_progress_list slots for most cases.
+ */
+ allocsize = 4;
+ in_progress_list =
+ MemoryContextAlloc(CacheMemoryContext,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
}
+ Assert(TypeCacheHash != NULL && RelIdToTypeIdCacheHash != NULL);
+
+ /* Register to catch invalidation messages */
+ if (in_progress_list_len >= in_progress_list_maxlen)
+ {
+ int allocsize;
+
+ allocsize = in_progress_list_maxlen * 2;
+ in_progress_list = repalloc(in_progress_list,
+ allocsize * sizeof(*in_progress_list));
+ in_progress_list_maxlen = allocsize;
+ }
+ in_progress_offset = in_progress_list_len++;
+ in_progress_list[in_progress_offset] = type_id;
+
/* Try to look up an existing entry */
typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
&type_id,
@@ -896,6 +950,13 @@ lookup_type_cache(Oid type_id, int flags)
load_domaintype_info(typentry);
}
+ INJECTION_POINT("typecache-before-rel-type-cache-insert");
+
+ Assert(in_progress_offset + 1 == in_progress_list_len);
+ in_progress_list_len--;
+
+ insert_rel_type_cache_if_needed(typentry);
+
return typentry;
}
@@ -2290,6 +2351,53 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
CurrentSession->shared_typmod_table = typmod_table;
}
+/*
+ * InvalidateCompositeTypeCacheEntry
+ * Invalidate particular TypeCacheEntry on Relcache inval callback
+ *
+ * Delete the cached tuple descriptor (if any) for the given composite
+ * type, and reset whatever info we have cached about the composite type's
+ * comparability.
+ */
+static void
+InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
+{
+ bool hadTupDescOrOpclass;
+
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE &&
+ OidIsValid(typentry->typrelid));
+
+ hadTupDescOrOpclass = (typentry->tupDesc != NULL) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
+ /* Delete tupdesc if we have it */
+ if (typentry->tupDesc != NULL)
+ {
+ /*
+ * Release our refcount and free the tupdesc if none remain. We can't
+ * use DecrTupleDescRefCount here because this reference is not logged
+ * by the current resource owner.
+ */
+ Assert(typentry->tupDesc->tdrefcount > 0);
+ if (--typentry->tupDesc->tdrefcount == 0)
+ FreeTupleDesc(typentry->tupDesc);
+ typentry->tupDesc = NULL;
+
+ /*
+ * Also clear tupDesc_identifier, so that anyone watching it will
+ * realize that the tupdesc has changed.
+ */
+ typentry->tupDesc_identifier = 0;
+ }
+
+ /* Reset equality/comparison/hashing validity information */
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /* Call delete_rel_type_cache() if we actually cleared something */
+ if (hadTupDescOrOpclass)
+ delete_rel_type_cache_if_needed(typentry);
+}
+
/*
* TypeCacheRelCallback
* Relcache inval callback function
@@ -2299,63 +2407,55 @@ SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *registry)
* whatever info we have cached about the composite type's comparability.
*
* This is called when a relcache invalidation event occurs for the given
- * relid. We must scan the whole typcache hash since we don't know the
- * type OID corresponding to the relid. We could do a direct search if this
- * were a syscache-flush callback on pg_type, but then we would need all
- * ALTER-TABLE-like commands that could modify a rowtype to issue syscache
- * invals against the rel's pg_type OID. The extra SI signaling could very
- * well cost more than we'd save, since in most usages there are not very
- * many entries in a backend's typcache. The risk of bugs-of-omission seems
- * high, too.
- *
- * Another possibility, with only localized impact, is to maintain a second
- * hashtable that indexes composite-type typcache entries by their typrelid.
- * But it's still not clear it's worth the trouble.
+ * relid. We can't use the syscache to find the type corresponding to the
+ * given relation because this code can be called outside of a transaction.
+ * Thus, we use the RelIdToTypeIdCacheHash map to locate the typcache entry.
*/
static void
TypeCacheRelCallback(Datum arg, Oid relid)
{
- HASH_SEQ_STATUS status;
TypeCacheEntry *typentry;
- /* TypeCacheHash must exist, else this callback wouldn't be registered */
- hash_seq_init(&status, TypeCacheHash);
- while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ /*
+ * RelIdToTypeIdCacheHash and TypeCacheHash should exist, otherwise this
+ * callback wouldn't be registered
+ */
+ if (OidIsValid(relid))
{
- if (typentry->typtype == TYPTYPE_COMPOSITE)
+ RelIdToTypeIdCacheEntry *relentry;
+
+ /*
+ * Find a RelIdToTypeIdCacheHash entry, which should exist as long as the
+ * corresponding typcache entry has something to clean.
+ */
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &relid,
+ HASH_FIND, NULL);
+
+ if (relentry != NULL)
{
- /* Skip if no match, unless we're zapping all composite types */
- if (relid != typentry->typrelid && relid != InvalidOid)
- continue;
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &relentry->composite_typid,
+ HASH_FIND, NULL);
- /* Delete tupdesc if we have it */
- if (typentry->tupDesc != NULL)
+ if (typentry != NULL)
{
- /*
- * Release our refcount, and free the tupdesc if none remain.
- * (Can't use DecrTupleDescRefCount because this reference is
- * not logged in current resource owner.)
- */
- Assert(typentry->tupDesc->tdrefcount > 0);
- if (--typentry->tupDesc->tdrefcount == 0)
- FreeTupleDesc(typentry->tupDesc);
- typentry->tupDesc = NULL;
+ Assert(typentry->typtype == TYPTYPE_COMPOSITE);
+ Assert(relid == typentry->typrelid);
- /*
- * Also clear tupDesc_identifier, so that anything watching
- * that will realize that the tupdesc has possibly changed.
- * (Alternatively, we could specify that to detect possible
- * tupdesc change, one must check for tupDesc != NULL as well
- * as tupDesc_identifier being the same as what was previously
- * seen. That seems error-prone.)
- */
- typentry->tupDesc_identifier = 0;
+ InvalidateCompositeTypeCacheEntry(typentry);
}
-
- /* Reset equality/comparison/hashing validity information */
- typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
- else if (typentry->typtype == TYPTYPE_DOMAIN)
+
+ /*
+ * Visit all the domain types sequentially. Typically, this shouldn't
+ * affect performance since domain types are less prone to bloat.
+ * Domain types are created manually, unlike composite types which are
+ * automatically created for every temporary table.
+ */
+ for (typentry = firstDomainTypeEntry;
+ typentry != NULL;
+ typentry = typentry->nextDomain)
{
/*
* If it's domain over composite, reset flags. (We don't bother
@@ -2367,6 +2467,36 @@ TypeCacheRelCallback(Datum arg, Oid relid)
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
}
}
+ else
+ {
+ HASH_SEQ_STATUS status;
+
+ /*
+ * Relid is invalid. By convention, we need to reset all composite
+ * types in the cache. We also need to reset flags for domain types,
+ * and since we loop over all hash entries anyway, do both in one scan.
+ */
+ hash_seq_init(&status, TypeCacheHash);
+ while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
+ {
+ if (typentry->typtype == TYPTYPE_COMPOSITE)
+ {
+ InvalidateCompositeTypeCacheEntry(typentry);
+ }
+ else if (typentry->typtype == TYPTYPE_DOMAIN)
+ {
+ /*
+ * If it's domain over composite, reset flags. (We don't
+ * bother trying to determine whether the specific base type
+ * needs a reset.) Note that if we haven't determined whether
+ * the base type is composite, we don't need to reset
+ * anything.
+ */
+ if (typentry->flags & TCFLAGS_DOMAIN_BASE_IS_COMPOSITE)
+ typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+ }
+ }
+ }
}
/*
@@ -2397,6 +2527,8 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadPgTypeData = (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA);
+
Assert(hashvalue == 0 || typentry->type_id_hash == hashvalue);
/*
@@ -2406,6 +2538,13 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
*/
typentry->flags &= ~(TCFLAGS_HAVE_PG_TYPE_DATA |
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
+
+ /*
+ * Call delete_rel_type_cache() if we cleaned
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
+ */
+ if (hadPgTypeData)
+ delete_rel_type_cache_if_needed(typentry);
}
}
@@ -2914,3 +3053,135 @@ shared_record_typmod_registry_detach(dsm_segment *segment, Datum datum)
}
CurrentSession->shared_typmod_registry = NULL;
}
+
+/*
+ * Insert RelIdToTypeIdCacheHash entry if needed.
+ */
+static void
+insert_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Insert a RelIdToTypeIdCacheHash entry if the typentry has any
+ * information indicating it should be here.
+ */
+ if ((typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) ||
+ (typentry->flags & TCFLAGS_OPERATOR_FLAGS) ||
+ typentry->tupDesc != NULL)
+ {
+ RelIdToTypeIdCacheEntry *relentry;
+ bool found;
+
+ relentry = (RelIdToTypeIdCacheEntry *) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_ENTER, &found);
+ relentry->relid = typentry->typrelid;
+ relentry->composite_typid = typentry->type_id;
+ }
+}
+
+/*
+ * Delete the RelIdToTypeIdCacheHash entry if needed after resetting the
+ * TCFLAGS_HAVE_PG_TYPE_DATA flag, any of TCFLAGS_OPERATOR_FLAGS,
+ * or tupDesc.
+ */
+static void
+delete_rel_type_cache_if_needed(TypeCacheEntry *typentry)
+{
+#ifdef USE_ASSERT_CHECKING
+ int i;
+ bool is_in_progress = false;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ if (in_progress_list[i] == typentry->type_id)
+ {
+ is_in_progress = true;
+ break;
+ }
+ }
+#endif
+
+ /* Immediately quit for non-composite types */
+ if (typentry->typtype != TYPTYPE_COMPOSITE)
+ return;
+
+ /* typrelid should be given for composite types */
+ Assert(OidIsValid(typentry->typrelid));
+
+ /*
+ * Delete a RelIdToTypeIdCacheHash entry if the typentry doesn't have any
+ * information indicating the entry should still be there.
+ */
+ if (!(typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) &&
+ !(typentry->flags & TCFLAGS_OPERATOR_FLAGS) &&
+ typentry->tupDesc == NULL)
+ {
+ bool found;
+
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_REMOVE, &found);
+ Assert(found || is_in_progress);
+ }
+ else
+ {
+#ifdef USE_ASSERT_CHECKING
+ /*
+ * Otherwise, in assert-enabled builds, check that the
+ * RelIdToTypeIdCacheHash entry exists when it should.
+ */
+ bool found;
+
+ if (!is_in_progress)
+ {
+ (void) hash_search(RelIdToTypeIdCacheHash,
+ &typentry->typrelid,
+ HASH_FIND, &found);
+ Assert(found);
+ }
+#endif
+ }
+}
+
+/*
+ * Add possibly missing RelIdToTypeId entries related to TypeCacheHash
+ * entries marked as in-progress by lookup_type_cache(). This may happen
+ * in the case of an error or interruption during the lookup_type_cache() call.
+ */
+static void
+finalize_in_progress_typentries(void)
+{
+ int i;
+
+ for (i = 0; i < in_progress_list_len; i++)
+ {
+ TypeCacheEntry *typentry;
+
+ typentry = (TypeCacheEntry *) hash_search(TypeCacheHash,
+ &in_progress_list[i],
+ HASH_FIND, NULL);
+ if (typentry)
+ insert_rel_type_cache_if_needed(typentry);
+ }
+
+ in_progress_list_len = 0;
+}
+
+void
+AtEOXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
+
+void
+AtEOSubXact_TypeCache(void)
+{
+ finalize_in_progress_typentries();
+}
diff --git a/src/include/utils/typcache.h b/src/include/utils/typcache.h
index f506cc4aa35..f3d73ecee3a 100644
--- a/src/include/utils/typcache.h
+++ b/src/include/utils/typcache.h
@@ -207,4 +207,8 @@ extern void SharedRecordTypmodRegistryInit(SharedRecordTypmodRegistry *,
extern void SharedRecordTypmodRegistryAttach(SharedRecordTypmodRegistry *);
+extern void AtEOXact_TypeCache(void);
+
+extern void AtEOSubXact_TypeCache(void);
+
#endif /* TYPCACHE_H */
diff --git a/src/test/modules/Makefile b/src/test/modules/Makefile
index 256799f520a..c0d3cf0e14b 100644
--- a/src/test/modules/Makefile
+++ b/src/test/modules/Makefile
@@ -43,9 +43,9 @@ SUBDIRS = \
ifeq ($(enable_injection_points),yes)
-SUBDIRS += injection_points gin
+SUBDIRS += injection_points gin typcache
else
-ALWAYS_SUBDIRS += injection_points gin
+ALWAYS_SUBDIRS += injection_points gin typcache
endif
ifeq ($(with_ssl),openssl)
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index d8fe059d236..c829b619530 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -36,6 +36,7 @@ subdir('test_rls_hooks')
subdir('test_shm_mq')
subdir('test_slru')
subdir('test_tidstore')
+subdir('typcache')
subdir('unsafe_tests')
subdir('worker_spi')
subdir('xid_wraparound')
diff --git a/src/test/modules/typcache/.gitignore b/src/test/modules/typcache/.gitignore
new file mode 100644
index 00000000000..5dcb3ff9723
--- /dev/null
+++ b/src/test/modules/typcache/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/src/test/modules/typcache/Makefile b/src/test/modules/typcache/Makefile
new file mode 100644
index 00000000000..1f03de83890
--- /dev/null
+++ b/src/test/modules/typcache/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/typcache/Makefile
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+REGRESS = typcache_rel_type_cache
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/typcache
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/typcache/expected/typcache_rel_type_cache.out b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
new file mode 100644
index 00000000000..b113e0bbd5d
--- /dev/null
+++ b/src/test/modules/typcache/expected/typcache_rel_type_cache.out
@@ -0,0 +1,34 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use an injection point to simulate an error, but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to the type
+-- cache entry should be created. This is validated by a subsequent
+-- modification of the table schema, then a type cast that uses the new
+-- schema, implying successful type cache invalidation by relation oid.
+--
+CREATE EXTENSION injection_points;
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+SELECT '(1)'::t;
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 1: SELECT '(1)'::t;
+ ^
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
+ t
+-------
+ (1,2)
+(1 row)
+
diff --git a/src/test/modules/typcache/meson.build b/src/test/modules/typcache/meson.build
new file mode 100644
index 00000000000..cb2e34c0d2b
--- /dev/null
+++ b/src/test/modules/typcache/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2024, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'typcache',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'typcache_rel_type_cache',
+ ],
+ },
+}
diff --git a/src/test/modules/typcache/sql/typcache_rel_type_cache.sql b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
new file mode 100644
index 00000000000..2c0a434d988
--- /dev/null
+++ b/src/test/modules/typcache/sql/typcache_rel_type_cache.sql
@@ -0,0 +1,18 @@
+--
+-- This test checks that lookup_type_cache() can correctly handle an
+-- interruption. We use an injection point to simulate an error, but note
+-- that a similar situation could happen due to user query interruption.
+-- Despite the interruption, a map entry from the relation oid to the type
+-- cache entry should be created. This is validated by a subsequent
+-- modification of the table schema, then a type cast that uses the new
+-- schema, implying successful type cache invalidation by relation oid.
+--
+
+CREATE EXTENSION injection_points;
+
+CREATE TABLE t (i int);
+SELECT injection_points_attach('typecache-before-rel-type-cache-insert', 'error');
+SELECT '(1)'::t;
+SELECT injection_points_detach('typecache-before-rel-type-cache-insert');
+ALTER TABLE t ADD COLUMN j int;
+SELECT '(1,2)'::t;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 57de1acff3a..9b3e7fd104b 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -2378,6 +2378,7 @@ RelFileLocator
RelFileLocatorBackend
RelFileNumber
RelIdCacheEnt
+RelIdToTypeIdCacheEntry
RelInfo
RelInfoArr
RelMapFile
--
2.39.5 (Apple Git-154)
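To summarize the recovery mechanism spread across the diff above:
lookup_type_cache() registers itself in in_progress_list before doing
anything that can fail, pops its own slot on success, and the
AtEOXact_TypeCache()/AtEOSubXact_TypeCache() hooks repair
RelIdToTypeIdCacheHash for anything left behind. A self-contained toy
model of that control flow (plain C, not PostgreSQL code; the names are
only suggestive stand-ins):

	#include <stdio.h>

	static int	in_progress[16];	/* stand-in for in_progress_list */
	static int	in_progress_len;

	/* Stand-in for insert_rel_type_cache_if_needed(). */
	static void
	sync_map(int id)
	{
		printf("map entry ensured for %d\n", id);
	}

	/* Stand-in for lookup_type_cache(); 'fail' simulates elog(ERROR). */
	static int
	lookup(int id, int fail)
	{
		int		slot = in_progress_len++;

		in_progress[slot] = id;
		if (fail)
			return -1;			/* error escapes before the pop below */
		in_progress_len--;		/* success: pop our own slot */
		sync_map(id);
		return 0;
	}

	/* Stand-in for AtEOXact_TypeCache()/finalize_in_progress_typentries(). */
	static void
	at_eoxact(void)
	{
		for (int i = 0; i < in_progress_len; i++)
			sync_map(in_progress[i]);
		in_progress_len = 0;
	}

	int
	main(void)
	{
		lookup(42, 1);			/* interrupted lookup leaves a slot behind */
		at_eoxact();			/* transaction end repairs the map */
		return 0;
	}

Running the toy model prints one "map entry ensured" line from the
finalize pass, mirroring how an interrupted lookup still ends up with a
consistent map.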
Hi,
On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
Thank you, Pavel! 0001 revised according to your suggestion.
Starting with this commit CI fails.
https://cirrus-ci.com/task/6668851469877248
https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/inherit.out 2024-10-24 11:38:43.829712000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress-running/regress/results/inherit.out 2024-10-24 11:44:57.154238000 +0000
@@ -1338,14 +1338,9 @@
ERROR: cannot drop inherited constraint "f1_pos" of relation "p1_c1"
alter table p1 drop constraint f1_pos;
\d p1_c1
- Table "public.p1_c1"
- Column | Type | Collation | Nullable | Default
---------+---------+-----------+----------+---------
- f1 | integer | | |
-Check constraints:
- "f1_pos" CHECK (f1 > 0)
-Inherits: p1
-
+ERROR: error triggered for injection point typecache-before-rel-type-cache-insert
+LINE 4: ORDER BY 1;
+ ^
drop table p1 cascade;
NOTICE: drop cascades to table p1_c1
create table p1(f1 int constraint f1_pos CHECK (f1 > 0));
Greetings,
Andres
On Fri, Oct 25, 2024 at 11:35 AM Andres Freund <andres@anarazel.de> wrote:
On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
Thank you, Pavel! 0001 revised according to your suggestion.
Starting with this commit CI fails.
https://cirrus-ci.com/task/6668851469877248
https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs
Thank you for reporting this.
It looks weird that an injection point which isn't used in these tests got
triggered here.
I'm looking into this.
------
Regards,
Alexander Korotkov
Supabase
On Fri, Oct 25, 2024 at 12:48 PM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
On Fri, Oct 25, 2024 at 11:35 AM Andres Freund <andres@anarazel.de> wrote:
On 2024-10-22 20:33:24 +0300, Alexander Korotkov wrote:
Thank you, Pavel! 0001 revised according to your suggestion.
Starting with this commit CI fails.
https://cirrus-ci.com/task/6668851469877248
https://api.cirrus-ci.com/v1/artifact/task/6668851469877248/testrun/build/testrun/regress-running/regress/regression.diffs
Thank you for reporting this.
It looks weird that an injection point which isn't used in these tests got
triggered here.
I'm looking into this.
Oh, I forgot to make the injection points in typcache_rel_type_cache.sql
local, so they affected concurrent tests. This is fixed in
aa1e898dea.
------
Regards,
Alexander Korotkov
Supabase
On Tue, Oct 22, 2024 at 08:33:24PM +0300, Alexander Korotkov wrote:
On Tue, Oct 22, 2024 at 6:10 PM Pavel Borisov <pashkin.elfe@gmail.com> wrote:
On Tue, 22 Oct 2024 at 11:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:
I'm going to push this if no objections.
(This became commit b85a9d0.)
+	/* Call delete_rel_type_cache() if we actually cleared something */
+	if (hadTupDescOrOpclass)
+		delete_rel_type_cache_if_needed(typentry);
I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
entry exists if and only if certain kinds of data appear in the TypeCacheHash
entry. However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
maintaining RelIdToTypeIdCacheHash. Is it right to do that?
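The invariant in question, as established by
insert_rel_type_cache_if_needed() and delete_rel_type_cache_if_needed(),
can be spelled out as a predicate over typcache.c's private flags. The
helper below is hypothetical, shown only to make the condition explicit:

	/*
	 * Hypothetical predicate (not in the patch): a RelIdToTypeIdCacheHash
	 * entry should exist exactly when this returns true for a composite
	 * type's typcache entry.
	 */
	static bool
	rel_type_cache_entry_expected(const TypeCacheEntry *typentry)
	{
		return (typentry->flags & TCFLAGS_HAVE_PG_TYPE_DATA) != 0 ||
			   (typentry->flags & TCFLAGS_OPERATOR_FLAGS) != 0 ||
			   typentry->tupDesc != NULL;
	}

TypeCacheOpcCallback() can flip this predicate from true to false by
clearing TCFLAGS_OPERATOR_FLAGS, without updating the map.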
On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
I think the intent was to maintain the invariant that a RelIdToTypeIdCacheHash
entry exists if and only if certain kinds of data appear in the TypeCacheHash
entry. However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
maintaining RelIdToTypeIdCacheHash. Is it right to do that?
Thank you for the question. I'll recheck this in the next couple of days.
------
Regards,
Alexander Korotkov
Supabase
Hi, Noah!
On Sat, Apr 12, 2025 at 12:43 AM Alexander Korotkov
<aekorotkov@gmail.com> wrote:
On Fri, Apr 11, 2025 at 11:32 PM Noah Misch <noah@leadboat.com> wrote:
However, TypeCacheOpcCallback() clears TCFLAGS_OPERATOR_FLAGS without
maintaining RelIdToTypeIdCacheHash. Is it right to do that?
Sorry for the delay. Generally, your finding is correct, but I didn't
manage to reproduce a situation where the existing code leads to a real
error. To get one, we would need a typcache entry without
TCFLAGS_HAVE_PG_TYPE_DATA and tupDesc, but with some of
TCFLAGS_OPERATOR_FLAGS set. Resetting TCFLAGS_HAVE_PG_TYPE_DATA for a
composite type doesn't seem to be possible without resetting the rest at
the same time.
Nevertheless, I think it would be fragile to leave the current code as
is. Even if there is no case of a real error (or I just didn't manage to
find one), one could appear after further changes to the type cache
code. So, the fix is attached.
------
Regards,
Alexander Korotkov
Supabase
Attachments:
v1-0001-Maintain-RelIdToTypeIdCacheHash-in-TypeCacheOpcCa.patchapplication/octet-stream; name=v1-0001-Maintain-RelIdToTypeIdCacheHash-in-TypeCacheOpcCa.patchDownload
From f44dac0d623783aa7bb3ab03eb9c91bb76d1ae87 Mon Sep 17 00:00:00 2001
From: Alexander Korotkov <akorotkov@postgresql.org>
Date: Mon, 21 Apr 2025 01:40:32 +0300
Subject: [PATCH v1] Maintain RelIdToTypeIdCacheHash in TypeCacheOpcCallback()
b85a9d046efd introduced a new RelIdToTypeIdCacheHash, whose entries should
exist for typecache entries with TCFLAGS_HAVE_PG_TYPE_DATA flag set or any
of TCFLAGS_OPERATOR_FLAGS set or tupDesc set. However, TypeCacheOpcCallback(),
which resets TCFLAGS_OPERATOR_FLAGS, was forgotten to update
RelIdToTypeIdCacheHash.
This commit adds a delete_rel_type_cache_if_needed() call to the
TypeCacheOpcCallback() function to maintain RelIdToTypeIdCacheHash after
resetting TCFLAGS_OPERATOR_FLAGS.
Also, this commit fixes the name of the delete_rel_type_cache_if_needed()
function where it is mentioned in comments.
Reported-by: Noah Misch
Discussion: https://postgr.es/m/20250411203241.e9.nmisch%40google.com
---
src/backend/utils/cache/typcache.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index ae65a1cce06..560f5595fda 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -2395,7 +2395,7 @@ InvalidateCompositeTypeCacheEntry(TypeCacheEntry *typentry)
/* Reset equality/comparison/hashing validity information */
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
- /* Call delete_rel_type_cache() if we actually cleared something */
+ /* Call delete_rel_type_cache_if_needed() if we actually cleared something */
if (hadTupDescOrOpclass)
delete_rel_type_cache_if_needed(typentry);
}
@@ -2542,7 +2542,7 @@ TypeCacheTypCallback(Datum arg, int cacheid, uint32 hashvalue)
TCFLAGS_CHECKED_DOMAIN_CONSTRAINTS);
/*
- * Call delete_rel_type_cache() if we cleaned
+ * Call delete_rel_type_cache_if_needed() if we cleaned
* TCFLAGS_HAVE_PG_TYPE_DATA flag previously.
*/
if (hadPgTypeData)
@@ -2576,8 +2576,17 @@ TypeCacheOpcCallback(Datum arg, int cacheid, uint32 hashvalue)
hash_seq_init(&status, TypeCacheHash);
while ((typentry = (TypeCacheEntry *) hash_seq_search(&status)) != NULL)
{
+ bool hadOpclass = (typentry->flags & TCFLAGS_OPERATOR_FLAGS);
+
/* Reset equality/comparison/hashing validity information */
typentry->flags &= ~TCFLAGS_OPERATOR_FLAGS;
+
+ /*
+ * Call delete_rel_type_cache_if_needed() if we actually cleared
+ * something
+ */
+ if (hadOpclass)
+ delete_rel_type_cache_if_needed(typentry);
}
}
--
2.39.5 (Apple Git-154)
On Mon, Apr 21, 2025 at 04:54:08AM +0300, Alexander Korotkov wrote:
Generally, your finding is correct, but I didn't manage to reproduce a
situation where the existing code leads to a real error. To get one, we
would need a typcache entry without TCFLAGS_HAVE_PG_TYPE_DATA and
tupDesc, but with some of TCFLAGS_OPERATOR_FLAGS set.
That makes sense.
Resetting TCFLAGS_HAVE_PG_TYPE_DATA for a composite type doesn't seem to
be possible without resetting the rest at the same time. Nevertheless, I
think it would be fragile to leave the current code as is. Even if there
is no case of a real error (or I just didn't manage to find one), one
could appear after further changes to the type cache code. So, the fix
is attached.
This change looks appropriate. Thanks.
Hi, Noah!
On Tue, Apr 29, 2025 at 3:56 AM Noah Misch <noah@leadboat.com> wrote:
This change looks appropriate. Thanks.
Thank you for your feedback!
------
Regards,
Alexander Korotkov
Supabase