Huge memory consumption on partitioned table with FKs

keisuke.kuroda.3862@gmail.com

about 5 years ago

In reply to: Tatsuro Yamada (#1)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

Hi Hackers,

Analyzed the problem and created a patch to resolve it.

# Problem 1

When you create a foreign key to a partitioned table, referential
integrity function is created for the number of partitions.
Internally, SPI_prepare() creates a plan and SPI_keepplan()
caches the plan.

The more partitions in the referencing table, the more plans will
be cached.

# Problem 2

When referenced table is partitioned table, the larger the number
of partitions, the larger the plan size to be cached.

The actual plan processed is simple and small if pruning is
enabled. However, the cached plan will include all partition
information.

The more partitions in the referenced table, the larger the plan
size to be cached.

# Idea for solution

Constraints with the same pg_constraint.parentid can be combined
into one plan with the same comparentid if we can guarantee that
all their contents are the same.

Problem 1 can be solved
and significant memory bloat can be avoided.
CachedPlan: 710MB -> 1466kB

Solving Problem 2 could also reduce memory,
but I don't have a good idea.

Currently, DISCARD ALL does not discard CachedPlan by SPI as in
this case. It may be possible to modify DISCARD ALL to discard
CachedPlan and run it periodically. However, we think the burden
on the user is high.

# result with patch(PG14 HEAD(e522024b) + patch)

name | bytes | pg_size_pretty
------------------+---------+----------------
CachedPlanQuery | 12912 | 13 kB
CachedPlanSource | 17448 | 17 kB
CachedPlan | 1501192 | 1466 kB

CachedPlan * 1 Record

postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name
= 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
count
-------
1

postgres=# SELECT * FROM pg_backend_memory_contexts WHERE name =
'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
-[ RECORD 1 ]-+--------------------------------------------------------------------------------------
name | CachedPlan
ident | SELECT 1 FROM "public"."ps" x WHERE "c1"
OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
parent | CacheMemoryContext
level | 2
total_bytes | 2101248
total_nblocks | 12
free_bytes | 613256
free_chunks | 1
used_bytes | 1487992
(1 record)

# result without patch(PG14 HEAD(e522024b))

name | bytes | pg_size_pretty
------------------+-----------+----------------
CachedPlanQuery | 1326280 | 1295 kB
CachedPlanSource | 1474528 | 1440 kB
CachedPlan | 744009200 | 710 MB

CachedPlan * 500 Records

postgres=# SELECT count(*) FROM pg_backend_memory_contexts WHERE name
= 'CachedPlan' AND ident LIKE 'SELECT 1 FROM%';
count
-------
500

Best Regards,
--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

Attachments:

v1_reduce_ri_SPI-plan-hash.patchapplication/octet-stream; name=v1_reduce_ri_SPI-plan-hash.patchDownload

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 7e2b2e3..b0c7f18 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -105,6 +105,7 @@ typedef struct RI_ConstraintInfo
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
+	Oid			conparentid;	/* parent constraint */
 	char		confupdtype;	/* foreign key's ON UPDATE action */
 	char		confdeltype;	/* foreign key's ON DELETE action */
 	char		confmatchtype;	/* foreign key's match type */
@@ -221,6 +222,7 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid ri_GetParentConstOid(Oid constraintOid);
 
 
 /*
@@ -1904,8 +1906,12 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 	/*
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
+	 * If this constraint has conparentid, we use it instead of constraint_id.
+	 * Because constraints which has same conparentid are able to share SPI
+	 * Plan, so this is useful to reduce hashtable size..
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = (riinfo->conparentid == InvalidOid) ?
+				riinfo->constraint_id : ri_GetParentConstOid(riinfo->conparentid);
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2059,6 +2065,7 @@ ri_LoadConstraintInfo(Oid constraintOid)
 	riinfo->confupdtype = conForm->confupdtype;
 	riinfo->confdeltype = conForm->confdeltype;
 	riinfo->confmatchtype = conForm->confmatchtype;
+	riinfo->conparentid = conForm->conparentid;
 
 	DeconstructFkConstraintRow(tup,
 							   &riinfo->nkeys,
@@ -2864,6 +2871,22 @@ ri_HashCompareOp(Oid eq_opr, Oid typeid)
 	return entry;
 }
 
+/*
+ * Find top of parent constraint oid.
+ */
+static Oid
+ri_GetParentConstOid(Oid constraintOid)
+{
+	const RI_ConstraintInfo *riinfo;
+	Oid result;
+
+	riinfo = ri_LoadConstraintInfo(constraintOid);
+
+	result = (riinfo->conparentid == InvalidOid) ?
+		constraintOid :  ri_GetParentConstOid(riinfo->conparentid);
+
+	return result;
+}
 
 /*
  * Given a trigger function OID, determine whether it is an RI trigger,

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Keisuke Kuroda (#2)

Re: Huge memory consumption on partitioned table with FKs

At Thu, 26 Nov 2020 09:59:28 +0900, Keisuke Kuroda <keisuke.kuroda.3862@gmail.com> wrote in

Hi Hackers,

Analyzed the problem and created a patch to resolve it.

# Problem 1

When you create a foreign key to a partitioned table, referential
integrity function is created for the number of partitions.
Internally, SPI_prepare() creates a plan and SPI_keepplan()
caches the plan.

The more partitions in the referencing table, the more plans will
be cached.

# Problem 2

When referenced table is partitioned table, the larger the number
of partitions, the larger the plan size to be cached.

The actual plan processed is simple and small if pruning is
enabled. However, the cached plan will include all partition
information.

The more partitions in the referenced table, the larger the plan
size to be cached.

# Idea for solution

Constraints with the same pg_constraint.parentid can be combined
into one plan with the same comparentid if we can guarantee that
all their contents are the same.

The memory reduction this patch gives seems quite high with a small
footprint.

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

create table t (a int primary key);
create table p (a int, r int references t(a)) partition by range(a);
create table c1 partition of p for values from (0) to (10);
create table c2 (a int, r int);
alter table c2 drop column r;
alter table c2 add column r int;
alter table p attach partition c2 for values from (10) to (20);

In that case riinfo->fk_attnums has different values from other
partitions.

=# select oid, conrelid::regclass, confrelid::regclass, conparentid, conname, conkey from pg_constraint where confrelid = 't'::regclass;

oid | conrelid | confrelid | conparentid | conname | conkey
-------+----------+-----------+-------------+----------+--------
16620 | p | t | 0 | p_r_fkey | {2}
16626 | c1 | t | 16620 | p_r_fkey | {2}
16632 | c2 | t | 16620 | p_r_fkey | {3}
(3 rows)

conkey is copied onto riinfo->fk_attnums.

Problem 1 can be solved
and significant memory bloat can be avoided.
CachedPlan: 710MB -> 1466kB

Solving Problem 2 could also reduce memory,
but I don't have a good idea.

Currently, DISCARD ALL does not discard CachedPlan by SPI as in
this case. It may be possible to modify DISCARD ALL to discard
CachedPlan and run it periodically. However, we think the burden
on the user is high.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

alvherre@alvh.no-ip.org

about 5 years ago

In reply to: Kyotaro Horiguchi (#3)

Re: Huge memory consumption on partitioned table with FKs

On 2020-Nov-26, Kyotaro Horiguchi wrote:

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

I would embed all this knowledge in ri_BuildQueryKey though, without
adding the new function ri_GetParentConstOid. I don't think that
function meaningful abstraction value, and instead it would make what I
suggest more difficult.

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Alvaro Herrera (#4)

Re: Huge memory consumption on partitioned table with FKs

At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2020-Nov-26, Kyotaro Horiguchi wrote:

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

*I* think it's the direction. After an off-list discussion, we
confirmed that even in that case the patch works as is because
fk_attnum (or contuple.conkey) always stores key attnums compatible
to the topmost parent when conparent has a valid value (assuming the
current usage of fk_attnum), but I still feel uneasy to rely on that
unclear behavior.

I would embed all this knowledge in ri_BuildQueryKey though, without
adding the new function ri_GetParentConstOid. I don't think that
function meaningful abstraction value, and instead it would make what I
suggest more difficult.

It seems to me reasonable.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Corey Huinker

corey.huinker@gmail.com

about 5 years ago

In reply to: Alvaro Herrera (#4)

Re: Huge memory consumption on partitioned table with FKs

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

Somewhat of a detour, but in reviewing the patch for Statement-Level RI
checks, Andres and I observed that SPI made for a large portion of the RI
overhead.

Given that we're already looking at these checks, I was wondering if this
might be the time to consider implementing these checks by directly
scanning the constraint index.

tgl@sss.pgh.pa.us

about 5 years ago

In reply to: Corey Huinker (#6)

Re: Huge memory consumption on partitioned table with FKs

Corey Huinker <corey.huinker@gmail.com> writes:

Given that we're already looking at these checks, I was wondering if this
might be the time to consider implementing these checks by directly
scanning the constraint index.

Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort
into working around the SPI/parser/planner layer, to not a lot of gain.

However, it's not clear to me that that line of thought will work well
for the statement-level-trigger approach. In that case you might be
dealing with enough tuples to make a different plan advisable.

regards, tom lane

Corey Huinker

corey.huinker@gmail.com

about 5 years ago

In reply to: Tom Lane (#7)

Re: Huge memory consumption on partitioned table with FKs

On Mon, Nov 30, 2020 at 9:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Corey Huinker <corey.huinker@gmail.com> writes:

Given that we're already looking at these checks, I was wondering if this
might be the time to consider implementing these checks by directly
scanning the constraint index.

Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort
into working around the SPI/parser/planner layer, to not a lot of gain.

However, it's not clear to me that that line of thought will work well
for the statement-level-trigger approach. In that case you might be
dealing with enough tuples to make a different plan advisable.

regards, tom lane

Bypassing SPI would probably mean that we stay with row level triggers, and
the cached query plan would go away, perhaps replaced by an
already-looked-up-this-tuple hash sorta like what the cached nested loops
effort is doing.

I've been meaning to give this a try when I got some spare time. This may
inspire me to try again.

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#5)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

Hello,

On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2020-Nov-26, Kyotaro Horiguchi wrote:

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

*I* think it's the direction. After an off-list discussion, we
confirmed that even in that case the patch works as is because
fk_attnum (or contuple.conkey) always stores key attnums compatible
to the topmost parent when conparent has a valid value (assuming the
current usage of fk_attnum), but I still feel uneasy to rely on that
unclear behavior.

Hmm, I don't see what the problem is here, because it's not
RI_ConstraintInfo that is being shared among partitions of the same
parent, but the RI query (and the SPIPlanPtr for it) that is issued
through SPI. The query (and the plan) turns out to be the same no
matter what partition's RI_ConstraintInfo is first used to generate
it. What am I missing?

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.
We should really be using the "root" constraint OID as the key,
because there would only be one root from which all constraints in a
given partition hierarchy would've originated. Actually, I had
written a patch a few months back to do exactly that, a polished
version of which I've attached with this email. Please take a look.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-ri_q.patchapplication/octet-stream; name=0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-ri_q.patchDownload

From 04ee4ce8f0d772a4d0d3200a53212391fb67908e Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH] ri_triggers.c: Use root constraint OID as key to
 ri_query_cache

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.
---
 src/backend/utils/adt/ri_triggers.c | 54 ++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a38..851cb97 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,11 +95,17 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache using constraint_id or constraint_root_id if the latter
+ * is not 0.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_root_id;	/* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint was
+									 * inherited; 0 if the constraint has no
+									 * parent */
 	bool		valid;			/* successfully initialized? */
 	uint32		oidHashValue;	/* hash value of pg_constraint OID */
 	NameData	conname;		/* name of the FK constraint */
@@ -221,6 +227,8 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid get_ri_constraint_root(Oid constrOid);
+static Oid get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1900,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  *	Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *		key: output argument, *key is filled in based on the other arguments
- *		riinfo: info from pg_constraint entry
+ *		riinfo: info derived from pg_constraint entry
  *		constr_queryno: an internal number identifying the query type
  *			(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1910,18 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * Using constraint_root_id instead of constraint_id will mean that for
+	 * all partitions descending from the same parent in which the constraint
+	 * originates will share the same SPI plan.  It is okay to share the plan,
+	 * because the same query is used for all partitions for all RI query
+	 * types.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = OidIsValid(riinfo->constraint_root_id) ?
+						riinfo->constraint_root_id :
+						riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2010,6 +2026,35 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 }
 
 /*
+ * get_ri_constraint_root
+ *		Returns a given RI constraint's root parent or 0 if it's not inherited
+ */
+static Oid
+get_ri_constraint_root(Oid constrOid)
+{
+	Oid		constr_root = get_ri_constraint_root_recurse(constrOid);
+
+	return constr_root == constrOid ? InvalidOid : constr_root;
+}
+
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
+/*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
 static const RI_ConstraintInfo *
@@ -2051,6 +2096,7 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->constraint_root_id = get_ri_constraint_root(riinfo->constraint_id);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
-- 
1.8.3.1

#10

Simon Riggs

simon@2ndquadrant.com

about 5 years ago

In reply to: Alvaro Herrera (#4)

Re: Huge memory consumption on partitioned table with FKs

On Tue, 1 Dec 2020 at 00:03, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

cache entry is reused in the common case where they are identical.

Does a similar situation exist for partition statistics accessed
during planning? Or planning itself?

It would be useful to avoid repeated access to similar statistics and
repeated planning of sub-plans for similar partitions.

--
Simon Riggs http://www.EnterpriseDB.com/

#11

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Amit Langote (#9)

Re: Huge memory consumption on partitioned table with FKs

At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hello,

On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2020-Nov-26, Kyotaro Horiguchi wrote:

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

*I* think it's the direction. After an off-list discussion, we
confirmed that even in that case the patch works as is because
fk_attnum (or contuple.conkey) always stores key attnums compatible
to the topmost parent when conparent has a valid value (assuming the
current usage of fk_attnum), but I still feel uneasy to rely on that
unclear behavior.

Hmm, I don't see what the problem is here, because it's not
RI_ConstraintInfo that is being shared among partitions of the same
parent, but the RI query (and the SPIPlanPtr for it) that is issued
through SPI. The query (and the plan) turns out to be the same no
matter what partition's RI_ConstraintInfo is first used to generate
it. What am I missing?

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the otherhand
fk_attnums is actually used to generate the query, thus *issue*
potentially affects the query. Although that might be paranoic, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members). We might even be able to share a riinfo among
such children.

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.
We should really be using the "root" constraint OID as the key,
because there would only be one root from which all constraints in a
given partition hierarchy would've originated. Actually, I had
written a patch a few months back to do exactly that, a polished
version of which I've attached with this email. Please take a look.

I don't like that the function self-recurses but
ri_GetParentConstrOid() actually climbs up to the root constraint, so
the patch covers the multilevel partitioning cases.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale. Of course that dones't matter
practically, though.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#12

keisuke.kuroda.3862@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#11)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

Hi Hackers,

I would embed all this knowledge in ri_BuildQueryKey though, without
adding the new function ri_GetParentConstOid. I don't think that
function meaningful abstraction value, and instead it would make what I
suggest more difficult.

It seems to me reasonable.

Indeed, I tried to get the conparentid with ri_BuildQueryKey.
(v2_reduce_ri_SPI-plan-hash.patch)

Somewhat of a detour, but in reviewing the patch for
Statement-Level RI checks, Andres and I observed that SPI
made for a large portion of the RI overhead.

If this can be resolved by reviewing the Statement-Level RI checks,
then I think it would be better.

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.

Horiguchi-san also mentiond,
In my patch, in ri_GetParentConstOid, iterate on getting
the constraint OID until comparentid == InvalidOid.
This should allow to get the "root constraint OID".

However, the names "comparentid" and "ri_GetParentConstOid"
are confusing. We should use names like "constraint_root_id",
as in Amit-san's patch.

Best Regards,
--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

Attachments:

v2_reduce_ri_SPI-plan-hash.patchapplication/octet-stream; name=v2_reduce_ri_SPI-plan-hash.patchDownload

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 7e2b2e3..7139c8b 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -105,6 +105,7 @@ typedef struct RI_ConstraintInfo
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
+	Oid			conparentid;	/* parent constraint */
 	char		confupdtype;	/* foreign key's ON UPDATE action */
 	char		confdeltype;	/* foreign key's ON DELETE action */
 	char		confmatchtype;	/* foreign key's match type */
@@ -221,6 +222,7 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid ri_GetParentConstOid(Oid constraintOid);
 
 
 /*
@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
 		querysep = "WHERE";
 		for (int i = 0; i < riinfo->nkeys; i++)
 		{
+
+			/*
+			* We share the same plan among all relations in a partition
+			* hierarchy.  The plan is guaranteed to be compatible since all of
+			* the member relations are guaranteed to have the equivalent set
+			* of foreign keys in fk_attnums[].
+			*/
+
 			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
 			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
 
@@ -1904,7 +1914,16 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 	/*
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
+	 * If this constraint has conparentid, we use it instead of constraint_id.
+	 * Because constraints which has same conparentid are able to share SPI
+	 * Plan, so this is useful to reduce hashtable size.
+	 * Consider sub-partitioning and get the top comparentid.
 	 */
+
+	while (riinfo->conparentid != InvalidOid) {
+		riinfo = ri_LoadConstraintInfo(riinfo->conparentid); 
+	}
+
 	key->constr_id = riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
@@ -2059,6 +2078,7 @@ ri_LoadConstraintInfo(Oid constraintOid)
 	riinfo->confupdtype = conForm->confupdtype;
 	riinfo->confdeltype = conForm->confdeltype;
 	riinfo->confmatchtype = conForm->confmatchtype;
+	riinfo->conparentid = conForm->conparentid;
 
 	DeconstructFkConstraintRow(tup,
 							   &riinfo->nkeys,

#13

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#11)

Re: Huge memory consumption on partitioned table with FKs

Hello,

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hello,

On Tue, Dec 1, 2020 at 9:40 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Mon, 30 Nov 2020 21:03:45 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

On 2020-Nov-26, Kyotaro Horiguchi wrote:

This shares RI_ConstraintInfo cache by constraints that shares the
same parent constraints. But you forgot that the cache contains some
members that can differ among partitions.

Consider the case of attaching a partition that have experienced a
column deletion.

I think this can be solved easily in the patch, by having
ri_BuildQueryKey() compare the parent's fk_attnums to the parent; if
they are equal then use the parent's constaint_id, otherwise use the
child constraint. That way, the cache entry is reused in the common
case where they are identical.

*I* think it's the direction. After an off-list discussion, we
confirmed that even in that case the patch works as is because
fk_attnum (or contuple.conkey) always stores key attnums compatible
to the topmost parent when conparent has a valid value (assuming the
current usage of fk_attnum), but I still feel uneasy to rely on that
unclear behavior.

Hmm, I don't see what the problem is here, because it's not
RI_ConstraintInfo that is being shared among partitions of the same
parent, but the RI query (and the SPIPlanPtr for it) that is issued
through SPI. The query (and the plan) turns out to be the same no
matter what partition's RI_ConstraintInfo is first used to generate
it. What am I missing?

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the otherhand
fk_attnums is actually used to generate the query, thus *issue*
potentially affects the query. Although that might be paranoic, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members). We might even be able to share a riinfo among
such children.

Just to be sure I've not misunderstood you, let me mention why I think
the queries used by different partitions all turn out to be the same
despite attribute number differences among partitions. The places
that uses fk_attnums when generating a query are these among others:

Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
...
Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
...
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));

For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column. On the
referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.
We should really be using the "root" constraint OID as the key,
because there would only be one root from which all constraints in a
given partition hierarchy would've originated. Actually, I had
written a patch a few months back to do exactly that, a polished
version of which I've attached with this email. Please take a look.

I don't like that the function self-recurses but
ri_GetParentConstrOid() actually climbs up to the root constraint, so
the patch covers the multilevel partitioning cases.

Sorry, I got too easily distracted by the name Kuroda-san chose for
the field in RI_ConstraintInfo and the function.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale. Of course that dones't matter
practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack(). That would take care of this
unless I'm missing something.

--
Amit Langote
EDB: http://www.enterprisedb.com

#14

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Amit Langote (#13)

Re: Huge memory consumption on partitioned table with FKs

At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hello,

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Wed, 2 Dec 2020 22:02:50 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hmm, I don't see what the problem is here, because it's not
RI_ConstraintInfo that is being shared among partitions of the same
parent, but the RI query (and the SPIPlanPtr for it) that is issued
through SPI. The query (and the plan) turns out to be the same no
matter what partition's RI_ConstraintInfo is first used to generate
it. What am I missing?

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the otherhand
fk_attnums is actually used to generate the query, thus *issue*
potentially affects the query. Although that might be paranoic, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members). We might even be able to share a riinfo among
such children.

Just to be sure I've not misunderstood you, let me mention why I think
the queries used by different partitions all turn out to be the same
despite attribute number differences among partitions. The places
that uses fk_attnums when generating a query are these among others:

Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
...
Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
...
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));

For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column. On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Thus that would be no longer an issue if we explicitly define that
"When conpraentid stores a valid value, each element of fk_attnums
points to logically the same attribute with the RI_ConstraintInfo for
the parent constraint." Or I'd be happy if we have such a comment
there instead.

referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mention that here, though:p)A

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

BTW, one problem with Kuroda-san's patch is that it's using
conparentid as the shared key, which can still lead to multiple
queries being generated when using multi-level partitioning, because
there would be many (intermediate) parent constraints in that case.
We should really be using the "root" constraint OID as the key,
because there would only be one root from which all constraints in a
given partition hierarchy would've originated. Actually, I had
written a patch a few months back to do exactly that, a polished
version of which I've attached with this email. Please take a look.

I don't like that the function self-recurses but
ri_GetParentConstrOid() actually climbs up to the root constraint, so
the patch covers the multilevel partitioning cases.

Sorry, I got too easily distracted by the name Kuroda-san chose for
the field in RI_ConstraintInfo and the function.

I thought the same at the first look to the function.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale. Of course that dones't matter
practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack(). That would take care of this
unless I'm missing something.

Seems to be sound.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#15

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#14)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

I don't think you're missing something. As I (tried to..) mentioned in
another branch of this thread, you're right. On the otherhand
fk_attnums is actually used to generate the query, thus *issue*
potentially affects the query. Although that might be paranoic, that
possibility is eliminated by checking the congruency(?) of fk_attnums
(or other members). We might even be able to share a riinfo among
such children.

Just to be sure I've not misunderstood you, let me mention why I think
the queries used by different partitions all turn out to be the same
despite attribute number differences among partitions. The places
that uses fk_attnums when generating a query are these among others:

Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
...
Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
...
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));

For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column. On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Well, I think it's great that we don't have to worry *in this part of
the code* about partition's fk_attnums not being congruent with the
root parent's, because ensuring that is the responsibility of the
other parts of the system such as DDL. If we have any problems in
this area, they should be dealt with by ensuring that there are no
bugs in those other parts.

Thus that would be no longer an issue if we explicitly define that
"When conpraentid stores a valid value, each element of fk_attnums
points to logically the same attribute with the RI_ConstraintInfo for
the parent constraint." Or I'd be happy if we have such a comment
there instead.

I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
address this point, but the placement needs to be reconsidered:

@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
        querysep = "WHERE";
        for (int i = 0; i < riinfo->nkeys; i++)
        {
+
+           /*
+           * We share the same plan among all relations in a partition
+           * hierarchy.  The plan is guaranteed to be compatible since all of
+           * the member relations are guaranteed to have the equivalent set
+           * of foreign keys in fk_attnums[].
+           */
+
            Oid         pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
            Oid         fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);

A more appropriate place for this kind of comment would be where
fk_attnums is defined or in ri_BuildQueryKey() that is shared by
different RI query issuing functions.

referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mention that here, though:p)A

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

Ah, something maybe worth trying. Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale. Of course that dones't matter
practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack(). That would take care of this
unless I'm missing something.

Seems to be sound.

Okay, thanks.

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

v2-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patchapplication/octet-stream; name=v2-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patchDownload

From d9fe2afcc9abe096334c058585cc577a1af86221 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH v2] ri_triggers.c: Use root constraint OID as key to
 ri_query_cache

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

Keisuke Kuroda, Amit Langote
---
 src/backend/utils/adt/ri_triggers.c | 75 ++++++++++++++++++++++++++++++++++---
 1 file changed, 69 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a38..223225a 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,13 +95,21 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache using constraint_id or constraint_root_id if the latter
+ * is not 0.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_parent;	/* OID of parent of this constraint */
+	Oid			constraint_root_id;	/* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint was
+									 * inherited; 0 if the constraint has no
+									 * parent */
 	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	uint32		oidHashValue;	/* hash value of constraint_id */
+	uint32		rootHashValue;	/* hash value of constraint_root_id */
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
@@ -221,6 +229,8 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid get_ri_constraint_root(const RI_ConstraintInfo *riinfo);
+static Oid get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1902,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  *	Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *		key: output argument, *key is filled in based on the other arguments
- *		riinfo: info from pg_constraint entry
+ *		riinfo: info derived from pg_constraint entry
  *		constr_queryno: an internal number identifying the query type
  *			(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1912,20 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * Using constraint_root_id instead of constraint_id means that for all
+	 * partitions descending from the same ancestor in which the constraint
+	 * originates will share the same SPI plan.  It is okay to share the plan,
+	 * because the same query is used for all partitions for all RI query
+	 * types.  That is true despite the fact that riinfo->fk_attnums may
+	 * differ among partitions sharing the query, because they all refer to
+	 * the same logical column set.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = OidIsValid(riinfo->constraint_root_id) ?
+						riinfo->constraint_root_id :
+						riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2010,6 +2030,36 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 }
 
 /*
+ * get_ri_constraint_root
+ *		Returns a given RI constraint's root parent or 0 if it's not inherited
+ */
+static Oid
+get_ri_constraint_root(const RI_ConstraintInfo *riinfo)
+{
+	if (!OidIsValid(riinfo->constraint_parent))
+		return InvalidOid;
+
+	return get_ri_constraint_root_recurse(riinfo->constraint_parent);
+}
+
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
+/*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
 static const RI_ConstraintInfo *
@@ -2051,8 +2101,12 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->constraint_parent = conForm->conparentid;
+	riinfo->constraint_root_id = get_ri_constraint_root(riinfo);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
+	riinfo->rootHashValue = GetSysCacheHashValue1(CONSTROID,
+												  ObjectIdGetDatum(riinfo->constraint_root_id));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
 	riinfo->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
@@ -2117,7 +2171,16 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 		RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
 													valid_link, iter.cur);
 
-		if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
+		/*
+		 * We look for changes to two specific pg_constraint entries here --
+		 * the one matching the constraint given by riinfo->constraint_id and
+		 * also the one given by riinfo->constraint_root_id.  The latter too
+		 * because if its parent is updated, it is no longer the root
+		 * constraint.
+		 */
+		if (hashvalue == 0 ||
+			riinfo->oidHashValue == hashvalue ||
+			riinfo->rootHashValue == hashvalue)
 		{
 			riinfo->valid = false;
 			/* Remove invalidated entries from the list, too */
-- 
1.8.3.1

#16

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Amit Langote (#15)

2 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Thu, Dec 3, 2020 at 2:29 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Thu, 3 Dec 2020 12:27:53 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Thu, Dec 3, 2020 at 10:15 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
For the queries on the referencing side ("check" side),
type/collation/attribute name determined using the above are going to
be the same for all partitions in a given tree irrespective of the
attribute number, because they're logically the same column. On the

Yes, I know that, which is what I meant by "practically" or
"actually", but it is not explicitly defined AFAICS.

Well, I think it's great that we don't have to worry *in this part of
the code* about partition's fk_attnums not being congruent with the
root parent's, because ensuring that is the responsibility of the
other parts of the system such as DDL. If we have any problems in
this area, they should be dealt with by ensuring that there are no
bugs in those other parts.

Agreed.

Thus that would be no longer an issue if we explicitly define that
"When conpraentid stores a valid value, each element of fk_attnums
points to logically the same attribute with the RI_ConstraintInfo for
the parent constraint." Or I'd be happy if we have such a comment
there instead.

I saw a comment in Kuroda-san's v2 patch that is perhaps meant to
address this point, but the placement needs to be reconsidered:

Ah, yes, that comes from my proposal.

@@ -366,6 +368,14 @@ RI_FKey_check(TriggerData *trigdata)
querysep = "WHERE";
for (int i = 0; i < riinfo->nkeys; i++)
{
+
+           /*
+           * We share the same plan among all relations in a partition
+           * hierarchy.  The plan is guaranteed to be compatible since all of
+           * the member relations are guaranteed to have the equivalent set
+           * of foreign keys in fk_attnums[].
+           */
+
Oid         pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid         fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);

A more appropriate place for this kind of comment would be where
fk_attnums is defined or in ri_BuildQueryKey() that is shared by
different RI query issuing functions.

Yeah, I wanted more appropriate place for the comment. That place
seems reasonable.

referenced side ("restrict", "cascade", "set" side), as you already
mentioned, fk_attnums refers to the top parent table of the
referencing side, so no possibility of they being different in the
various referenced partitions' RI_ConstraintInfos.

Right. (I'm not sure I have mention that here, though:p)A

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

fk_attnums *doesn't* refers to to top parent talbe of the referencing
side. it refers to attributes of the partition that is compatible with
the same element of fk_attnums of the topmost parent. Maybe I'm
misreading.

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

Ah, something maybe worth trying. Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

I agree that plans are rather large but the sharable part of the
RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
comparing to the plans. But that has somewhat large footprint.. (See
the attached)

About your patch, it calculates the root constrid at the time an
riinfo is created, but when the root-partition is further attached to
another partitioned-table after the riinfo creation,
constraint_root_id gets stale. Of course that dones't matter
practically, though.

Maybe we could also store the hash value of the root constraint OID as
rootHashValue and check for that one too in
InvalidateConstraintCacheCallBack(). That would take care of this
unless I'm missing something.

Seems to be sound.

Okay, thanks.

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-separte-riinfo.patchtext/x-patch; charset=us-asciiDownload

From bbdd0647452a26fdf7da313967f85241c6111fcb Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 3 Dec 2020 16:15:57 +0900
Subject: [PATCH 1/2] separte riinfo

---
 src/backend/utils/adt/ri_triggers.c | 281 ++++++++++++++++------------
 1 file changed, 161 insertions(+), 120 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a3868f..3f8407c311 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -97,14 +97,10 @@
  * Information extracted from an FK pg_constraint entry.  This is cached in
  * ri_constraint_cache.
  */
-typedef struct RI_ConstraintInfo
+typedef struct RI_ConstraintParam
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
-	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
-	NameData	conname;		/* name of the FK constraint */
+	Oid			query_key;
 	Oid			pk_relid;		/* referenced relation */
-	Oid			fk_relid;		/* referencing relation */
 	char		confupdtype;	/* foreign key's ON UPDATE action */
 	char		confdeltype;	/* foreign key's ON DELETE action */
 	char		confmatchtype;	/* foreign key's match type */
@@ -114,6 +110,17 @@ typedef struct RI_ConstraintInfo
 	Oid			pf_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = FK) */
 	Oid			pp_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = PK) */
 	Oid			ff_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (FK = FK) */
+	struct RI_ConstraintInfo *ownerinfo; /* owner info of this param */
+} RI_ConstraintParam;
+
+typedef struct RI_ConstraintInfo
+{
+	Oid			constraint_id;
+	bool		valid;			/* successfully initialized? */
+	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	NameData	conname;		/* name of the FK constraint */
+	Oid			fk_relid;		/* referencing relation */
+	RI_ConstraintParam *param;	/* sharable parameters  */
 	dlist_node	valid_link;		/* Link in list of valid entries */
 } RI_ConstraintInfo;
 
@@ -264,7 +271,7 @@ RI_FKey_check(TriggerData *trigdata)
 	 * SELECT FOR KEY SHARE will get on it.
 	 */
 	fk_rel = trigdata->tg_relation;
-	pk_rel = table_open(riinfo->pk_relid, RowShareLock);
+	pk_rel = table_open(riinfo->param->pk_relid, RowShareLock);
 
 	switch (ri_NullCheck(RelationGetDescr(fk_rel), newslot, riinfo, false))
 	{
@@ -283,7 +290,7 @@ RI_FKey_check(TriggerData *trigdata)
 			 * This is the only case that differs between the three kinds of
 			 * MATCH.
 			 */
-			switch (riinfo->confmatchtype)
+			switch (riinfo->param->confmatchtype)
 			{
 				case FKCONSTR_MATCH_FULL:
 
@@ -364,17 +371,17 @@ RI_FKey_check(TriggerData *trigdata)
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 pk_only, pkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+						 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							attname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							paramname, fk_type);
 			querysep = "AND";
 			queryoids[i] = fk_type;
@@ -382,7 +389,7 @@ RI_FKey_check(TriggerData *trigdata)
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -492,16 +499,16 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 pk_only, pkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+						 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							attname, pk_type,
-							riinfo->pp_eq_oprs[i],
+							riinfo->param->pp_eq_oprs[i],
 							paramname, pk_type);
 			querysep = "AND";
 			queryoids[i] = pk_type;
@@ -509,7 +516,7 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -679,19 +686,19 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 fk_only, fkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -701,7 +708,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -785,19 +792,19 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
 		appendStringInfo(&querybuf, "DELETE FROM %s%s",
 						 fk_only, fkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -806,7 +813,7 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
 		}
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -901,22 +908,22 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
 						 fk_only, fkrelname);
 		querysep = "";
 		qualsep = "WHERE";
-		for (int i = 0, j = riinfo->nkeys; i < riinfo->nkeys; i++, j++)
+		for (int i = 0, j = riinfo->param->nkeys; i < riinfo->param->nkeys; i++, j++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			appendStringInfo(&querybuf,
 							 "%s %s = $%d",
 							 querysep, attname, i + 1);
 			sprintf(paramname, "$%d", j + 1);
 			ri_GenerateQual(&qualbuf, qualsep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -928,7 +935,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
 		appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys * 2, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys * 2, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -1080,15 +1087,15 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 						 fk_only, fkrelname);
 		querysep = "";
 		qualsep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			appendStringInfo(&querybuf,
 							 "%s %s = %s",
 							 querysep, attname,
@@ -1096,7 +1103,7 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&qualbuf, qualsep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1107,7 +1114,7 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 		appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -1217,7 +1224,7 @@ RI_FKey_fk_upd_check_required(Trigger *trigger, Relation fk_rel,
 	 */
 	else if (ri_nullcheck == RI_KEYS_SOME_NULL)
 	{
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 
@@ -1333,14 +1340,14 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	fkrte->rellockmode = AccessShareLock;
 	fkrte->requiredPerms = ACL_SELECT;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		int			attno;
 
-		attno = riinfo->pk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
+		attno = riinfo->param->pk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
 		pkrte->selectedCols = bms_add_member(pkrte->selectedCols, attno);
 
-		attno = riinfo->fk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
+		attno = riinfo->param->fk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
 		fkrte->selectedCols = bms_add_member(fkrte->selectedCols, attno);
 	}
 
@@ -1377,10 +1384,10 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	initStringInfo(&querybuf);
 	appendStringInfoString(&querybuf, "SELECT ");
 	sep = "";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		quoteOneName(fkattname,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf, "%sfk.%s", sep, fkattname);
 		sep = ", ";
 	}
@@ -1398,20 +1405,20 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	strcpy(pkattname, "pk.");
 	strcpy(fkattname, "fk.");
 	sep = "(";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
-		Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+		Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 		quoteOneName(pkattname + 3,
-					 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+					 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 		quoteOneName(fkattname + 3,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		ri_GenerateQual(&querybuf, sep,
 						pkattname, pk_type,
-						riinfo->pf_eq_oprs[i],
+						riinfo->param->pf_eq_oprs[i],
 						fkattname, fk_type);
 		if (pk_coll != fk_coll)
 			ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1422,17 +1429,17 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	 * It's sufficient to test any one pk attribute for null to detect a join
 	 * failure.
 	 */
-	quoteOneName(pkattname, RIAttName(pk_rel, riinfo->pk_attnums[0]));
+	quoteOneName(pkattname, RIAttName(pk_rel, riinfo->param->pk_attnums[0]));
 	appendStringInfo(&querybuf, ") WHERE pk.%s IS NULL AND (", pkattname);
 
 	sep = "";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
-		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->fk_attnums[i]));
+		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf,
 						 "%sfk.%s IS NOT NULL",
 						 sep, fkattname);
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 				sep = " AND ";
@@ -1505,7 +1512,10 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		HeapTuple	tuple = SPI_tuptable->vals[0];
 		TupleDesc	tupdesc = SPI_tuptable->tupdesc;
 		RI_ConstraintInfo fake_riinfo;
+		RI_ConstraintParam fake_riparam;
 
+		fake_riinfo.param = &fake_riparam;
+		
 		slot = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
 
 		heap_deform_tuple(tuple, tupdesc,
@@ -1522,15 +1532,15 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		 * or fk_rel's tupdesc.
 		 */
 		memcpy(&fake_riinfo, riinfo, sizeof(RI_ConstraintInfo));
-		for (int i = 0; i < fake_riinfo.nkeys; i++)
-			fake_riinfo.fk_attnums[i] = i + 1;
+		for (int i = 0; i < fake_riinfo.param->nkeys; i++)
+			fake_riinfo.param->fk_attnums[i] = i + 1;
 
 		/*
 		 * If it's MATCH FULL, and there are any nulls in the FK keys,
 		 * complain about that rather than the lack of a match.  MATCH FULL
 		 * disallows partially-null FK rows.
 		 */
-		if (fake_riinfo.confmatchtype == FKCONSTR_MATCH_FULL &&
+		if (fake_riinfo.param->confmatchtype == FKCONSTR_MATCH_FULL &&
 			ri_NullCheck(tupdesc, slot, &fake_riinfo, false) != RI_KEYS_NONE_NULL)
 			ereport(ERROR,
 					(errcode(ERRCODE_FOREIGN_KEY_VIOLATION),
@@ -1615,10 +1625,10 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	initStringInfo(&querybuf);
 	appendStringInfoString(&querybuf, "SELECT ");
 	sep = "";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
 		quoteOneName(fkattname,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf, "%sfk.%s", sep, fkattname);
 		sep = ", ";
 	}
@@ -1633,20 +1643,20 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	strcpy(pkattname, "pk.");
 	strcpy(fkattname, "fk.");
 	sep = "(";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
-		Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+		Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 		quoteOneName(pkattname + 3,
-					 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+					 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 		quoteOneName(fkattname + 3,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		ri_GenerateQual(&querybuf, sep,
 						pkattname, pk_type,
-						riinfo->pf_eq_oprs[i],
+						riinfo->param->pf_eq_oprs[i],
 						fkattname, fk_type);
 		if (pk_coll != fk_coll)
 			ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1666,13 +1676,13 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		appendStringInfoString(&querybuf, ") WHERE (");
 
 	sep = "";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
-		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->fk_attnums[i]));
+		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf,
 						 "%sfk.%s IS NOT NULL",
 						 sep, fkattname);
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 				sep = " AND ";
@@ -1762,8 +1772,8 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		 * or fk_rel's tupdesc.
 		 */
 		memcpy(&fake_riinfo, riinfo, sizeof(RI_ConstraintInfo));
-		for (i = 0; i < fake_riinfo.nkeys; i++)
-			fake_riinfo.pk_attnums[i] = i + 1;
+		for (i = 0; i < fake_riinfo.param->nkeys; i++)
+			fake_riinfo.param->pk_attnums[i] = i + 1;
 
 		ri_ReportViolation(&fake_riinfo, pk_rel, fk_rel,
 						   slot, tupdesc, 0, true);
@@ -1905,7 +1915,7 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = riinfo->param->query_key;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -1983,25 +1993,25 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	if (rel_is_pk)
 	{
 		if (riinfo->fk_relid != trigger->tgconstrrelid ||
-			riinfo->pk_relid != RelationGetRelid(trig_rel))
+			riinfo->param->pk_relid != RelationGetRelid(trig_rel))
 			elog(ERROR, "wrong pg_constraint entry for trigger \"%s\" on table \"%s\"",
 				 trigger->tgname, RelationGetRelationName(trig_rel));
 	}
 	else
 	{
 		if (riinfo->fk_relid != RelationGetRelid(trig_rel) ||
-			riinfo->pk_relid != trigger->tgconstrrelid)
+			riinfo->param->pk_relid != trigger->tgconstrrelid)
 			elog(ERROR, "wrong pg_constraint entry for trigger \"%s\" on table \"%s\"",
 				 trigger->tgname, RelationGetRelationName(trig_rel));
 	}
 
-	if (riinfo->confmatchtype != FKCONSTR_MATCH_FULL &&
-		riinfo->confmatchtype != FKCONSTR_MATCH_PARTIAL &&
-		riinfo->confmatchtype != FKCONSTR_MATCH_SIMPLE)
+	if (riinfo->param->confmatchtype != FKCONSTR_MATCH_FULL &&
+		riinfo->param->confmatchtype != FKCONSTR_MATCH_PARTIAL &&
+		riinfo->param->confmatchtype != FKCONSTR_MATCH_SIMPLE)
 		elog(ERROR, "unrecognized confmatchtype: %d",
-			 riinfo->confmatchtype);
+			 riinfo->param->confmatchtype);
 
-	if (riinfo->confmatchtype == FKCONSTR_MATCH_PARTIAL)
+	if (riinfo->param->confmatchtype == FKCONSTR_MATCH_PARTIAL)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("MATCH PARTIAL not yet implemented")));
@@ -2009,6 +2019,27 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	return riinfo;
 }
 
+/*
+ * get_ri_constraint_root
+ *		Returns a given RI constraint's root parent
+ */
+static Oid
+get_ri_constraint_root(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root(constrParentOid);
+
+	return constrOid;
+}
+
 /*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
@@ -2032,11 +2063,20 @@ ri_LoadConstraintInfo(Oid constraintOid)
 	riinfo = (RI_ConstraintInfo *) hash_search(ri_constraint_cache,
 											   (void *) &constraintOid,
 											   HASH_ENTER, &found);
+
 	if (!found)
 		riinfo->valid = false;
 	else if (riinfo->valid)
 		return riinfo;
 
+	/* allocate riinfo's own param memory if needed */
+	if (!found || riinfo->param->ownerinfo != riinfo)
+	{
+		riinfo->param = (RI_ConstraintParam *)
+			MemoryContextAlloc(TopMemoryContext, sizeof(RI_ConstraintParam));
+		riinfo->param->ownerinfo = riinfo;
+	}
+
 	/*
 	 * Fetch the pg_constraint row so we can fill in the entry.
 	 */
@@ -2051,22 +2091,23 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->param->query_key = get_ri_constraint_root(riinfo->constraint_id);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
-	riinfo->pk_relid = conForm->confrelid;
+	riinfo->param->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
-	riinfo->confupdtype = conForm->confupdtype;
-	riinfo->confdeltype = conForm->confdeltype;
-	riinfo->confmatchtype = conForm->confmatchtype;
+	riinfo->param->confupdtype = conForm->confupdtype;
+	riinfo->param->confdeltype = conForm->confdeltype;
+	riinfo->param->confmatchtype = conForm->confmatchtype;
 
 	DeconstructFkConstraintRow(tup,
-							   &riinfo->nkeys,
-							   riinfo->fk_attnums,
-							   riinfo->pk_attnums,
-							   riinfo->pf_eq_oprs,
-							   riinfo->pp_eq_oprs,
-							   riinfo->ff_eq_oprs);
+							   &riinfo->param->nkeys,
+							   riinfo->param->fk_attnums,
+							   riinfo->param->pk_attnums,
+							   riinfo->param->pf_eq_oprs,
+							   riinfo->param->pp_eq_oprs,
+							   riinfo->param->ff_eq_oprs);
 
 	ReleaseSysCache(tup);
 
@@ -2227,7 +2268,7 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
 						 vals, nulls);
 		if (oldslot)
 			ri_ExtractValues(source_rel, oldslot, riinfo, source_is_pk,
-							 vals + riinfo->nkeys, nulls + riinfo->nkeys);
+							 vals + riinfo->param->nkeys, nulls + riinfo->param->nkeys);
 	}
 	else
 	{
@@ -2320,11 +2361,11 @@ ri_ExtractValues(Relation rel, TupleTableSlot *slot,
 	bool		isnull;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		vals[i] = slot_getattr(slot, attnums[i], &isnull);
 		nulls[i] = isnull ? 'n' : ' ';
@@ -2361,14 +2402,14 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 	onfk = (queryno == RI_PLAN_CHECK_LOOKUPPK);
 	if (onfk)
 	{
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 		rel_oid = fk_rel->rd_id;
 		if (tupdesc == NULL)
 			tupdesc = fk_rel->rd_att;
 	}
 	else
 	{
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 		rel_oid = pk_rel->rd_id;
 		if (tupdesc == NULL)
 			tupdesc = pk_rel->rd_att;
@@ -2395,7 +2436,7 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 		if (aclresult != ACLCHECK_OK)
 		{
 			/* Try for column-level permissions */
-			for (int idx = 0; idx < riinfo->nkeys; idx++)
+			for (int idx = 0; idx < riinfo->param->nkeys; idx++)
 			{
 				aclresult = pg_attribute_aclcheck(rel_oid, attnums[idx],
 												  GetUserId(),
@@ -2418,7 +2459,7 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 		/* Get printable versions of the keys involved */
 		initStringInfo(&key_names);
 		initStringInfo(&key_values);
-		for (int idx = 0; idx < riinfo->nkeys; idx++)
+		for (int idx = 0; idx < riinfo->param->nkeys; idx++)
 		{
 			int			fnum = attnums[idx];
 			Form_pg_attribute att = TupleDescAttr(tupdesc, fnum - 1);
@@ -2508,11 +2549,11 @@ ri_NullCheck(TupleDesc tupDesc,
 	bool		nonenull = true;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		if (slot_attisnull(slot, attnums[i]))
 			nonenull = false;
@@ -2667,12 +2708,12 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
 	const int16 *attnums;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
 	/* XXX: could be worthwhile to fetch all necessary attrs at once */
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		Datum		oldvalue;
 		Datum		newvalue;
@@ -2714,7 +2755,7 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
 			 * operator.  Changes that compare equal will still satisfy the
 			 * constraint after the update.
 			 */
-			if (!ri_AttributesEqual(riinfo->ff_eq_oprs[i], RIAttType(rel, attnums[i]),
+			if (!ri_AttributesEqual(riinfo->param->ff_eq_oprs[i], RIAttType(rel, attnums[i]),
 									oldvalue, newvalue))
 				return false;
 		}
-- 
2.18.4

0002-share-param-part-of-riinfo.patchtext/x-patch; charset=us-asciiDownload

From ae7f7ab7df353044b8b2878253d1bdb8adbcd054 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 3 Dec 2020 16:51:07 +0900
Subject: [PATCH 2/2] share param part of riinfo

---
 src/backend/utils/adt/ri_triggers.c | 33 +++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 3f8407c311..080e8af1a5 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -110,6 +110,8 @@ typedef struct RI_ConstraintParam
 	Oid			pf_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = FK) */
 	Oid			pp_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = PK) */
 	Oid			ff_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (FK = FK) */
+
+	/* This should follow the aboves, see ri_LoadConstraintInfo */
 	struct RI_ConstraintInfo *ownerinfo; /* owner info of this param */
 } RI_ConstraintParam;
 
@@ -2111,6 +2113,37 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	ReleaseSysCache(tup);
 
+	if (OidIsValid(conForm->conparentid))
+	{
+		Oid pconid = conForm->conparentid;
+		const RI_ConstraintInfo *priinfo = ri_LoadConstraintInfo(pconid);
+		RI_ConstraintParam		*p = riinfo->param;
+		RI_ConstraintParam		*pp = priinfo->param;
+
+		/* share the same parameters if identical */
+		if (memcmp(p, pp, offsetof(RI_ConstraintParam, pk_attnums)) == 0)
+		{
+			int i;
+
+			for (i = 0 ; i < p->nkeys ; i++)
+			{
+				if (p->pk_attnums[i] != pp->pk_attnums[i] ||
+					p->fk_attnums[i] != pp->fk_attnums[i] ||
+					p->pf_eq_oprs[i] != pp->pf_eq_oprs[i] ||
+					p->pp_eq_oprs[i] != pp->pp_eq_oprs[i] ||
+					p->ff_eq_oprs[i] != pp->ff_eq_oprs[i])
+					break;
+			}
+			if (i == p->nkeys)
+			{
+				/* this riinfo can share the parameters with parent */
+				Assert(p->ownerinfo == riinfo);
+				pfree(riinfo->param);
+				riinfo->param = pp;
+			}
+		}
+	}
+
 	/*
 	 * For efficient processing of invalidation messages below, we keep a
 	 * doubly-linked list, and a count, of all currently valid entries.
-- 
2.18.4

#17

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#16)

Re: Huge memory consumption on partitioned table with FKs

At Thu, 03 Dec 2020 17:13:16 +0900 (JST), Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote in
me> I agree that plans are rather large but the sharable part of the
me> RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
me> comparing to the plans. But that has somewhat large footprint.. (See
me> the attached)

0001 contains a bug about query_key and get_ri_constaint_root (from
your patch) is not needed there, but the core part is 0002 so please
ignore them.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#18

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#16)

Re: Huge memory consumption on partitioned table with FKs

On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

fk_attnums *doesn't* refers to to top parent talbe of the referencing
side. it refers to attributes of the partition that is compatible with
the same element of fk_attnums of the topmost parent. Maybe I'm
misreading.

Yeah, no I am confused. Reading what I wrote, it seems I implied that
the referenced (PK relation's) partitions have RI_ConstraintInfo which
makes no sense, although there indeed is one pg_constraint entry that
is defined on the FK root table for every PK partition with its OID as
confrelid, which is in addition to an entry containing the root PK
table's OID as confrelid. I confused those PK-partition-referencing
entries as belonging to the partitions themselves. Although in my
defence, all of those entries' conkey contains the FK root table's
attributes, so at least that much holds. :)

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

Ah, something maybe worth trying. Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

I agree that plans are rather large but the sharable part of the
RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
comparing to the plans. But that has somewhat large footprint.. (See
the attached)

Thanks for the patch.

--
Amit Langote
EDB: http://www.enterprisedb.com

#19

amitlangote09@gmail.com

about 5 years ago

In reply to: Corey Huinker (#8)

Re: Huge memory consumption on partitioned table with FKs

On Tue, Dec 1, 2020 at 12:25 PM Corey Huinker <corey.huinker@gmail.com> wrote:

On Mon, Nov 30, 2020 at 9:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Corey Huinker <corey.huinker@gmail.com> writes:

Given that we're already looking at these checks, I was wondering if this
might be the time to consider implementing these checks by directly
scanning the constraint index.

Yeah, maybe. Certainly ri_triggers is putting a huge amount of effort
into working around the SPI/parser/planner layer, to not a lot of gain.

However, it's not clear to me that that line of thought will work well
for the statement-level-trigger approach. In that case you might be
dealing with enough tuples to make a different plan advisable.

Bypassing SPI would probably mean that we stay with row level triggers, and the cached query plan would go away, perhaps replaced by an already-looked-up-this-tuple hash sorta like what the cached nested loops effort is doing.

I've been meaning to give this a try when I got some spare time. This may inspire me to try again.

+1 for this line of work.

--
Amit Langote
EDB: http://www.enterprisedb.com

#20

alvherre@alvh.no-ip.org

about 5 years ago

In reply to: Kyotaro Horiguchi (#16)

Re: Huge memory consumption on partitioned table with FKs

Hello

I haven't followed this thread's latest posts, but I'm unclear on the
lifetime of the new struct that's being allocated in TopMemoryContext.
At what point are those structs freed?

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

Thanks!

#21

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Amit Langote (#18)

Re: Huge memory consumption on partitioned table with FKs

At Thu, 3 Dec 2020 21:40:29 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Thu, Dec 3, 2020 at 5:13 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Thu, 3 Dec 2020 16:41:45 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Maybe I misread but I think you did in your email dated Dec 1 where you said:

"After an off-list discussion, we confirmed that even in that case the
patch works as is because fk_attnum (or contuple.conkey) always stores
key attnums compatible to the topmost parent when conparent has a
valid value (assuming the current usage of fk_attnum), but I still
feel uneasy to rely on that unclear behavior."

fk_attnums *doesn't* refers to to top parent talbe of the referencing
side. it refers to attributes of the partition that is compatible with
the same element of fk_attnums of the topmost parent. Maybe I'm
misreading.

Yeah, no I am confused. Reading what I wrote, it seems I implied that
the referenced (PK relation's) partitions have RI_ConstraintInfo which
makes no sense, although there indeed is one pg_constraint entry that
is defined on the FK root table for every PK partition with its OID as
confrelid, which is in addition to an entry containing the root PK
table's OID as confrelid. I confused those PK-partition-referencing
entries as belonging to the partitions themselves. Although in my
defence, all of those entries' conkey contains the FK root table's
attributes, so at least that much holds. :)

Yes. I think that that confusion doen't hurt the correctness of the
discussion:)

On the topic of how we'd be able to share even the RI_ConstraintInfos
among partitions, that would indeed look a bit more elaborate than the
patch we have right now.

Maybe just letting the hash entry for the child riinfo point to the
parent riinfo if all members (other than constraint_id, of course)
share the exactly the same values. No need to count references since
we don't going to remove riinfos.

Ah, something maybe worth trying. Although the memory we'd save by
sharing the RI_ConstraintInfos would not add that much to the savings
we're having by sharing the plan, because it's the plans that are a
memory hog AFAIK.

I agree that plans are rather large but the sharable part of the
RI_ConstraintInfos is 536 bytes, I'm not sure it is small enough
comparing to the plans. But that has somewhat large footprint.. (See
the attached)

Thanks for the patch.

That's only to show how that looks like.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#22

keisuke.kuroda.3862@gmail.com

about 5 years ago

In reply to: Amit Langote (#15)

Re: Huge memory consumption on partitioned table with FKs

Hi Amit,

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

Thank you for the merge! It looks good to me.
I think a fix for InvalidateConstraintCacheCallBack() is also good.

I also confirmed that the patch passed the make check-world.

Best Regards,
--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

#23

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Alvaro Herrera (#20)

2 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

Thanks, but sorry for the confusion.

I intended just to show how it looks like if we share
RI_ConstraintInfo among partition relations.

At Thu, 3 Dec 2020 10:22:47 -0300, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in

Hello

I haven't followed this thread's latest posts, but I'm unclear on the
lifetime of the new struct that's being allocated in TopMemoryContext.
At what point are those structs freed?

The choice of memory context is tentative and in order to shrink the
patch'es footprint. I think we don't use CurrentDynaHashCxt for the
additional struct so a context for this use is needed.

The struct is freed only when the parent struct (RI_ConstraintInfo) is
found to be able to share the child struct (RI_ConstraintParam) with
the parent constraint. It seems like inefficient (or tending to make
"hole"s in the heap area) but I chose it just to shrink the footprint.

We could create the new RI_ConstraintInfo on stack then copy it to the
cache after we find that the RI_ConstraintInfo needs its own
RI_ConstriantParam.

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

0001-separte-riinfo.patchtext/x-patch; charset=us-asciiDownload

From 55946e09d33fc7fa43bed04ef548bf8a3f67155d Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Thu, 3 Dec 2020 16:15:57 +0900
Subject: [PATCH 1/2] separte riinfo

---
 src/backend/utils/adt/ri_triggers.c | 290 ++++++++++++++++------------
 1 file changed, 168 insertions(+), 122 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a3868f..0306bf7739 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -97,14 +97,29 @@
  * Information extracted from an FK pg_constraint entry.  This is cached in
  * ri_constraint_cache.
  */
+struct RI_ConstraintParam;
+
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;
 	bool		valid;			/* successfully initialized? */
 	uint32		oidHashValue;	/* hash value of pg_constraint OID */
 	NameData	conname;		/* name of the FK constraint */
-	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
+	struct RI_ConstraintParam *param;	/* sharable part  */
+	dlist_node	valid_link;		/* Link in list of valid entries */
+} RI_ConstraintInfo;
+
+/*
+ * RI_ConstraintParam
+ *
+ * The part sharable among relations in a partitioned table of the cached
+ * constraint information.
+ */
+typedef struct RI_ConstraintParam
+{
+	/* begin with identity members */
+	Oid			pk_relid;		/* referenced relation */
 	char		confupdtype;	/* foreign key's ON UPDATE action */
 	char		confdeltype;	/* foreign key's ON DELETE action */
 	char		confmatchtype;	/* foreign key's match type */
@@ -114,8 +129,12 @@ typedef struct RI_ConstraintInfo
 	Oid			pf_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = FK) */
 	Oid			pp_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (PK = PK) */
 	Oid			ff_eq_oprs[RI_MAX_NUMKEYS]; /* equality operators (FK = FK) */
-	dlist_node	valid_link;		/* Link in list of valid entries */
-} RI_ConstraintInfo;
+
+	/* These should be at the end of struct, see ri_LoadConstraintInfo */
+	Oid			query_key;		/* key for planned statment */
+	RI_ConstraintInfo *ownerinfo; /* owner RI_ConstraintInfo of this param */
+} RI_ConstraintParam;
+
 
 /*
  * RI_QueryKey
@@ -163,6 +182,7 @@ typedef struct RI_CompareHashEntry
 /*
  * Local data
  */
+static MemoryContext ri_constraint_cache_cxt = NULL;
 static HTAB *ri_constraint_cache = NULL;
 static HTAB *ri_query_cache = NULL;
 static HTAB *ri_compare_cache = NULL;
@@ -264,7 +284,7 @@ RI_FKey_check(TriggerData *trigdata)
 	 * SELECT FOR KEY SHARE will get on it.
 	 */
 	fk_rel = trigdata->tg_relation;
-	pk_rel = table_open(riinfo->pk_relid, RowShareLock);
+	pk_rel = table_open(riinfo->param->pk_relid, RowShareLock);
 
 	switch (ri_NullCheck(RelationGetDescr(fk_rel), newslot, riinfo, false))
 	{
@@ -283,7 +303,7 @@ RI_FKey_check(TriggerData *trigdata)
 			 * This is the only case that differs between the three kinds of
 			 * MATCH.
 			 */
-			switch (riinfo->confmatchtype)
+			switch (riinfo->param->confmatchtype)
 			{
 				case FKCONSTR_MATCH_FULL:
 
@@ -364,17 +384,17 @@ RI_FKey_check(TriggerData *trigdata)
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 pk_only, pkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
+			Oid			pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid			fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+						 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							attname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							paramname, fk_type);
 			querysep = "AND";
 			queryoids[i] = fk_type;
@@ -382,7 +402,7 @@ RI_FKey_check(TriggerData *trigdata)
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -492,16 +512,16 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 pk_only, pkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
+			Oid	pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+						 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							attname, pk_type,
-							riinfo->pp_eq_oprs[i],
+							riinfo->param->pp_eq_oprs[i],
 							paramname, pk_type);
 			querysep = "AND";
 			queryoids[i] = pk_type;
@@ -509,7 +529,7 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -679,19 +699,19 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
 		appendStringInfo(&querybuf, "SELECT 1 FROM %s%s x",
 						 fk_only, fkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -701,7 +721,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
 		appendStringInfoString(&querybuf, " FOR KEY SHARE OF x");
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -785,19 +805,19 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
 		appendStringInfo(&querybuf, "DELETE FROM %s%s",
 						 fk_only, fkrelname);
 		querysep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&querybuf, querysep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -806,7 +826,7 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
 		}
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -901,22 +921,24 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
 						 fk_only, fkrelname);
 		querysep = "";
 		qualsep = "WHERE";
-		for (int i = 0, j = riinfo->nkeys; i < riinfo->nkeys; i++, j++)
+		for (int i = 0, j = riinfo->param->nkeys;
+			 i < riinfo->param->nkeys;
+			 i++, j++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			appendStringInfo(&querybuf,
 							 "%s %s = $%d",
 							 querysep, attname, i + 1);
 			sprintf(paramname, "$%d", j + 1);
 			ri_GenerateQual(&qualbuf, qualsep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -928,7 +950,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
 		appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys * 2, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys * 2, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -1080,15 +1102,15 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 						 fk_only, fkrelname);
 		querysep = "";
 		qualsep = "WHERE";
-		for (int i = 0; i < riinfo->nkeys; i++)
+		for (int i = 0; i < riinfo->param->nkeys; i++)
 		{
-			Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-			Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-			Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+			Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+			Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+			Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 			quoteOneName(attname,
-						 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+						 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 			appendStringInfo(&querybuf,
 							 "%s %s = %s",
 							 querysep, attname,
@@ -1096,7 +1118,7 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 			sprintf(paramname, "$%d", i + 1);
 			ri_GenerateQual(&qualbuf, qualsep,
 							paramname, pk_type,
-							riinfo->pf_eq_oprs[i],
+							riinfo->param->pf_eq_oprs[i],
 							attname, fk_type);
 			if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
 				ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1107,7 +1129,7 @@ ri_set(TriggerData *trigdata, bool is_set_null)
 		appendBinaryStringInfo(&querybuf, qualbuf.data, qualbuf.len);
 
 		/* Prepare and save the plan */
-		qplan = ri_PlanCheck(querybuf.data, riinfo->nkeys, queryoids,
+		qplan = ri_PlanCheck(querybuf.data, riinfo->param->nkeys, queryoids,
 							 &qkey, fk_rel, pk_rel);
 	}
 
@@ -1217,7 +1239,7 @@ RI_FKey_fk_upd_check_required(Trigger *trigger, Relation fk_rel,
 	 */
 	else if (ri_nullcheck == RI_KEYS_SOME_NULL)
 	{
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 
@@ -1333,14 +1355,16 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	fkrte->rellockmode = AccessShareLock;
 	fkrte->requiredPerms = ACL_SELECT;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		int			attno;
 
-		attno = riinfo->pk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
+		attno = riinfo->param->pk_attnums[i] -
+			FirstLowInvalidHeapAttributeNumber;
 		pkrte->selectedCols = bms_add_member(pkrte->selectedCols, attno);
 
-		attno = riinfo->fk_attnums[i] - FirstLowInvalidHeapAttributeNumber;
+		attno = riinfo->param->fk_attnums[i] -
+			FirstLowInvalidHeapAttributeNumber;
 		fkrte->selectedCols = bms_add_member(fkrte->selectedCols, attno);
 	}
 
@@ -1377,10 +1401,10 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	initStringInfo(&querybuf);
 	appendStringInfoString(&querybuf, "SELECT ");
 	sep = "";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		quoteOneName(fkattname,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf, "%sfk.%s", sep, fkattname);
 		sep = ", ";
 	}
@@ -1398,20 +1422,20 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	strcpy(pkattname, "pk.");
 	strcpy(fkattname, "fk.");
 	sep = "(";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
-		Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+		Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+		Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 		quoteOneName(pkattname + 3,
-					 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+					 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 		quoteOneName(fkattname + 3,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		ri_GenerateQual(&querybuf, sep,
 						pkattname, pk_type,
-						riinfo->pf_eq_oprs[i],
+						riinfo->param->pf_eq_oprs[i],
 						fkattname, fk_type);
 		if (pk_coll != fk_coll)
 			ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1422,17 +1446,17 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	 * It's sufficient to test any one pk attribute for null to detect a join
 	 * failure.
 	 */
-	quoteOneName(pkattname, RIAttName(pk_rel, riinfo->pk_attnums[0]));
+	quoteOneName(pkattname, RIAttName(pk_rel, riinfo->param->pk_attnums[0]));
 	appendStringInfo(&querybuf, ") WHERE pk.%s IS NULL AND (", pkattname);
 
 	sep = "";
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
-		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->fk_attnums[i]));
+		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf,
 						 "%sfk.%s IS NOT NULL",
 						 sep, fkattname);
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 				sep = " AND ";
@@ -1505,6 +1529,9 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		HeapTuple	tuple = SPI_tuptable->vals[0];
 		TupleDesc	tupdesc = SPI_tuptable->tupdesc;
 		RI_ConstraintInfo fake_riinfo;
+		RI_ConstraintParam fake_riparam;
+
+		fake_riinfo.param = &fake_riparam;
 
 		slot = MakeSingleTupleTableSlot(tupdesc, &TTSOpsVirtual);
 
@@ -1522,15 +1549,15 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		 * or fk_rel's tupdesc.
 		 */
 		memcpy(&fake_riinfo, riinfo, sizeof(RI_ConstraintInfo));
-		for (int i = 0; i < fake_riinfo.nkeys; i++)
-			fake_riinfo.fk_attnums[i] = i + 1;
+		for (int i = 0; i < fake_riinfo.param->nkeys; i++)
+			fake_riinfo.param->fk_attnums[i] = i + 1;
 
 		/*
 		 * If it's MATCH FULL, and there are any nulls in the FK keys,
 		 * complain about that rather than the lack of a match.  MATCH FULL
 		 * disallows partially-null FK rows.
 		 */
-		if (fake_riinfo.confmatchtype == FKCONSTR_MATCH_FULL &&
+		if (fake_riinfo.param->confmatchtype == FKCONSTR_MATCH_FULL &&
 			ri_NullCheck(tupdesc, slot, &fake_riinfo, false) != RI_KEYS_NONE_NULL)
 			ereport(ERROR,
 					(errcode(ERRCODE_FOREIGN_KEY_VIOLATION),
@@ -1615,10 +1642,10 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	initStringInfo(&querybuf);
 	appendStringInfoString(&querybuf, "SELECT ");
 	sep = "";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
 		quoteOneName(fkattname,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf, "%sfk.%s", sep, fkattname);
 		sep = ", ";
 	}
@@ -1633,20 +1660,20 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 	strcpy(pkattname, "pk.");
 	strcpy(fkattname, "fk.");
 	sep = "(";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
-		Oid			pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
-		Oid			pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
-		Oid			fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
+		Oid pk_type = RIAttType(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid fk_type = RIAttType(fk_rel, riinfo->param->fk_attnums[i]);
+		Oid pk_coll = RIAttCollation(pk_rel, riinfo->param->pk_attnums[i]);
+		Oid fk_coll = RIAttCollation(fk_rel, riinfo->param->fk_attnums[i]);
 
 		quoteOneName(pkattname + 3,
-					 RIAttName(pk_rel, riinfo->pk_attnums[i]));
+					 RIAttName(pk_rel, riinfo->param->pk_attnums[i]));
 		quoteOneName(fkattname + 3,
-					 RIAttName(fk_rel, riinfo->fk_attnums[i]));
+					 RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		ri_GenerateQual(&querybuf, sep,
 						pkattname, pk_type,
-						riinfo->pf_eq_oprs[i],
+						riinfo->param->pf_eq_oprs[i],
 						fkattname, fk_type);
 		if (pk_coll != fk_coll)
 			ri_GenerateQualCollation(&querybuf, pk_coll);
@@ -1666,13 +1693,13 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		appendStringInfoString(&querybuf, ") WHERE (");
 
 	sep = "";
-	for (i = 0; i < riinfo->nkeys; i++)
+	for (i = 0; i < riinfo->param->nkeys; i++)
 	{
-		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->fk_attnums[i]));
+		quoteOneName(fkattname, RIAttName(fk_rel, riinfo->param->fk_attnums[i]));
 		appendStringInfo(&querybuf,
 						 "%sfk.%s IS NOT NULL",
 						 sep, fkattname);
-		switch (riinfo->confmatchtype)
+		switch (riinfo->param->confmatchtype)
 		{
 			case FKCONSTR_MATCH_SIMPLE:
 				sep = " AND ";
@@ -1762,8 +1789,8 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
 		 * or fk_rel's tupdesc.
 		 */
 		memcpy(&fake_riinfo, riinfo, sizeof(RI_ConstraintInfo));
-		for (i = 0; i < fake_riinfo.nkeys; i++)
-			fake_riinfo.pk_attnums[i] = i + 1;
+		for (i = 0; i < fake_riinfo.param->nkeys; i++)
+			fake_riinfo.param->pk_attnums[i] = i + 1;
 
 		ri_ReportViolation(&fake_riinfo, pk_rel, fk_rel,
 						   slot, tupdesc, 0, true);
@@ -1905,7 +1932,7 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = riinfo->param->query_key;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -1983,25 +2010,25 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	if (rel_is_pk)
 	{
 		if (riinfo->fk_relid != trigger->tgconstrrelid ||
-			riinfo->pk_relid != RelationGetRelid(trig_rel))
+			riinfo->param->pk_relid != RelationGetRelid(trig_rel))
 			elog(ERROR, "wrong pg_constraint entry for trigger \"%s\" on table \"%s\"",
 				 trigger->tgname, RelationGetRelationName(trig_rel));
 	}
 	else
 	{
 		if (riinfo->fk_relid != RelationGetRelid(trig_rel) ||
-			riinfo->pk_relid != trigger->tgconstrrelid)
+			riinfo->param->pk_relid != trigger->tgconstrrelid)
 			elog(ERROR, "wrong pg_constraint entry for trigger \"%s\" on table \"%s\"",
 				 trigger->tgname, RelationGetRelationName(trig_rel));
 	}
 
-	if (riinfo->confmatchtype != FKCONSTR_MATCH_FULL &&
-		riinfo->confmatchtype != FKCONSTR_MATCH_PARTIAL &&
-		riinfo->confmatchtype != FKCONSTR_MATCH_SIMPLE)
+	if (riinfo->param->confmatchtype != FKCONSTR_MATCH_FULL &&
+		riinfo->param->confmatchtype != FKCONSTR_MATCH_PARTIAL &&
+		riinfo->param->confmatchtype != FKCONSTR_MATCH_SIMPLE)
 		elog(ERROR, "unrecognized confmatchtype: %d",
-			 riinfo->confmatchtype);
+			 riinfo->param->confmatchtype);
 
-	if (riinfo->confmatchtype == FKCONSTR_MATCH_PARTIAL)
+	if (riinfo->param->confmatchtype == FKCONSTR_MATCH_PARTIAL)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("MATCH PARTIAL not yet implemented")));
@@ -2032,8 +2059,15 @@ ri_LoadConstraintInfo(Oid constraintOid)
 	riinfo = (RI_ConstraintInfo *) hash_search(ri_constraint_cache,
 											   (void *) &constraintOid,
 											   HASH_ENTER, &found);
+
 	if (!found)
+	{
 		riinfo->valid = false;
+		riinfo->param = (RI_ConstraintParam *)
+			MemoryContextAlloc(ri_constraint_cache_cxt,
+							   sizeof(RI_ConstraintParam));
+		riinfo->param->ownerinfo = riinfo;
+	}
 	else if (riinfo->valid)
 		return riinfo;
 
@@ -2051,22 +2085,23 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->param->query_key = riinfo->constraint_id;
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
-	riinfo->pk_relid = conForm->confrelid;
+	riinfo->param->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
-	riinfo->confupdtype = conForm->confupdtype;
-	riinfo->confdeltype = conForm->confdeltype;
-	riinfo->confmatchtype = conForm->confmatchtype;
+	riinfo->param->confupdtype = conForm->confupdtype;
+	riinfo->param->confdeltype = conForm->confdeltype;
+	riinfo->param->confmatchtype = conForm->confmatchtype;
 
 	DeconstructFkConstraintRow(tup,
-							   &riinfo->nkeys,
-							   riinfo->fk_attnums,
-							   riinfo->pk_attnums,
-							   riinfo->pf_eq_oprs,
-							   riinfo->pp_eq_oprs,
-							   riinfo->ff_eq_oprs);
+							   &riinfo->param->nkeys,
+							   riinfo->param->fk_attnums,
+							   riinfo->param->pk_attnums,
+							   riinfo->param->pf_eq_oprs,
+							   riinfo->param->pp_eq_oprs,
+							   riinfo->param->ff_eq_oprs);
 
 	ReleaseSysCache(tup);
 
@@ -2227,7 +2262,7 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
 						 vals, nulls);
 		if (oldslot)
 			ri_ExtractValues(source_rel, oldslot, riinfo, source_is_pk,
-							 vals + riinfo->nkeys, nulls + riinfo->nkeys);
+							 vals + riinfo->param->nkeys, nulls + riinfo->param->nkeys);
 	}
 	else
 	{
@@ -2320,11 +2355,11 @@ ri_ExtractValues(Relation rel, TupleTableSlot *slot,
 	bool		isnull;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		vals[i] = slot_getattr(slot, attnums[i], &isnull);
 		nulls[i] = isnull ? 'n' : ' ';
@@ -2361,14 +2396,14 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 	onfk = (queryno == RI_PLAN_CHECK_LOOKUPPK);
 	if (onfk)
 	{
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 		rel_oid = fk_rel->rd_id;
 		if (tupdesc == NULL)
 			tupdesc = fk_rel->rd_att;
 	}
 	else
 	{
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 		rel_oid = pk_rel->rd_id;
 		if (tupdesc == NULL)
 			tupdesc = pk_rel->rd_att;
@@ -2395,7 +2430,7 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 		if (aclresult != ACLCHECK_OK)
 		{
 			/* Try for column-level permissions */
-			for (int idx = 0; idx < riinfo->nkeys; idx++)
+			for (int idx = 0; idx < riinfo->param->nkeys; idx++)
 			{
 				aclresult = pg_attribute_aclcheck(rel_oid, attnums[idx],
 												  GetUserId(),
@@ -2418,7 +2453,7 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 		/* Get printable versions of the keys involved */
 		initStringInfo(&key_names);
 		initStringInfo(&key_values);
-		for (int idx = 0; idx < riinfo->nkeys; idx++)
+		for (int idx = 0; idx < riinfo->param->nkeys; idx++)
 		{
 			int			fnum = attnums[idx];
 			Form_pg_attribute att = TupleDescAttr(tupdesc, fnum - 1);
@@ -2508,11 +2543,11 @@ ri_NullCheck(TupleDesc tupDesc,
 	bool		nonenull = true;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		if (slot_attisnull(slot, attnums[i]))
 			nonenull = false;
@@ -2540,12 +2575,19 @@ ri_InitHashTables(void)
 {
 	HASHCTL		ctl;
 
+	Assert(ri_constraint_cache_cxt == NULL);
+	ri_constraint_cache_cxt = AllocSetContextCreate(CacheMemoryContext,
+													"RI constraint cache",
+													ALLOCSET_DEFAULT_SIZES);
+		
 	memset(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(Oid);
 	ctl.entrysize = sizeof(RI_ConstraintInfo);
+	ctl.hcxt = ri_constraint_cache_cxt;
 	ri_constraint_cache = hash_create("RI constraint cache",
 									  RI_INIT_CONSTRAINTHASHSIZE,
-									  &ctl, HASH_ELEM | HASH_BLOBS);
+									  &ctl,
+									  HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
 
 	/* Arrange to flush cache on pg_constraint changes */
 	CacheRegisterSyscacheCallback(CONSTROID,
@@ -2555,16 +2597,18 @@ ri_InitHashTables(void)
 	memset(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(RI_QueryKey);
 	ctl.entrysize = sizeof(RI_QueryHashEntry);
+	ctl.hcxt = ri_constraint_cache_cxt;
 	ri_query_cache = hash_create("RI query cache",
 								 RI_INIT_QUERYHASHSIZE,
-								 &ctl, HASH_ELEM | HASH_BLOBS);
+								 &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
 
 	memset(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(RI_CompareKey);
 	ctl.entrysize = sizeof(RI_CompareHashEntry);
+	ctl.hcxt = ri_constraint_cache_cxt;
 	ri_compare_cache = hash_create("RI compare cache",
 								   RI_INIT_QUERYHASHSIZE,
-								   &ctl, HASH_ELEM | HASH_BLOBS);
+								   &ctl, HASH_ELEM | HASH_BLOBS | HASH_CONTEXT);
 }
 
 
@@ -2667,12 +2711,12 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
 	const int16 *attnums;
 
 	if (rel_is_pk)
-		attnums = riinfo->pk_attnums;
+		attnums = riinfo->param->pk_attnums;
 	else
-		attnums = riinfo->fk_attnums;
+		attnums = riinfo->param->fk_attnums;
 
 	/* XXX: could be worthwhile to fetch all necessary attrs at once */
-	for (int i = 0; i < riinfo->nkeys; i++)
+	for (int i = 0; i < riinfo->param->nkeys; i++)
 	{
 		Datum		oldvalue;
 		Datum		newvalue;
@@ -2702,7 +2746,8 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
 			 * difference for ON UPDATE CASCADE, but for consistency we treat
 			 * all changes to the PK the same.
 			 */
-			Form_pg_attribute att = TupleDescAttr(oldslot->tts_tupleDescriptor, attnums[i] - 1);
+			Form_pg_attribute att =
+				TupleDescAttr(oldslot->tts_tupleDescriptor, attnums[i] - 1);
 
 			if (!datum_image_eq(oldvalue, newvalue, att->attbyval, att->attlen))
 				return false;
@@ -2714,7 +2759,8 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
 			 * operator.  Changes that compare equal will still satisfy the
 			 * constraint after the update.
 			 */
-			if (!ri_AttributesEqual(riinfo->ff_eq_oprs[i], RIAttType(rel, attnums[i]),
+			if (!ri_AttributesEqual(riinfo->param->ff_eq_oprs[i],
+									RIAttType(rel, attnums[i]),
 									oldvalue, newvalue))
 				return false;
 		}
-- 
2.18.4

0002-share-param-part-of-riinfo.patchtext/x-patch; charset=us-asciiDownload

From 67cbe62a069b3cb6b50eab7215094f71660c4cf5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyoga.ntt@gmail.com>
Date: Fri, 4 Dec 2020 12:01:26 +0900
Subject: [PATCH 2/2] share param part of riinfo

---
 src/backend/utils/adt/ri_triggers.c | 66 +++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 0306bf7739..187884f622 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -2043,9 +2043,11 @@ static const RI_ConstraintInfo *
 ri_LoadConstraintInfo(Oid constraintOid)
 {
 	RI_ConstraintInfo *riinfo;
+	RI_ConstraintParam tmpparam;
 	bool		found;
 	HeapTuple	tup;
 	Form_pg_constraint conForm;
+	Oid			conparentid;
 
 	/*
 	 * On the first call initialize the hashtable
@@ -2063,10 +2065,10 @@ ri_LoadConstraintInfo(Oid constraintOid)
 	if (!found)
 	{
 		riinfo->valid = false;
-		riinfo->param = (RI_ConstraintParam *)
-			MemoryContextAlloc(ri_constraint_cache_cxt,
-							   sizeof(RI_ConstraintParam));
-		riinfo->param->ownerinfo = riinfo;
+
+		/* tentatively link the tmp parameter area to the riinfo */
+		riinfo->param = &tmpparam;
+		riinfo->param->ownerinfo = NULL;
 	}
 	else if (riinfo->valid)
 		return riinfo;
@@ -2103,8 +2105,64 @@ ri_LoadConstraintInfo(Oid constraintOid)
 							   riinfo->param->pp_eq_oprs,
 							   riinfo->param->ff_eq_oprs);
 
+	conparentid = conForm->conparentid;
 	ReleaseSysCache(tup);
 
+	Assert(riinfo->param);
+
+	/* check if this relation can share the parameter part with the parent. */
+	if (OidIsValid(conparentid))
+	{
+		const RI_ConstraintInfo *priinfo;
+		RI_ConstraintParam		*pparam;
+		RI_ConstraintParam		*param = riinfo->param;
+
+		/* load parent's riinfo (recurses to the topmost parent) */
+		priinfo = ri_LoadConstraintInfo(conparentid);
+		pparam = priinfo->param;
+
+		/* check if the parameter is identical with the parent */
+		if (memcmp(param, pparam,
+				   offsetof(RI_ConstraintParam, pk_attnums)) == 0)
+		{
+			int i;
+
+			/*
+			 * The unused elements of the arrays are not guaranteed to be
+			 * zero-filled. Check only the used elements.
+			 */
+			for (i = 0 ; i < param->nkeys ; i++)
+			{
+				if (param->pk_attnums[i] != pparam->pk_attnums[i] ||
+					param->fk_attnums[i] != pparam->fk_attnums[i] ||
+					param->pf_eq_oprs[i] != pparam->pf_eq_oprs[i] ||
+					param->pp_eq_oprs[i] != pparam->pp_eq_oprs[i] ||
+					param->ff_eq_oprs[i] != pparam->ff_eq_oprs[i])
+					break;
+			}
+
+			if (i == param->nkeys)
+			{
+				/* all parameters match. share the parameters with the parent */
+				if (riinfo->param->ownerinfo == riinfo)
+					pfree(riinfo->param);
+				riinfo->param = pparam;
+			}
+		}
+	}
+
+	/* make a permanent copy if the parameter area is tentative */
+	if (riinfo->param == &tmpparam)
+	{
+		riinfo->param = (RI_ConstraintParam *)
+			MemoryContextAlloc(ri_constraint_cache_cxt,
+							   sizeof(RI_ConstraintParam));
+		/* copy parameter values */
+		memcpy(riinfo->param, &tmpparam,
+			   offsetof(RI_ConstraintParam, ownerinfo));
+		riinfo->param->ownerinfo = riinfo;
+	}
+
 	/*
 	 * For efficient processing of invalidation messages below, we keep a
 	 * doubly-linked list, and a count, of all currently valid entries.
-- 
2.18.4

#24

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Keisuke Kuroda (#22)

Re: Huge memory consumption on partitioned table with FKs

At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda <keisuke.kuroda.3862@gmail.com> wrote in

Hi Amit,

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

Thank you for the merge! It looks good to me.
I think a fix for InvalidateConstraintCacheCallBack() is also good.

I also confirmed that the patch passed the make check-world.

It's fine that constraint_rood_id overrides constraint_id, but how
about that constraint_root_id stores constraint_id if it is not a
partition? That change makes the patch a bit simpler.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#25

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#24)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

On Fri, Dec 4, 2020 at 2:48 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Fri, 4 Dec 2020 12:00:09 +0900, Keisuke Kuroda <keisuke.kuroda.3862@gmail.com> wrote in

Hi Amit,

I have attached a patch in which I've tried to merge the ideas from
both my patch and Kuroda-san's. I liked that his patch added
conparentid to RI_ConstraintInfo because that saves a needless
syscache lookup for constraints that don't have a parent. I've kept
my idea to compute the root constraint id only once in
ri_LoadConstraint(), not on every invocation of ri_BuildQueryKey().
Kuroda-san, anything you'd like to add to that?

Thank you for the merge! It looks good to me.
I think a fix for InvalidateConstraintCacheCallBack() is also good.

I also confirmed that the patch passed the make check-world.

It's fine that constraint_rood_id overrides constraint_id, but how
about that constraint_root_id stores constraint_id if it is not a
partition? That change makes the patch a bit simpler.

My patch was like that before posting to this thread, but keeping
constraint_id and constraint_root_id separate looked better for
documenting the partitioning case as working differently from the
regular table case. I guess a comment in ri_BuildQueryKey is enough
for that though and it's not like we're using constraint_root_id in
any other place to make matters confusing, so I changed it as you
suggest. Updated patch attached.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patchapplication/octet-stream; name=v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patchDownload

From 857911593a33cabb05d609140aab7fc1ce7b9d62 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH v3] ri_triggers.c: Use root constraint OID as key to
 ri_query_cache

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

Keisuke Kuroda, Amit Langote
---
 src/backend/utils/adt/ri_triggers.c | 73 ++++++++++++++++++++++++++++++++++---
 1 file changed, 67 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 02b1a38..042d651 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,13 +95,20 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache with constraint_root_id as a part of the hash key.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_parent;	/* OID of parent of this constraint */
+	Oid			constraint_root_id;	/* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint was
+									 * inherited; same as constraint_id if the
+									 * constraint is non-inherited */
 	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	uint32		oidHashValue;	/* hash value of constraint_id */
+	uint32		rootHashValue;	/* hash value of constraint_root_id */
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
@@ -221,6 +228,8 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid get_ri_constraint_root(const RI_ConstraintInfo *riinfo);
+static Oid get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1901,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  *	Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *		key: output argument, *key is filled in based on the other arguments
- *		riinfo: info from pg_constraint entry
+ *		riinfo: info derived from pg_constraint entry
  *		constr_queryno: an internal number identifying the query type
  *			(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1911,18 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * For partitions, constraint_root_id stores the OID of the constraint in
+	 * the ancestor from which this constraint has been inherited, so all
+	 * partitions descending from the same ancestor will share the SPI plan.
+	 * Sharing the plan is okay, because query used is the same for all
+	 * partitions irrespective of the RI query type.  That is true despite the
+	 * fact that riinfo->fk_attnums may differ among partitions sharing the
+	 * query, because they all refer to the same logical column set.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	key->constr_id = riinfo->constraint_root_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2010,6 +2027,37 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 }
 
 /*
+ * get_ri_constraint_root
+ *		Returns the OID of RI constraint's root parent or its own if not
+ *		inherited
+ */
+static Oid
+get_ri_constraint_root(const RI_ConstraintInfo *riinfo)
+{
+	if (!OidIsValid(riinfo->constraint_parent))
+		return riinfo->constraint_id;
+
+	return get_ri_constraint_root_recurse(riinfo->constraint_parent);
+}
+
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
+/*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
 static const RI_ConstraintInfo *
@@ -2051,8 +2099,12 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->constraint_parent = conForm->conparentid;
+	riinfo->constraint_root_id = get_ri_constraint_root(riinfo);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
+	riinfo->rootHashValue = GetSysCacheHashValue1(CONSTROID,
+												  ObjectIdGetDatum(riinfo->constraint_root_id));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
 	riinfo->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
@@ -2117,7 +2169,16 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 		RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
 													valid_link, iter.cur);
 
-		if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
+		/*
+		 * We look for changes to two specific pg_constraint entries here --
+		 * the one matching the constraint given by riinfo->constraint_id and
+		 * also the one given by riinfo->constraint_root_id.  The latter too
+		 * because if its parent is updated, it is no longer the root
+		 * constraint.
+		 */
+		if (hashvalue == 0 ||
+			riinfo->oidHashValue == hashvalue ||
+			riinfo->rootHashValue == hashvalue)
 		{
 			riinfo->valid = false;
 			/* Remove invalidated entries from the list, too */
-- 
1.8.3.1

#26

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#23)

Re: Huge memory consumption on partitioned table with FKs

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

--
Amit Langote
EDB: http://www.enterprisedb.com

#27

alvherre@alvh.no-ip.org

about 5 years ago

In reply to: Amit Langote (#26)

Re: Huge memory consumption on partitioned table with FKs

On 2020-Dec-07, Amit Langote wrote:

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

#28

amitlangote09@gmail.com

about 5 years ago

In reply to: Alvaro Herrera (#27)

Re: Huge memory consumption on partitioned table with FKs

Hi Alvaro,

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2020-Dec-07, Amit Langote wrote:

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the

relationship

should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

Yeah, I agree.

- Amit

--

Amit Langote
EDB: http://www.enterprisedb.com

#29

Or /messages/by-id/CA+HiwqGrr2YOO6voBM6m_OAc9w-WMxe1gOuQ-UyDPin6zJtyZw@mail.gmail.com ?

horikyota.ntt@gmail.com

about 5 years ago

In reply to: Amit Langote (#28)

Re: Huge memory consumption on partitioned table with FKs

At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hi Alvaro,

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2020-Dec-07, Amit Langote wrote:

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the

relationship

should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

Yeah, I agree.

+1 from me to either one.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#30

amitlangote09@gmail.com

about 5 years ago

In reply to: Kyotaro Horiguchi (#29)

Re: Huge memory consumption on partitioned table with FKs

On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

Hi Alvaro,

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

On 2020-Dec-07, Amit Langote wrote:

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the

relationship

should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

Yeah, I agree.

Or /messages/by-id/CA+HiwqGrr2YOO6voBM6m_OAc9w-WMxe1gOuQ-UyDPin6zJtyZw@mail.gmail.com ?

+1 from me to either one.

Oh, I hadn't actually checked the actual message that Alvaro
mentioned, but yeah I too am fine with either that one or the latest
one.

--
Amit Langote
EDB: http://www.enterprisedb.com

#31

amitlangote09@gmail.com

about 5 years ago

In reply to: Amit Langote (#26)

Re: Huge memory consumption on partitioned table with FKs

On Mon, Dec 7, 2020 at 11:01 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Fri, Dec 4, 2020 at 12:05 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

Also, the comment that was in RI_ConstraintInfo now appears in
RI_ConstraintParam, and the new struct (RI_ConstraintInfo) is now
undocumented. What is the relationship between those two structs? I
see that they have pointers to each other, but I think the relationship
should be documented more clearly.

I'm not sure the footprint of this patch worth doing but here is a bit
more polished version.

I noticed that the foreign_key test fails and it may have to do with
the fact that a partition's param info remains attached to the
parent's RI_ConstraintInfo even after it's detached from the parent
table using DETACH PARTITION.

Just for (maybe not so distant) future reference, I managed to fix the
failure with this:

diff --git a/src/backend/utils/adt/ri_triggers.c
b/src/backend/utils/adt/ri_triggers.c
index 187884f..c67f2a6 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -1932,7 +1932,7 @@ ri_BuildQueryKey(RI_QueryKey *key, const
RI_ConstraintInfo *riinfo,
     * We assume struct RI_QueryKey contains no padding bytes, else we'd need
     * to use memset to clear them.
     */
-   key->constr_id = riinfo->param->query_key;
+   key->constr_id = riinfo->constraint_id;
    key->constr_queryno = constr_queryno;
 }

It seems the shareable part (param) is not properly invalidated after
a partition's constraint is detached, apparently leaving query_key in
it set to the parent's constraint id.

--
Amit Langote
EDB: http://www.enterprisedb.com

#32

keisuke.kuroda.3862@gmail.com

about 5 years ago

In reply to: Amit Langote (#31)

Re: Huge memory consumption on partitioned table with FKs

Hi Amit-san,

I noticed that this patch
(v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch)
is not registered in the commitfest. I think it needs to be registered for
commit, is that correct?

I have confirmed that this patch can be applied to HEAD (master),
and that check-world PASS.

Best Regards,

--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

#33

https://commitfest.postgresql.org/32/2930/

amitlangote09@gmail.com

about 5 years ago

In reply to: Keisuke Kuroda (#32)

Re: Huge memory consumption on partitioned table with FKs

Kuroda-san,

On Fri, Jan 8, 2021 at 1:02 PM Keisuke Kuroda
<keisuke.kuroda.3862@gmail.com> wrote:

Hi Amit-san,

I noticed that this patch
(v3-0001-ri_triggers.c-Use-root-constraint-OID-as-key-to-r.patch)
is not registered in the commitfest. I think it needs to be registered for
commit, is that correct?

I have confirmed that this patch can be applied to HEAD (master),
and that check-world PASS.

Thanks for checking. Indeed, it should have been added to the January
commit-fest. I've added it to the March one:

--
Amit Langote
EDB: http://www.enterprisedb.com

#34

keisuke.kuroda.3862@gmail.com

about 5 years ago

In reply to: Amit Langote (#33)

Re: Huge memory consumption on partitioned table with FKs

Hi Amit-san,

Thanks for checking. Indeed, it should have been added to the January
commit-fest. I've added it to the March one:

https://commitfest.postgresql.org/32/2930/

Thank you for your quick response!

--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

#35

David Steele

david@pgmasters.net

almost 5 years ago

In reply to: Amit Langote (#30)

Re: Huge memory consumption on partitioned table with FKs

On 12/7/20 10:59 PM, Amit Langote wrote:

On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

Yeah, I agree.

Or /messages/by-id/CA+HiwqGrr2YOO6voBM6m_OAc9w-WMxe1gOuQ-UyDPin6zJtyZw@mail.gmail.com ?

+1 from me to either one.

Oh, I hadn't actually checked the actual message that Alvaro
mentioned, but yeah I too am fine with either that one or the latest
one.

Any progress on the decision of how to handle this bug? Not sure if that
was handled elsewhere but it appears to be key to making progress on
this patch.

Regards,
--
-David
david@pgmasters.net

#36

amitlangote09@gmail.com

almost 5 years ago

In reply to: David Steele (#35)

Re: Huge memory consumption on partitioned table with FKs

Hi David,

On Wed, Mar 3, 2021 at 10:21 PM David Steele <david@pgmasters.net> wrote:

On 12/7/20 10:59 PM, Amit Langote wrote:

On Tue, Dec 8, 2020 at 12:04 PM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:

At Tue, 8 Dec 2020 01:16:00 +0900, Amit Langote <amitlangote09@gmail.com> wrote in

On Mon, Dec 7, 2020 at 23:48 Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:

I think this bit about splitting the struct is a distraction. Let's get
a patch that solves the bug first, and then we can discuss what further
refinements we want to do. I think we should get your patch in
CA+HiwqEOrfN9b=f3sDmySPGc4gO-L_VMFHXLLxVmmdP34e64+w@mail.gmail.com
committed (which I have not read yet.) Do you agree with this plan?

Yeah, I agree.

Or /messages/by-id/CA+HiwqGrr2YOO6voBM6m_OAc9w-WMxe1gOuQ-UyDPin6zJtyZw@mail.gmail.com ?

+1 from me to either one.

Oh, I hadn't actually checked the actual message that Alvaro
mentioned, but yeah I too am fine with either that one or the latest
one.

Any progress on the decision of how to handle this bug? Not sure if that
was handled elsewhere but it appears to be key to making progress on
this patch.

The v3 patch posted on Dec 4 is still what is being proposed to fix this.

I don't know of any unaddressed comments on the patch, so I've marked
the entry Ready for Committer.

--
Amit Langote
EDB: http://www.enterprisedb.com

#37

alvherre@alvh.no-ip.org

almost 5 years ago

In reply to: Amit Langote (#36)

Re: Huge memory consumption on partitioned table with FKs

On 2021-Mar-03, Amit Langote wrote:

I don't know of any unaddressed comments on the patch, so I've marked
the entry Ready for Committer.

Thanks, I'll look at it later this week.

--
ï¿½lvaro Herrera 39ï¿½49'30"S 73ï¿½17'W
#error "Operator lives in the wrong universe"
("Use of cookies in real-time system development", M. Gleixner, M. Mc Guire)

#38

tgl@sss.pgh.pa.us

almost 5 years ago

In reply to: Amit Langote (#25)

Re: Huge memory consumption on partitioned table with FKs

Amit Langote <amitlangote09@gmail.com> writes:

Updated patch attached.

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent? The comment claiming that it's okay if riinfo->fk_attnums
doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers. Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches. But that's what it looks like this patch
will do.

regards, tom lane

#39

amitlangote09@gmail.com

almost 5 years ago

In reply to: Tom Lane (#38)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

On Fri, Mar 5, 2021 at 6:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

Updated patch attached.

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent? The comment claiming that it's okay if riinfo->fk_attnums
doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers. Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches. But that's what it looks like this patch
will do.

The quoted comment could have been written to be clearer about this,
but it's not talking about the table that is to be queried, but the
table whose RI trigger is being executed. In all the cases except one
(mentioned below), the table that is queried is the same irrespective
of which partition's trigger is being executed, so we're basically
creating the same plan separately for each partition.

create table foo (a int primary key) partition by range (a);
create table foo1 partition of foo for values from (minvalue) to (1);
create table foo2 partition of foo for values from (1) to (maxvalue);
create table bar (a int references foo) partition by range (a);
create table bar1 partition of bar for values from (minvalue) to (1);
create table bar2 partition of bar for values from (1) to (maxvalue);

insert into foo values (0), (1);
insert into bar values (0), (1);

For the 2nd insert statement, RI_FKey_check() issues the following
query irrespective of whether it is called from bar1's or bar2's
insert check RI trigger:

SELECT 1 FROM foo WHERE foo.a = $1;

Likewise for:

delete from foo;

ri_restrict() issues the following query irrespective of whether
called from foo1's or foo2's delete action trigger:

SELECT 1 FROM bar WHERE bar.a = $1;

The one case I found in which the queried table is the same as the
table whose trigger has been invoked is ri_Check_Pk_Match(), in which
case, it's actually wrong for partitions to share the plan, so the
patch was wrong in that case. Apparently, we didn't test that case
with partitioning, so I added one as follows:

+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2

With the v3 patch, the test case fails with the following error on COMMIT:

+ERROR:  update or delete on table "tbl1_p2" violates foreign key
constraint "tbl2_f1_fkey2" on table "tbl2"
+DETAIL:  Key (f1)=(1) is still referenced from table "tbl2".

That's because in ri_Check_Pk_Match(), partition tbl2 ends up using
the same cached plan as tbl1 and so scans tbl1 to check whether the
row deleted from tbl2 has reappeared, which is of course wrong. I
have fixed that by continuing to use the individual partition's
constraint's OID as the query key for this particular query.

I have also removed the somewhat confusing comment about fk_attnums.
The point it was trying to make is that fk_attnums being possibly
different among partitions does not make a difference to what any
given shareable query's text ultimately looks like. Those attribute
numbers are only used by the macros RIAttName(), RIAttType(),
RIAttCollation() when generating the query text and they'd all return
the same values no matter which partition's attribute numbers are
used.

Updated patch attached.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

v4-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patchapplication/x-patch; name=v4-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patchDownload

From b433aabbab5f74a4284fabdb078200c6c1e89e57 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH v4] Share SPI plan between partitions in some RI triggers

In all cases except one, the RI triggers of leaf partitions that all
share a given foreign key constraint issue the same query through
SPI, although ri_query_cache stores the plan of the query separately
for each partition.  That is unnecessary and can even lead to a huge
increase in memory allocated to store those plans if the query's
target relation is a partitioned table containing many partitions.
This commit changes the key used by ri_query_cache in such cases to
be the shared foreign key constraint's root OID so that all
partitions sharing the constraint also share the plan.

However, the query issued by ri_Check_Pk_Match() targets the same
table as the one on which the RI trigger is fired, so each leaf
partition should get its own query and the associated plan in that
one case. This commit adds a test for this case.

Keisuke Kuroda, Amit Langote
---
 src/backend/utils/adt/ri_triggers.c       | 78 +++++++++++++++++++++--
 src/test/regress/expected/foreign_key.out | 18 ++++++
 src/test/regress/sql/foreign_key.sql      | 16 +++++
 3 files changed, 106 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 6e3a41062f..db65d61c49 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,13 +95,20 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache with constraint_root_id as a part of the hash key.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_parent;	/* OID of parent of this constraint */
+	Oid			constraint_root_id;	/* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint is
+									 * inherited; same as constraint_id if the
+									 * constraint is non-inherited */
 	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	uint32		oidHashValue;	/* hash value of constraint_id */
+	uint32		rootHashValue;	/* hash value of constraint_root_id */
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
@@ -221,6 +228,8 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid get_ri_constraint_root(const RI_ConstraintInfo *riinfo);
+static Oid get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1901,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  *	Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *		key: output argument, *key is filled in based on the other arguments
- *		riinfo: info from pg_constraint entry
+ *		riinfo: info derived from pg_constraint entry
  *		constr_queryno: an internal number identifying the query type
  *			(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1911,23 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * For partitions, constraint_root_id stores the OID of the constraint in
+	 * the ancestor from which this constraint has been inherited.  For most
+	 * queries, we use that OID as the key instead of the partition's own
+	 * constraint's OID so that all partitions sharing the constraint will
+	 * also share the SPI plan. That's okay to do because the table that is
+	 * queried is the same irrespective of which partition's trigger is being
+	 * executed.  The only case in which the table queried is the same as the
+	 * table whose trigger is being executed is when performing
+	 * ri_Check_Pk_Match(), where each partition must have its on own plan.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	if (constr_queryno != RI_PLAN_CHECK_LOOKUPPK_FROM_PK)
+		key->constr_id = riinfo->constraint_root_id;
+	else
+		key->constr_id = riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2009,6 +2031,37 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	return riinfo;
 }
 
+/*
+ * get_ri_constraint_root
+ *		Returns the OID of RI constraint's root parent or its own if not
+ *		inherited
+ */
+static Oid
+get_ri_constraint_root(const RI_ConstraintInfo *riinfo)
+{
+	if (!OidIsValid(riinfo->constraint_parent))
+		return riinfo->constraint_id;
+
+	return get_ri_constraint_root_recurse(riinfo->constraint_parent);
+}
+
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
 /*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
@@ -2051,8 +2104,12 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	riinfo->constraint_parent = conForm->conparentid;
+	riinfo->constraint_root_id = get_ri_constraint_root(riinfo);
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
+	riinfo->rootHashValue = GetSysCacheHashValue1(CONSTROID,
+												  ObjectIdGetDatum(riinfo->constraint_root_id));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
 	riinfo->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
@@ -2117,7 +2174,16 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 		RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
 													valid_link, iter.cur);
 
-		if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
+		/*
+		 * We look for changes to two specific pg_constraint entries here --
+		 * the one matching the constraint given by riinfo->constraint_id and
+		 * also the one given by riinfo->constraint_root_id.  The latter too
+		 * because if its parent is updated, it is no longer the root
+		 * constraint.
+		 */
+		if (hashvalue == 0 ||
+			riinfo->oidHashValue == hashvalue ||
+			riinfo->rootHashValue == hashvalue)
 		{
 			riinfo->valid = false;
 			/* Remove invalidated entries from the list, too */
diff --git a/src/test/regress/expected/foreign_key.out b/src/test/regress/expected/foreign_key.out
index 07bd5b6434..7386f4d635 100644
--- a/src/test/regress/expected/foreign_key.out
+++ b/src/test/regress/expected/foreign_key.out
@@ -2470,3 +2470,21 @@ DROP SCHEMA fkpart9 CASCADE;
 NOTICE:  drop cascades to 2 other objects
 DETAIL:  drop cascades to table fkpart9.pk
 drop cascades to table fkpart9.fk
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2
diff --git a/src/test/regress/sql/foreign_key.sql b/src/test/regress/sql/foreign_key.sql
index c5c9011afc..67aa20435d 100644
--- a/src/test/regress/sql/foreign_key.sql
+++ b/src/test/regress/sql/foreign_key.sql
@@ -1738,3 +1738,19 @@ DELETE FROM fkpart9.pk WHERE a=35;
 SELECT * FROM fkpart9.pk;
 SELECT * FROM fkpart9.fk;
 DROP SCHEMA fkpart9 CASCADE;
+
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
-- 
2.24.1

#40

Andy Fan

zhihui.fan1213@gmail.com

almost 5 years ago

In reply to: Tom Lane (#38)

Re: Huge memory consumption on partitioned table with FKs

On Fri, Mar 5, 2021 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

Updated patch attached.

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent?

My point below is a bit off-topic, but I want to share it here. Since
we implement a partitioned table in PG with the inherited class, it has much
more flexibility than other databases. Like in PG, we allow different
partitions
have different physical order, different indexes, maybe different index
states.
that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage), have we discussed
limiting some flexibility so that we can have better coding/running
experience?
I want to do some research in this direction, but it would be better that I
can
listen to any advice from others. More specifically, I want to reduce the
memory
usage of Partitioned table/index as the first step. In my testing, each
IndexOptInfo
will use 2kB memory in each backend.

The comment claiming that it's okay if riinfo->fk_attnums
doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers. Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches. But that's what it looks like this patch
will do.

regards, tom lane

--
Best Regards
Andy Fan (https://www.aliyun.com/)

#41

Andy Fan

zhihui.fan1213@gmail.com

almost 5 years ago

In reply to: Andy Fan (#40)

Re: Huge memory consumption on partitioned table with FKs

On Mon, Mar 8, 2021 at 3:43 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Fri, Mar 5, 2021 at 5:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

Updated patch attached.

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

What if the child tables don't have the same physical column numbers
as the parent?

My point below is a bit off-topic, but I want to share it here. Since
we implement a partitioned table in PG with the inherited class, it has
much
more flexibility than other databases. Like in PG, we allow different
partitions
have different physical order, different indexes, maybe different index
states.
that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage), have we discussed
limiting some flexibility so that we can have better coding/running
experience?
I want to do some research in this direction, but it would be better that
I can
listen to any advice from others. More specifically, I want to reduce the
memory
usage of Partitioned table/index as the first step. In my testing, each
IndexOptInfo
will use 2kB memory in each backend.

As for the compatible issue, will it be ok to introduce a new concept
like "
CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add these
limitation to restricted partitioned relation only.

The comment claiming that it's okay if riinfo->fk_attnums

doesn't match seems quite off point, because the query plan is still
going to need to use the correct column numbers. Even if column numbers
are the same, the plan would also contain table and index OIDs that could
only be right for one partition.

I could imagine translating a parent plan to apply to a child instead of
building it from scratch, but that would take a lot of code we don't have
(there's no rewriteHandler infrastructure for plan nodes).

Maybe I'm missing something, because I see that the cfbot claims
this is passing, and I'd sure like to think that our test coverage
is not so thin that it'd fail to detect probing the wrong partition
for foreign key matches. But that's what it looks like this patch
will do.

regards, tom lane

--
Best Regards
Andy Fan (https://www.aliyun.com/)

--
Best Regards
Andy Fan (https://www.aliyun.com/)

#42

amitlangote09@gmail.com

almost 5 years ago

In reply to: Andy Fan (#41)

Re: Huge memory consumption on partitioned table with FKs

Hi Andy,

On Mon, Mar 8, 2021 at 8:39 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Mar 8, 2021 at 3:43 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

My point below is a bit off-topic, but I want to share it here. Since
we implement a partitioned table in PG with the inherited class, it has much
more flexibility than other databases. Like in PG, we allow different partitions
have different physical order, different indexes, maybe different index states.
that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage), have we discussed
limiting some flexibility so that we can have better coding/running experience?
I want to do some research in this direction, but it would be better that I can
listen to any advice from others. More specifically, I want to reduce the memory
usage of Partitioned table/index as the first step. In my testing, each IndexOptInfo
will use 2kB memory in each backend.

As for the compatible issue, will it be ok to introduce a new concept like "
CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add these
limitation to restricted partitioned relation only.

I think you'd agree that the topics you want to discuss deserve a
separate discussion thread. You may refer to this discussion in that
new thread if you think that your proposals can solve the problem
being discussed here more generally, which would of course be great.

--
Amit Langote
EDB: http://www.enterprisedb.com

#43

Andy Fan

zhihui.fan1213@gmail.com

almost 5 years ago

In reply to: Amit Langote (#42)

Re: Huge memory consumption on partitioned table with FKs

On Mon, Mar 8, 2021 at 8:42 PM Amit Langote <amitlangote09@gmail.com> wrote:

Hi Andy,

On Mon, Mar 8, 2021 at 8:39 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Mar 8, 2021 at 3:43 PM Andy Fan <zhihui.fan1213@gmail.com>

wrote:

My point below is a bit off-topic, but I want to share it here. Since
we implement a partitioned table in PG with the inherited class, it has

much

more flexibility than other databases. Like in PG, we allow different

partitions

have different physical order, different indexes, maybe different

index states.

that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage), have we discussed
limiting some flexibility so that we can have better coding/running

experience?

I want to do some research in this direction, but it would be better

that I can

listen to any advice from others. More specifically, I want to reduce

the memory

usage of Partitioned table/index as the first step. In my testing,

each IndexOptInfo

will use 2kB memory in each backend.

As for the compatible issue, will it be ok to introduce a new concept

like "

CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add

these

limitation to restricted partitioned relation only.

I think you'd agree that the topics you want to discuss deserve a
separate discussion thread. You may refer to this discussion in that
new thread if you think that your proposals can solve the problem
being discussed here more generally, which would of course be great.

Sure, I can prepare more data and start a new thread for this.

--
Best Regards
Andy Fan (https://www.aliyun.com/)

#44

amitlangote09@gmail.com

almost 5 years ago

In reply to: Andy Fan (#43)

Re: Huge memory consumption on partitioned table with FKs

On Mon, Mar 8, 2021 at 9:53 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Mar 8, 2021 at 8:42 PM Amit Langote <amitlangote09@gmail.com> wrote:

On Mon, Mar 8, 2021 at 8:39 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

On Mon, Mar 8, 2021 at 3:43 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

My point below is a bit off-topic, but I want to share it here. Since
we implement a partitioned table in PG with the inherited class, it has much
more flexibility than other databases. Like in PG, we allow different partitions
have different physical order, different indexes, maybe different index states.
that would cause our development work hard in many places and cause some
runtime issues as well (like catalog memory usage), have we discussed
limiting some flexibility so that we can have better coding/running experience?
I want to do some research in this direction, but it would be better that I can
listen to any advice from others. More specifically, I want to reduce the memory
usage of Partitioned table/index as the first step. In my testing, each IndexOptInfo
will use 2kB memory in each backend.

As for the compatible issue, will it be ok to introduce a new concept like "
CREATE TABLE p (a int) partitioned by list(a) RESTRICTED". We can add these
limitation to restricted partitioned relation only.

I think you'd agree that the topics you want to discuss deserve a
separate discussion thread. You may refer to this discussion in that
new thread if you think that your proposals can solve the problem
being discussed here more generally, which would of course be great.

Sure, I can prepare more data and start a new thread for this.

Great, thanks.

--
Amit Langote
EDB: http://www.enterprisedb.com

#45

tgl@sss.pgh.pa.us

almost 5 years ago

In reply to: Amit Langote (#39)

Re: Huge memory consumption on partitioned table with FKs

Amit Langote <amitlangote09@gmail.com> writes:

On Fri, Mar 5, 2021 at 6:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

The quoted comment could have been written to be clearer about this,
but it's not talking about the table that is to be queried, but the
table whose RI trigger is being executed. In all the cases except one
(mentioned below), the table that is queried is the same irrespective
of which partition's trigger is being executed, so we're basically
creating the same plan separately for each partition.

Hmm. So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child. We just have to
be sure we pass the right parameter values. But ... doesn't the code
use riinfo->fk_attnums[] to pull out the values to be passed?

IOW, I now get the point about being able to share the SPI plans,
but I'm still dubious about sharing the RI_ConstraintInfo cache entries.

It looks to me like the v4 patch is now actually not sharing the
cache entries, ie their hash key is just the child constraint OID
same as before; but the comments are pretty confused about this.

It might be simpler if you add just one new field which is the OID of
the constraint that we're building the SPI query from, which might be
either equal to constraint_id, or the OID of some parent constraint.
In particular it's not clear to me why we need both constraint_parent
and constraint_root_id.

regards, tom lane

#46

amitlangote09@gmail.com

almost 5 years ago

In reply to: Tom Lane (#45)

1 attachment(s)

Re: Huge memory consumption on partitioned table with FKs

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Fri, Mar 5, 2021 at 6:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

This claim seems false on its face:

All child constraints of a given foreign key constraint can use the
same RI query and the resulting plan, that is, no need to create as
many copies of the query and the plan as there are partitions, as
happens now due to the child constraint OID being used in the key
for ri_query_cache.

The quoted comment could have been written to be clearer about this,
but it's not talking about the table that is to be queried, but the
table whose RI trigger is being executed. In all the cases except one
(mentioned below), the table that is queried is the same irrespective
of which partition's trigger is being executed, so we're basically
creating the same plan separately for each partition.

Hmm. So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child. We just have to
be sure we pass the right parameter values.

Right.

But ... doesn't the code
use riinfo->fk_attnums[] to pull out the values to be passed?

Yes, from a slot that belongs to the child table.

IOW, I now get the point about being able to share the SPI plans,
but I'm still dubious about sharing the RI_ConstraintInfo cache entries.

There was actually a proposal upthread about sharing the
RI_ConstraintInfo too, but we decided to not pursue that for now.

It looks to me like the v4 patch is now actually not sharing the
cache entries, ie their hash key is just the child constraint OID
same as before;

Yeah, you may see that we're only changing ri_BuildQueryKey() in the
patch affecting only ri_query_cache, but not ri_LoadConstraintInfo()
which leaves ri_constraint_cache unaffected.

but the comments are pretty confused about this.

I've tried improving the comment in ri_BuildQueryKey() a bit to make
clear what is and what is not being shared between partitions.

It might be simpler if you add just one new field which is the OID of
the constraint that we're building the SPI query from, which might be
either equal to constraint_id, or the OID of some parent constraint.
In particular it's not clear to me why we need both constraint_parent
and constraint_root_id.

Yeah, I think constraint_parent is a leftover from some earlier
hacking. I have removed it.

Attached updated patch.

--
Amit Langote
EDB: http://www.enterprisedb.com

Attachments:

v5-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patchapplication/octet-stream; name=v5-0001-Share-SPI-plan-between-partitions-in-some-RI-trig.patchDownload

From 5baa25e8f5e33826e48e54bedb9e37b1334acad5 Mon Sep 17 00:00:00 2001
From: amitlan <amitlangote09@gmail.com>
Date: Thu, 23 Apr 2020 21:11:58 +0900
Subject: [PATCH v5] Share SPI plan between partitions in some RI triggers

In all cases except one, the RI triggers of leaf partitions that all
share a given foreign key constraint issue the same query through
SPI, although ri_query_cache stores the plan of the query separately
for each partition.  That is unnecessary and can even lead to a huge
increase in memory allocated to store those plans if the query's
target relation is a partitioned table containing many partitions.
This commit changes the key used by ri_query_cache in such cases to
be the shared foreign key constraint's root OID so that all
partitions sharing the constraint also share the plan.

However, the query issued by ri_Check_Pk_Match() targets the same
table as the one on which the RI trigger is fired, so each leaf
partition should get its own query and the associated plan in that
one case. This commit adds a test for this case.

Keisuke Kuroda, Amit Langote
---
 src/backend/utils/adt/ri_triggers.c       | 72 +++++++++++++++++++++--
 src/test/regress/expected/foreign_key.out | 18 ++++++
 src/test/regress/sql/foreign_key.sql      | 16 +++++
 3 files changed, 100 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 6e3a41062f..6080c6c8e8 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -95,13 +95,19 @@
  * RI_ConstraintInfo
  *
  * Information extracted from an FK pg_constraint entry.  This is cached in
- * ri_constraint_cache.
+ * ri_constraint_cache with constraint_root_id as a part of the hash key.
  */
 typedef struct RI_ConstraintInfo
 {
-	Oid			constraint_id;	/* OID of pg_constraint entry (hash key) */
+	Oid			constraint_id;	/* OID of pg_constraint entry */
+	Oid			constraint_root_id;	/* OID of the constraint in some ancestor
+									 * of the partition (most commonly root
+									 * ancestor) from which this constraint is
+									 * inherited; same as constraint_id if the
+									 * constraint is non-inherited */
 	bool		valid;			/* successfully initialized? */
-	uint32		oidHashValue;	/* hash value of pg_constraint OID */
+	uint32		oidHashValue;	/* hash value of constraint_id */
+	uint32		rootHashValue;	/* hash value of constraint_root_id */
 	NameData	conname;		/* name of the FK constraint */
 	Oid			pk_relid;		/* referenced relation */
 	Oid			fk_relid;		/* referencing relation */
@@ -221,6 +227,7 @@ static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
 							   Relation pk_rel, Relation fk_rel,
 							   TupleTableSlot *violatorslot, TupleDesc tupdesc,
 							   int queryno, bool partgone) pg_attribute_noreturn();
+static Oid get_ri_constraint_root_recurse(Oid constrOid);
 
 
 /*
@@ -1892,7 +1899,7 @@ ri_GenerateQualCollation(StringInfo buf, Oid collation)
  *	Construct a hashtable key for a prepared SPI plan of an FK constraint.
  *
  *		key: output argument, *key is filled in based on the other arguments
- *		riinfo: info from pg_constraint entry
+ *		riinfo: info derived from pg_constraint entry
  *		constr_queryno: an internal number identifying the query type
  *			(see RI_PLAN_XXX constants at head of file)
  * ----------
@@ -1902,10 +1909,26 @@ ri_BuildQueryKey(RI_QueryKey *key, const RI_ConstraintInfo *riinfo,
 				 int32 constr_queryno)
 {
 	/*
+	 * All queries except RI_PLAN_CHECK_LOOKUPPK_FROM_PK can be shared among
+	 * partitions that inherit a given FK constraint (to be precise, the
+	 * constraint's check and action triggers) because they contain the same
+	 * target table to scan, which is the other table involved in the
+	 * constraint.  So we use the root constraint's OID as the key into
+	 * ri_query_cache such that the SPI plan for those queries can likewise
+	 * be shared between partitions.  We must a keep a separate instance of
+	 * the RI_PLAN_CHECK_LOOKUPPK_FROM_PK query for each partition because
+	 * it scans the same table as the table on which the trigger is fired.
+	 * Note that we still maintain a separate RI_ConstraintInfo for each
+	 * partition, because the contents of pk_attnums, fk_attnums arrays
+	 * can be different between partitions.
+	 *
 	 * We assume struct RI_QueryKey contains no padding bytes, else we'd need
 	 * to use memset to clear them.
 	 */
-	key->constr_id = riinfo->constraint_id;
+	if (constr_queryno != RI_PLAN_CHECK_LOOKUPPK_FROM_PK)
+		key->constr_id = riinfo->constraint_root_id;
+	else
+		key->constr_id = riinfo->constraint_id;
 	key->constr_queryno = constr_queryno;
 }
 
@@ -2009,6 +2032,27 @@ ri_FetchConstraintInfo(Trigger *trigger, Relation trig_rel, bool rel_is_pk)
 	return riinfo;
 }
 
+/*
+ * get_ri_constraint_root_recurse
+ *		Recursively finds and returns the OID of the constraint's root parent
+ */
+static Oid
+get_ri_constraint_root_recurse(Oid constrOid)
+{
+	HeapTuple	tuple;
+	Oid			constrParentOid;
+
+	tuple = SearchSysCache1(CONSTROID, ObjectIdGetDatum(constrOid));
+	if (!HeapTupleIsValid(tuple))
+		elog(ERROR, "cache lookup failed for constraint %u", constrOid);
+	constrParentOid = ((Form_pg_constraint) GETSTRUCT(tuple))->conparentid;
+	ReleaseSysCache(tuple);
+	if (OidIsValid(constrParentOid))
+		return get_ri_constraint_root_recurse(constrParentOid);
+
+	return constrOid;
+}
+
 /*
  * Fetch or create the RI_ConstraintInfo struct for an FK constraint.
  */
@@ -2051,8 +2095,15 @@ ri_LoadConstraintInfo(Oid constraintOid)
 
 	/* And extract data */
 	Assert(riinfo->constraint_id == constraintOid);
+	if (OidIsValid(conForm->conparentid))
+		riinfo->constraint_root_id =
+			get_ri_constraint_root_recurse(constraintOid);
+	else
+		riinfo->constraint_root_id = constraintOid;
 	riinfo->oidHashValue = GetSysCacheHashValue1(CONSTROID,
 												 ObjectIdGetDatum(constraintOid));
+	riinfo->rootHashValue = GetSysCacheHashValue1(CONSTROID,
+												  ObjectIdGetDatum(riinfo->constraint_root_id));
 	memcpy(&riinfo->conname, &conForm->conname, sizeof(NameData));
 	riinfo->pk_relid = conForm->confrelid;
 	riinfo->fk_relid = conForm->conrelid;
@@ -2117,7 +2168,16 @@ InvalidateConstraintCacheCallBack(Datum arg, int cacheid, uint32 hashvalue)
 		RI_ConstraintInfo *riinfo = dlist_container(RI_ConstraintInfo,
 													valid_link, iter.cur);
 
-		if (hashvalue == 0 || riinfo->oidHashValue == hashvalue)
+		/*
+		 * We look for changes to two specific pg_constraint entries here --
+		 * the one matching the constraint given by riinfo->constraint_id and
+		 * also the one given by riinfo->constraint_root_id.  The latter too
+		 * because if its parent is updated, it is no longer the root
+		 * constraint.
+		 */
+		if (hashvalue == 0 ||
+			riinfo->oidHashValue == hashvalue ||
+			riinfo->rootHashValue == hashvalue)
 		{
 			riinfo->valid = false;
 			/* Remove invalidated entries from the list, too */
diff --git a/src/test/regress/expected/foreign_key.out b/src/test/regress/expected/foreign_key.out
index 07bd5b6434..7386f4d635 100644
--- a/src/test/regress/expected/foreign_key.out
+++ b/src/test/regress/expected/foreign_key.out
@@ -2470,3 +2470,21 @@ DROP SCHEMA fkpart9 CASCADE;
 NOTICE:  drop cascades to 2 other objects
 DETAIL:  drop cascades to table fkpart9.pk
 drop cascades to table fkpart9.fk
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
+NOTICE:  drop cascades to 2 other objects
+DETAIL:  drop cascades to table fkpart10.tbl1
+drop cascades to table fkpart10.tbl2
diff --git a/src/test/regress/sql/foreign_key.sql b/src/test/regress/sql/foreign_key.sql
index c5c9011afc..67aa20435d 100644
--- a/src/test/regress/sql/foreign_key.sql
+++ b/src/test/regress/sql/foreign_key.sql
@@ -1738,3 +1738,19 @@ DELETE FROM fkpart9.pk WHERE a=35;
 SELECT * FROM fkpart9.pk;
 SELECT * FROM fkpart9.fk;
 DROP SCHEMA fkpart9 CASCADE;
+
+-- test that ri_Check_Pk_Match() scans the correct partition for a deferred
+-- ON DELETE/UPDATE NO ACTION constraint
+CREATE SCHEMA fkpart10
+  CREATE TABLE tbl1(f1 int PRIMARY KEY) PARTITION BY RANGE(f1)
+  CREATE TABLE tbl1_p1 PARTITION OF tbl1 FOR VALUES FROM (minvalue) TO (1)
+  CREATE TABLE tbl1_p2 PARTITION OF tbl1 FOR VALUES FROM (1) TO (maxvalue)
+  CREATE TABLE tbl2(f1 int REFERENCES tbl1 DEFERRABLE INITIALLY DEFERRED);
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+INSERT INTO fkpart10.tbl2 VALUES (0), (1);
+BEGIN;
+DELETE FROM fkpart10.tbl1 WHERE f1 = 0;
+UPDATE fkpart10.tbl1 SET f1 = 2 WHERE f1 = 1;
+INSERT INTO fkpart10.tbl1 VALUES (0), (1);
+COMMIT;
+DROP SCHEMA fkpart10 CASCADE;
-- 
2.24.1

#47

tgl@sss.pgh.pa.us

almost 5 years ago

In reply to: Amit Langote (#46)

Re: Huge memory consumption on partitioned table with FKs

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm. So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child. We just have to
be sure we pass the right parameter values.

Right.

I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.

regards, tom lane

#48

amitlangote09@gmail.com

almost 5 years ago

In reply to: Tom Lane (#47)

Re: Huge memory consumption on partitioned table with FKs

On Thu, Mar 11, 2021 at 4:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm. So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child. We just have to
be sure we pass the right parameter values.

Right.

I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.

Perfect. Thanks for your time on this.

--
Amit Langote
EDB: http://www.enterprisedb.com

#49

keisuke.kuroda.3862@gmail.com

almost 5 years ago

In reply to: Amit Langote (#48)

Re: Huge memory consumption on partitioned table with FKs

Hi hackers,

I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.

Perfect. Thanks for your time on this.

Thank you for your help! I'm glad to solve it.

--
Keisuke Kuroda
NTT Software Innovation Center
keisuke.kuroda.3862@gmail.com

#50

Tatsuro Yamada

tatsuro.yamada.tf@nttcom.co.jp

almost 5 years ago

In reply to: Amit Langote (#48)

Re: Huge memory consumption on partitioned table with FKs

On 2021/03/11 9:39, Amit Langote wrote:

On Thu, Mar 11, 2021 at 4:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Amit Langote <amitlangote09@gmail.com> writes:

On Wed, Mar 10, 2021 at 8:37 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hmm. So, the key point is that the values coming from the partitioned
child table are injected into the test query as parameters, not as
column references, thus it doesn't matter *to the test query* what
numbers the referencing columns have in that child. We just have to
be sure we pass the right parameter values.

Right.

I did some cosmetic fooling with this (mostly, rewriting the comments
YA time) and pushed it.

Perfect. Thanks for your time on this.

Thanks for fixing the problem! :-D

Regards,
Tatsuro Yamada

#51

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=husky&dt=2021-03-10%2021%3A09%3A32

tgl@sss.pgh.pa.us

almost 5 years ago

In reply to: Tatsuro Yamada (#50)

Re: Huge memory consumption on partitioned table with FKs

Tatsuro Yamada <tatsuro.yamada.tf@nttcom.co.jp> writes:

Thanks for fixing the problem! :-D

Hmm, I'm not sure we're done with this patch:

The critical log extract is

2021-03-11 05:10:13.012 CET [21574:1082] pg_regress/foreign_key LOG: statement: insert into fk_notpartitioned_pk (a, b)
select 2048, x from generate_series(1,10) x;
2021-03-11 05:10:13.104 CET [11830:368] LOG: server process (PID 21574) was terminated by signal 11: Segmentation fault
2021-03-11 05:10:13.104 CET [11830:369] DETAIL: Failed process was running: insert into fk_notpartitioned_pk (a, b)
select 2048, x from generate_series(1,10) x;

Now, maybe it's a coincidence that husky failed on a
partitioned-foreign-key test right after this patch went in, but I bet
not. Since husky runs CLOBBER_CACHE_ALWAYS, it looks to me like we've
overlooked some cache-reset scenario or other.

regards, tom lane

#52