altering a column's collation leaves an invalid foreign key
Dear hackers,
I was looking at how foreign keys deal with collations, and I came across this comment about not
re-checking a foreign key if the column type changes in a compatible way:
* Since we require that all collations share the same notion of
* equality (which they do, because texteq reduces to bitwise
* equality), we don't compare collation here.
But now that we have nondeterministic collations, isn't that out of date?
For instance I could make this foreign key:
paul=# create collation itext (provider = 'icu', locale = 'und-u-ks-level1', deterministic = false);
CREATE COLLATION
paul=# create table t1 (id text collate itext primary key);
CREATE TABLE
paul=# create table t2 (id text, parent_id text references t1);
CREATE TABLE
And then:
paul=# insert into t1 values ('a');
INSERT 0 1
paul=# insert into t2 values ('.', 'A');
INSERT 0 1
So far that behavior seems correct, because the user told us 'a' and 'A' were equivalent,
but now I can change the collation on the referenced table and the FK doesn't complain:
paul=# alter table t1 alter column id type text collate "C";
ALTER TABLE
The constraint claims to be valid, but I can't drop & add it:
paul=# alter table t2 drop constraint t2_parent_id_fkey;
ALTER TABLE
paul=# alter table t2 add constraint t2_parent_id_fkey foreign key (parent_id) references t1;
ERROR: insert or update on table "t2" violates foreign key constraint "t2_parent_id_fkey"
DETAIL: Key (parent_id)=(A) is not present in table "t1".
Isn't that a problem?
Perhaps if the previous collation was nondeterministic we should force a re-check.
(Tested on 17devel 697f8d266c and also 16.)
Yours,
--
Paul ~{:-)
pj@illuminatedcomputing.com
On 3/23/24 10:04, Paul Jungwirth wrote:
Perhaps if the previous collation was nondeterministic we should force a re-check.
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.
We have had nondeterministic collations since v12, so perhaps it is something to back-patch?
Yours,
--
Paul ~{:-)
pj@illuminatedcomputing.com
Attachments:
v1-0001-Re-check-foreign-key-if-referenced-collation-was-.patchtext/x-patch; charset=UTF-8; name=v1-0001-Re-check-foreign-key-if-referenced-collation-was-.patchDownload
From a8078f85859400f4e4cffb57d8ec1b069e46bfb8 Mon Sep 17 00:00:00 2001
From: "Paul A. Jungwirth" <pj@illuminatedcomputing.com>
Date: Sun, 24 Mar 2024 23:37:48 -0700
Subject: [PATCH v1] Re-check foreign key if referenced collation was
nondeterministic
With deterministic collations, we break ties by looking at binary
equality. But nondeterministic collations can allow non-identical values
to be considered equal, e.g. 'a' and 'A' when case-insensitive. So some
references that used to be valid may not be anymore.
---
src/backend/commands/tablecmds.c | 96 ++++++++++++++++---
src/include/nodes/parsenodes.h | 2 +
.../regress/expected/collate.icu.utf8.out | 9 ++
src/test/regress/sql/collate.icu.utf8.sql | 8 ++
4 files changed, 103 insertions(+), 12 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 71740984f33..940b1845edb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -194,6 +194,7 @@ typedef struct AlteredTableInfo
/* Objects to rebuild after completing ALTER TYPE operations */
List *changedConstraintOids; /* OIDs of constraints to rebuild */
List *changedConstraintDefs; /* string definitions of same */
+ List *changedCollationOids; /* collation of each attnum */
List *changedIndexOids; /* OIDs of indexes to rebuild */
List *changedIndexDefs; /* string definitions of same */
char *replicaIdentityIndex; /* index to reset as REPLICA IDENTITY */
@@ -572,6 +573,7 @@ static ObjectAddress ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
AlterTableCmd *cmd, LOCKMODE lockmode);
static void RememberAllDependentForRebuilding(AlteredTableInfo *tab, AlterTableType subtype,
Relation rel, AttrNumber attnum, const char *colName);
+static void RememberCollationForRebuilding(AttrNumber attnum, AlteredTableInfo *tab);
static void RememberConstraintForRebuilding(Oid conoid, AlteredTableInfo *tab);
static void RememberIndexForRebuilding(Oid indoid, AlteredTableInfo *tab);
static void RememberStatisticsForRebuilding(Oid stxoid, AlteredTableInfo *tab);
@@ -579,12 +581,12 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
LOCKMODE lockmode);
static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
char *cmd, List **wqueue, LOCKMODE lockmode,
- bool rewrite);
+ bool rewrite, List *collationOids);
static void RebuildConstraintComment(AlteredTableInfo *tab, AlterTablePass pass,
Oid objid, Relation rel, List *domname,
const char *conname);
static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
-static void TryReuseForeignKey(Oid oldId, Constraint *con);
+static void TryReuseForeignKey(Oid oldId, Constraint *con, List *changedCollationOids);
static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
List *options, LOCKMODE lockmode);
static void change_owner_fix_column_acls(Oid relationOid,
@@ -9826,6 +9828,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
bool old_check_ok;
ObjectAddress address;
ListCell *old_pfeqop_item = list_head(fkconstraint->old_conpfeqop);
+ ListCell *old_collation_item = list_head(fkconstraint->old_collations);
/*
* Grab ShareRowExclusiveLock on the pk table, so that someone doesn't
@@ -10212,9 +10215,15 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
CoercionPathType new_pathtype;
Oid old_castfunc;
Oid new_castfunc;
+ Oid old_collation;
Form_pg_attribute attr = TupleDescAttr(tab->oldDesc,
fkattnum[i] - 1);
+ /* Get the collation for this key part. */
+ old_collation = lfirst_oid(old_collation_item);
+ old_collation_item = lnext(fkconstraint->old_collations,
+ old_collation_item);
+
/*
* Identify coercion pathways from each of the old and new FK-side
* column types to the right (foreign) operand type of the pfeqop.
@@ -10250,9 +10259,12 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* turn conform to the domain. Consequently, we need not treat
* domains specially here.
*
- * Since we require that all collations share the same notion of
- * equality (which they do, because texteq reduces to bitwise
- * equality), we don't compare collation here.
+ * All deterministic collations use bitwise equality to resolve
+ * ties, but if the previous collation was nondeterministic,
+ * we must re-check the foreign key, because some references
+ * that use to be "equal" may not be anymore. If we have
+ * InvalidOid here, either there was no collation or the
+ * attribute didn't change.
*
* We need not directly consider the PK type. It's necessarily
* binary coercible to the opcintype of the unique index column,
@@ -10261,6 +10273,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
*/
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
+ (old_collation == InvalidOid ||
+ get_collation_isdeterministic(old_collation)) &&
(!IsPolymorphicType(pfeqop_right) ||
new_fktype == old_fktype));
}
@@ -13985,6 +13999,7 @@ RememberAllDependentForRebuilding(AlteredTableInfo *tab, AlterTableType subtype,
relKind == RELKIND_PARTITIONED_INDEX)
{
Assert(foundObject.objectSubId == 0);
+ RememberCollationForRebuilding(attnum, tab);
RememberIndexForRebuilding(foundObject.objectId, tab);
}
else if (relKind == RELKIND_SEQUENCE)
@@ -14006,6 +14021,7 @@ RememberAllDependentForRebuilding(AlteredTableInfo *tab, AlterTableType subtype,
case OCLASS_CONSTRAINT:
Assert(foundObject.objectSubId == 0);
+ RememberCollationForRebuilding(attnum, tab);
RememberConstraintForRebuilding(foundObject.objectId, tab);
break;
@@ -14188,6 +14204,40 @@ RememberClusterOnForRebuilding(Oid indoid, AlteredTableInfo *tab)
tab->clusterOnIndex = get_rel_name(indoid);
}
+/*
+ * Subroutine for ATExecAlterColumnType: remember what the collations were
+ * for each attribute of the table. This is because if one of them used to be
+ * nondeterministic, foreign keys that referenced that attribute must be
+ * re-checked. We make a list with one entry for each attribute in the table,
+ * so that FKs can easily look them up by attribute number. But only attributes
+ * that are changing get a non-zero value.
+ */
+static void
+RememberCollationForRebuilding(AttrNumber attnum, AlteredTableInfo *tab)
+{
+ Oid typid;
+ int32 typmod;
+ Oid collid;
+ ListCell *lc;
+
+ /* Fill in the list with InvalidOid if this is our first visit */
+ if (tab->changedCollationOids == NIL)
+ {
+ int len = RelationGetNumberOfAttributes(tab->rel);
+ int i;
+
+ for (i = 0; i < len; i++)
+ tab->changedCollationOids = lappend_oid(tab->changedCollationOids,
+ InvalidOid);
+ }
+
+ get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum,
+ &typid, &typmod, &collid);
+
+ lc = list_nth_cell(tab->changedCollationOids, attnum - 1);
+ lfirst_oid(lc) = collid;
+}
+
/*
* Subroutine for ATExecAlterColumnType: remember that a constraint needs
* to be rebuilt (which we might already know).
@@ -14393,7 +14443,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
ATPostAlterTypeParse(oldId, relid, confrelid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->changedCollationOids);
}
forboth(oid_item, tab->changedIndexOids,
def_item, tab->changedIndexDefs)
@@ -14404,7 +14455,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = IndexGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->changedCollationOids);
ObjectAddressSet(obj, RelationRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -14420,7 +14472,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = StatisticsGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->changedCollationOids);
ObjectAddressSet(obj, StatisticExtRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -14483,7 +14536,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
*/
static void
ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
- List **wqueue, LOCKMODE lockmode, bool rewrite)
+ List **wqueue, LOCKMODE lockmode, bool rewrite,
+ List *changedCollationOids)
{
List *raw_parsetree_list;
List *querytree_list;
@@ -14610,7 +14664,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
/* rewriting neither side of a FK */
if (con->contype == CONSTR_FOREIGN &&
!rewrite && tab->rewrite == 0)
- TryReuseForeignKey(oldId, con);
+ TryReuseForeignKey(oldId, con, changedCollationOids);
con->reset_default_tblspc = true;
cmd->subtype = AT_ReAddConstraint;
tab->subcmds[AT_PASS_OLD_CONSTR] =
@@ -14768,20 +14822,22 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
*
* Stash the old P-F equality operator into the Constraint node, for possible
* use by ATAddForeignKeyConstraint() in determining whether revalidation of
- * this constraint can be skipped.
+ * this constraint can be skipped. Also stash the collations.
*/
static void
-TryReuseForeignKey(Oid oldId, Constraint *con)
+TryReuseForeignKey(Oid oldId, Constraint *con, List *changedCollationOids)
{
HeapTuple tup;
Datum adatum;
ArrayType *arr;
Oid *rawarr;
+ AttrNumber *attarr;
int numkeys;
int i;
Assert(con->contype == CONSTR_FOREIGN);
Assert(con->old_conpfeqop == NIL); /* already prepared this node */
+ Assert(con->old_collations == NIL); /* already prepared this node */
tup = SearchSysCache1(CONSTROID, ObjectIdGetDatum(oldId));
if (!HeapTupleIsValid(tup)) /* should not happen */
@@ -14802,6 +14858,22 @@ TryReuseForeignKey(Oid oldId, Constraint *con)
for (i = 0; i < numkeys; i++)
con->old_conpfeqop = lappend_oid(con->old_conpfeqop, rawarr[i]);
+ adatum = SysCacheGetAttrNotNull(CONSTROID, tup,
+ Anum_pg_constraint_conkey);
+ arr = DatumGetArrayTypeP(adatum); /* ensure not toasted */
+ numkeys = ARR_DIMS(arr)[0];
+ /* test follows the one in ri_FetchConstraintInfo() */
+ if (ARR_NDIM(arr) != 1 ||
+ ARR_HASNULL(arr) ||
+ ARR_ELEMTYPE(arr) != INT2OID)
+ elog(ERROR, "conkey is not a 1-D smallint array");
+ attarr = (AttrNumber *) ARR_DATA_PTR(arr);
+
+ /* stash a List of the collation Oids in our Constraint node */
+ for (i = 0; i < numkeys; i++)
+ con->old_collations = lappend_oid(con->old_collations,
+ list_nth_oid(changedCollationOids, attarr[i] - 1));
+
ReleaseSysCache(tup);
}
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b89baef95d3..96c71c75115 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2691,6 +2691,8 @@ typedef struct Constraint
char fk_del_action; /* ON DELETE action */
List *fk_del_set_cols; /* ON DELETE SET NULL/DEFAULT (col1, col2) */
List *old_conpfeqop; /* pg_constraint.conpfeqop of my former self */
+ List *old_collations; /* collations of referenced columns,
+ before change */
Oid old_pktable_oid; /* pg_constraint.confrelid of my former
* self */
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 4b8c8f143f3..4119776f0d8 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1933,6 +1933,15 @@ SELECT * FROM test11fk;
---
(0 rows)
+-- Test altering the collation of the referenced column.
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL: Key (x)=(A) is not present in table "pktable".
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97d78..08a6964950b 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -721,6 +721,14 @@ DELETE FROM test11pk WHERE x = 'abc';
SELECT * FROM test11pk;
SELECT * FROM test11fk;
+-- Test altering the collation of the referenced column.
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
--
2.42.0
On Mon, Mar 25, 2024 at 2:47 PM Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:
On 3/23/24 10:04, Paul Jungwirth wrote:
Perhaps if the previous collation was nondeterministic we should force a re-check.
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.
+ /* test follows the one in ri_FetchConstraintInfo() */
+ if (ARR_NDIM(arr) != 1 ||
+ ARR_HASNULL(arr) ||
+ ARR_ELEMTYPE(arr) != INT2OID)
+ elog(ERROR, "conkey is not a 1-D smallint array");
+ attarr = (AttrNumber *) ARR_DATA_PTR(arr);
+
+ /* stash a List of the collation Oids in our Constraint node */
+ for (i = 0; i < numkeys; i++)
+ con->old_collations = lappend_oid(con->old_collations,
+ list_nth_oid(changedCollationOids, attarr[i] - 1));
I don't understand the "ri_FetchConstraintInfo" comment.
+static void
+RememberCollationForRebuilding(AttrNumber attnum, AlteredTableInfo *tab)
+{
+ Oid typid;
+ int32 typmod;
+ Oid collid;
+ ListCell *lc;
+
+ /* Fill in the list with InvalidOid if this is our first visit */
+ if (tab->changedCollationOids == NIL)
+ {
+ int len = RelationGetNumberOfAttributes(tab->rel);
+ int i;
+
+ for (i = 0; i < len; i++)
+ tab->changedCollationOids = lappend_oid(tab->changedCollationOids,
+ InvalidOid);
+ }
+
+ get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum,
+ &typid, &typmod, &collid);
+
+ lc = list_nth_cell(tab->changedCollationOids, attnum - 1);
+ lfirst_oid(lc) = collid;
+}
do we need to check if `collid` is a valid collation?
like:
if (!OidIsValid(collid))
{
lc = list_nth_cell(tab->changedCollationOids, attnum - 1);
lfirst_oid(lc) = collid;
}
On Tue, Mar 26, 2024 at 1:00 PM jian he <jian.universality@gmail.com> wrote:
On Mon, Mar 25, 2024 at 2:47 PM Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:On 3/23/24 10:04, Paul Jungwirth wrote:
Perhaps if the previous collation was nondeterministic we should force a re-check.
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.+ /* test follows the one in ri_FetchConstraintInfo() */ + if (ARR_NDIM(arr) != 1 || + ARR_HASNULL(arr) || + ARR_ELEMTYPE(arr) != INT2OID) + elog(ERROR, "conkey is not a 1-D smallint array"); + attarr = (AttrNumber *) ARR_DATA_PTR(arr); + + /* stash a List of the collation Oids in our Constraint node */ + for (i = 0; i < numkeys; i++) + con->old_collations = lappend_oid(con->old_collations, + list_nth_oid(changedCollationOids, attarr[i] - 1));I don't understand the "ri_FetchConstraintInfo" comment.
I still don't understand this comment.
+static void +RememberCollationForRebuilding(AttrNumber attnum, AlteredTableInfo *tab) +{ + Oid typid; + int32 typmod; + Oid collid; + ListCell *lc; + + /* Fill in the list with InvalidOid if this is our first visit */ + if (tab->changedCollationOids == NIL) + { + int len = RelationGetNumberOfAttributes(tab->rel); + int i; + + for (i = 0; i < len; i++) + tab->changedCollationOids = lappend_oid(tab->changedCollationOids, + InvalidOid); + } + + get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum, + &typid, &typmod, &collid); + + lc = list_nth_cell(tab->changedCollationOids, attnum - 1); + lfirst_oid(lc) = collid; +}do we need to check if `collid` is a valid collation?
like:if (!OidIsValid(collid))
{
lc = list_nth_cell(tab->changedCollationOids, attnum - 1);
lfirst_oid(lc) = collid;
}
now I think RememberCollationForRebuilding is fine. no need further change.
in ATAddForeignKeyConstraint.
if (old_check_ok)
{
/*
* When a pfeqop changes, revalidate the constraint. We could
* permit intra-opfamily changes, but that adds subtle complexity
* without any concrete benefit for core types. We need not
* assess ppeqop or ffeqop, which RI_Initial_Check() does not use.
*/
old_check_ok = (pfeqop == lfirst_oid(old_pfeqop_item));
old_pfeqop_item = lnext(fkconstraint->old_conpfeqop,
old_pfeqop_item);
}
maybe we can do the logic right below it. like:
if (old_check_ok)
{
* All deterministic collations use bitwise equality to resolve
* ties, but if the previous collation was nondeterministic,
* we must re-check the foreign key, because some references
* that use to be "equal" may not be anymore. If we have
* InvalidOid here, either there was no collation or the
* attribute didn't change.
old_check_ok = (old_collation == InvalidOid ||
get_collation_isdeterministic(old_collation));
}
then we don't need to cram it with the other `if (old_check_ok){}`.
do we need to add an entry in
https://wiki.postgresql.org/wiki/PostgreSQL_17_Open_Items#Open_Issues
?
On Fri, Apr 12, 2024 at 5:06 PM jian he <jian.universality@gmail.com> wrote:
On Tue, Mar 26, 2024 at 1:00 PM jian he <jian.universality@gmail.com> wrote:
On Mon, Mar 25, 2024 at 2:47 PM Paul Jungwirth
<pj@illuminatedcomputing.com> wrote:On 3/23/24 10:04, Paul Jungwirth wrote:
Perhaps if the previous collation was nondeterministic we should force a re-check.
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.
I think I found a simple way.
the logic is:
* ATExecAlterColumnType changes one column once at a time.
* one column can only have one collation. so we don't need to store a
list of collation oid.
* ATExecAlterColumnType we can get the new collation (targetcollid)
and original collation info.
* RememberAllDependentForRebuilding will check the column dependent,
whether this column is referenced by a foreign key or not information
is recorded.
so AlteredTableInfo->changedConstraintOids have the primary key and
foreign key oids.
* ATRewriteCatalogs will call ATPostAlterTypeCleanup (see the comments
in ATRewriteCatalogs)
* for tab->changedConstraintOids (foreign key, primary key) will call
ATPostAlterTypeParse, so
for foreign key (con->contype == CONSTR_FOREIGN) will call TryReuseForeignKey.
* in TryReuseForeignKey, we can pass the information that our primary
key old collation is nondeterministic
and old collation != new collation to the foreign key constraint.
so foreign key can act accordingly at ATAddForeignKeyConstraint (old_check_ok).
based on the above logic, I add one bool in struct AlteredTableInfo,
one bool in struct Constraint.
bool in AlteredTableInfo is for storing it, later passing it to struct
Constraint.
we need bool in Constraint because ATAddForeignKeyConstraint needs it.
Attachments:
v3-0001-ALTER-COLUMN-SET-DATA-TYPE-Re-check-foreign-key-t.patchtext/x-patch; charset=US-ASCII; name=v3-0001-ALTER-COLUMN-SET-DATA-TYPE-Re-check-foreign-key-t.patchDownload
From 0eb0846450cb33996d6fb35c19290d297c26fd3c Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Sat, 13 Apr 2024 20:28:25 +0800
Subject: [PATCH v3 1/1] ALTER COLUMN SET DATA TYPE Re-check foreign key ties
under certain condition
with deterministic collations, we check ties by looking at binary
equality. But nondeterministic collation can allow non-identical values
to be considered equal, e.g. 'a' and 'A' when case-insensitive.
ALTER COLUMN .. SET DATA TYPE make
some references that used to be valid may not be anymore.
for `ALTER COLUMN .. SET DATA TYPE`:
If the changed key columns is part of the primary key columns,
then under the following condition
we need to revalidate the foreign key constraint:
* the primary key is referenced by a foreign key constraint
* the new collation is not the same as the old
* the old collation is nondeterministic.
discussion: https://postgr.es/m/78d824e0-b21e-480d-a252-e4b84bc2c24b%40illuminatedcomputing.com
---
src/backend/commands/tablecmds.c | 62 +++++++++++++++----
src/include/nodes/parsenodes.h | 3 +-
.../regress/expected/collate.icu.utf8.out | 9 +++
src/test/regress/sql/collate.icu.utf8.sql | 8 +++
4 files changed, 69 insertions(+), 13 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 000212f2..d5bd61f1 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -198,6 +198,7 @@ typedef struct AlteredTableInfo
/* Objects to rebuild after completing ALTER TYPE operations */
List *changedConstraintOids; /* OIDs of constraints to rebuild */
List *changedConstraintDefs; /* string definitions of same */
+ bool verify_new_collation; /* T if we should recheck new collation for foreign key */
List *changedIndexOids; /* OIDs of indexes to rebuild */
List *changedIndexDefs; /* string definitions of same */
char *replicaIdentityIndex; /* index to reset as REPLICA IDENTITY */
@@ -583,12 +584,13 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
LOCKMODE lockmode);
static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
char *cmd, List **wqueue, LOCKMODE lockmode,
- bool rewrite);
+ bool rewrite,
+ bool verify_new_collation);
static void RebuildConstraintComment(AlteredTableInfo *tab, AlterTablePass pass,
Oid objid, Relation rel, List *domname,
const char *conname);
static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
-static void TryReuseForeignKey(Oid oldId, Constraint *con);
+static void TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation);
static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
List *options, LOCKMODE lockmode);
static void change_owner_fix_column_acls(Oid relationOid,
@@ -10256,6 +10258,15 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
old_pfeqop_item);
}
if (old_check_ok)
+ {
+ /* also see TryReuseForeignKey.
+ * collation_recheck is true means that, we need to
+ * recheck the foreign key side value is ok with the
+ * new primey key collation.
+ */
+ old_check_ok = !fkconstraint->collation_recheck;
+ }
+ if (old_check_ok)
{
Oid old_fktype;
Oid new_fktype;
@@ -10301,10 +10312,6 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* turn conform to the domain. Consequently, we need not treat
* domains specially here.
*
- * Since we require that all collations share the same notion of
- * equality (which they do, because texteq reduces to bitwise
- * equality), we don't compare collation here.
- *
* We need not directly consider the PK type. It's necessarily
* binary coercible to the opcintype of the unique index column,
* and ri_triggers.c will only deal with PK datums in terms of
@@ -13738,6 +13745,30 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
/* And the collation */
targetcollid = GetColumnDefCollation(NULL, def, targettype);
+ if (OidIsValid(targetcollid))
+ {
+ Oid typid;
+ int32 typmod;
+ Oid collid;
+
+ get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum,
+ &typid, &typmod, &collid);
+ /*
+ * All deterministic collations use bitwise equality to resolve
+ * ties, but if the previous(source) collation was indeterminstic,
+ * and previous collation is not the same as the target collation then
+ * we must re-check the foreign key, because some references
+ * that use to be "equal" may not be anymore. we first store this
+ * information in AlteredTableInfo, later we pass it to foreign key
+ * Constraint.
+ * also see ATAddForeignKeyConstraint.
+ */
+ if (OidIsValid(collid))
+ {
+ if (!get_collation_isdeterministic(collid) && targetcollid != collid)
+ tab->verify_new_collation = true;
+ }
+ }
/*
* If there is a default expression for the column, get it and ensure we
* can coerce it to the new datatype. (We must do this before changing
@@ -14406,7 +14437,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
ATPostAlterTypeParse(oldId, relid, confrelid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
}
forboth(oid_item, tab->changedIndexOids,
def_item, tab->changedIndexDefs)
@@ -14417,7 +14449,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = IndexGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
ObjectAddressSet(obj, RelationRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -14433,7 +14466,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = StatisticsGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
ObjectAddressSet(obj, StatisticExtRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -14496,7 +14530,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
*/
static void
ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
- List **wqueue, LOCKMODE lockmode, bool rewrite)
+ List **wqueue, LOCKMODE lockmode, bool rewrite,
+ bool verify_new_collation)
{
List *raw_parsetree_list;
List *querytree_list;
@@ -14623,7 +14658,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
/* rewriting neither side of a FK */
if (con->contype == CONSTR_FOREIGN &&
!rewrite && tab->rewrite == 0)
- TryReuseForeignKey(oldId, con);
+ TryReuseForeignKey(oldId, con, verify_new_collation);
con->reset_default_tblspc = true;
cmd->subtype = AT_ReAddConstraint;
tab->subcmds[AT_PASS_OLD_CONSTR] =
@@ -14784,7 +14819,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
* this constraint can be skipped.
*/
static void
-TryReuseForeignKey(Oid oldId, Constraint *con)
+TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation)
{
HeapTuple tup;
Datum adatum;
@@ -14816,6 +14851,9 @@ TryReuseForeignKey(Oid oldId, Constraint *con)
con->old_conpfeqop = lappend_oid(con->old_conpfeqop, rawarr[i]);
ReleaseSysCache(tup);
+
+ /* verify_new_collation is computed at ATExecAlterColumnType */
+ con->collation_recheck = verify_new_collation;
}
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f763f790..0349c61c 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2777,7 +2777,8 @@ typedef struct Constraint
List *old_conpfeqop; /* pg_constraint.conpfeqop of my former self */
Oid old_pktable_oid; /* pg_constraint.confrelid of my former
* self */
-
+ bool collation_recheck; /* does primary key columns collation change need
+ * foreign key constraint recheck */
ParseLoc location; /* token location, or -1 if unknown */
} Constraint;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 4b8c8f14..4119776f 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1933,6 +1933,15 @@ SELECT * FROM test11fk;
---
(0 rows)
+-- Test altering the collation of the referenced column.
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL: Key (x)=(A) is not present in table "pktable".
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97..08a69649 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -721,6 +721,14 @@ DELETE FROM test11pk WHERE x = 'abc';
SELECT * FROM test11pk;
SELECT * FROM test11fk;
+-- Test altering the collation of the referenced column.
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
base-commit: 4cc1c76fe9f13aa96bae14f4fcfdf6d508af72a4
--
2.34.1
On Sat, Apr 13, 2024 at 9:13 PM jian he <jian.universality@gmail.com> wrote:
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.I think I found a simple way.
the logic is:
* ATExecAlterColumnType changes one column once at a time.
* one column can only have one collation. so we don't need to store a
list of collation oid.
* ATExecAlterColumnType we can get the new collation (targetcollid)
and original collation info.
* RememberAllDependentForRebuilding will check the column dependent,
whether this column is referenced by a foreign key or not information
is recorded.
so AlteredTableInfo->changedConstraintOids have the primary key and
foreign key oids.
* ATRewriteCatalogs will call ATPostAlterTypeCleanup (see the comments
in ATRewriteCatalogs)
* for tab->changedConstraintOids (foreign key, primary key) will call
ATPostAlterTypeParse, so
for foreign key (con->contype == CONSTR_FOREIGN) will call TryReuseForeignKey.
* in TryReuseForeignKey, we can pass the information that our primary
key old collation is nondeterministic
and old collation != new collation to the foreign key constraint.
so foreign key can act accordingly at ATAddForeignKeyConstraint (old_check_ok).based on the above logic, I add one bool in struct AlteredTableInfo,
one bool in struct Constraint.
bool in AlteredTableInfo is for storing it, later passing it to struct
Constraint.
we need bool in Constraint because ATAddForeignKeyConstraint needs it.
I refactored the comments.
also added some extra tests hoping to make it more bullet proof, maybe
it's redundant.
Attachments:
v4-0001-Revalidate-PK-FK-ties-when-PK-collation-is-change.patchtext/x-patch; charset=US-ASCII; name=v4-0001-Revalidate-PK-FK-ties-when-PK-collation-is-change.patchDownload
From 9a2e73231e037c7555449d8790885c1feb4f3ed0 Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Fri, 7 Jun 2024 14:32:55 +0800
Subject: [PATCH v4 1/1] Revalidate PK-FK ties when PK collation is changed
With deterministic collations, we check the PK-FK ties by looking at binary
equality. But nondeterministic collation can allow non-identical values
to be considered equal, e.g. 'a' and 'A' when case-insensitive.
ALTER COLUMN .. SET DATA TYPE make
some references that used to be valid may not be anymore.
for `ALTER COLUMN .. SET DATA TYPE`:
If the changed key column is part of the primary key columns,
then under the following conditions we need to revalidate the PK-FK ties
* the PK new collation is different from the old
* the old collation is nondeterministic.
we need to revalidate the PK-FK ties.
discussion: https://postgr.es/m/78d824e0-b21e-480d-a252-e4b84bc2c24b%40illuminatedcomputing.com
---
src/backend/commands/tablecmds.c | 61 +++++++++++++++----
src/include/nodes/parsenodes.h | 3 +-
.../regress/expected/collate.icu.utf8.out | 18 ++++++
src/test/regress/sql/collate.icu.utf8.sql | 16 +++++
4 files changed, 85 insertions(+), 13 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 7b6c69b7..77337b6f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -199,6 +199,7 @@ typedef struct AlteredTableInfo
/* Objects to rebuild after completing ALTER TYPE operations */
List *changedConstraintOids; /* OIDs of constraints to rebuild */
List *changedConstraintDefs; /* string definitions of same */
+ bool verify_new_collation; /* T if we need recheck the new collation */
List *changedIndexOids; /* OIDs of indexes to rebuild */
List *changedIndexDefs; /* string definitions of same */
char *replicaIdentityIndex; /* index to reset as REPLICA IDENTITY */
@@ -572,12 +573,13 @@ static void ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab,
LOCKMODE lockmode);
static void ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId,
char *cmd, List **wqueue, LOCKMODE lockmode,
- bool rewrite);
+ bool rewrite,
+ bool verify_new_collation);
static void RebuildConstraintComment(AlteredTableInfo *tab, AlterTablePass pass,
Oid objid, Relation rel, List *domname,
const char *conname);
static void TryReuseIndex(Oid oldId, IndexStmt *stmt);
-static void TryReuseForeignKey(Oid oldId, Constraint *con);
+static void TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation);
static ObjectAddress ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
List *options, LOCKMODE lockmode);
static void change_owner_fix_column_acls(Oid relationOid,
@@ -9857,6 +9859,15 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
old_pfeqop_item);
}
if (old_check_ok)
+ {
+ /*
+ * collation_recheck is true then we need to
+ * revalidate the foreign key and primary key value ties.
+ * see also TryReuseForeignKey.
+ */
+ old_check_ok = !fkconstraint->collation_recheck;
+ }
+ if (old_check_ok)
{
Oid old_fktype;
Oid new_fktype;
@@ -9902,10 +9913,6 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* turn conform to the domain. Consequently, we need not treat
* domains specially here.
*
- * Since we require that all collations share the same notion of
- * equality (which they do, because texteq reduces to bitwise
- * equality), we don't compare collation here.
- *
* We need not directly consider the PK type. It's necessarily
* binary coercible to the opcintype of the unique index column,
* and ri_triggers.c will only deal with PK datums in terms of
@@ -13035,6 +13042,29 @@ ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
/* And the collation */
targetcollid = GetColumnDefCollation(NULL, def, targettype);
+ if (OidIsValid(targetcollid))
+ {
+ Oid typid;
+ int32 typmod;
+ Oid source_collid;
+
+ get_atttypetypmodcoll(RelationGetRelid(tab->rel), attnum,
+ &typid, &typmod, &source_collid);
+ /*
+ * All deterministic collations use bitwise equality to resolve
+ * PK-FK ties. But if the primary key (source) collation was indeterministic,
+ * and ALTER COLUMN .. SET DATA TYPE changes the primary key collation, then
+ * we must re-check the PK-FK ties, because some references
+ * that used to be "equal" may not be anymore.
+ * We add a flag in AlteredTableInfo to signify
+ * that we need to verify the new collation for PK-FK ties.
+ * see ATAddForeignKeyConstraint also.
+ */
+ if (OidIsValid(source_collid) &&
+ !get_collation_isdeterministic(source_collid) &&
+ (targetcollid != source_collid))
+ tab->verify_new_collation = true;
+ }
/*
* If there is a default expression for the column, get it and ensure we
* can coerce it to the new datatype. (We must do this before changing
@@ -13742,7 +13772,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
ATPostAlterTypeParse(oldId, relid, confrelid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
}
forboth(oid_item, tab->changedIndexOids,
def_item, tab->changedIndexDefs)
@@ -13753,7 +13784,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = IndexGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
ObjectAddressSet(obj, RelationRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -13769,7 +13801,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
relid = StatisticsGetRelation(oldId, false);
ATPostAlterTypeParse(oldId, relid, InvalidOid,
(char *) lfirst(def_item),
- wqueue, lockmode, tab->rewrite);
+ wqueue, lockmode, tab->rewrite,
+ tab->verify_new_collation);
ObjectAddressSet(obj, StatisticExtRelationId, oldId);
add_exact_object_address(&obj, objects);
@@ -13832,7 +13865,8 @@ ATPostAlterTypeCleanup(List **wqueue, AlteredTableInfo *tab, LOCKMODE lockmode)
*/
static void
ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
- List **wqueue, LOCKMODE lockmode, bool rewrite)
+ List **wqueue, LOCKMODE lockmode, bool rewrite,
+ bool verify_new_collation)
{
List *raw_parsetree_list;
List *querytree_list;
@@ -13959,7 +13993,7 @@ ATPostAlterTypeParse(Oid oldId, Oid oldRelId, Oid refRelId, char *cmd,
/* rewriting neither side of a FK */
if (con->contype == CONSTR_FOREIGN &&
!rewrite && tab->rewrite == 0)
- TryReuseForeignKey(oldId, con);
+ TryReuseForeignKey(oldId, con, verify_new_collation);
con->reset_default_tblspc = true;
cmd->subtype = AT_ReAddConstraint;
tab->subcmds[AT_PASS_OLD_CONSTR] =
@@ -14119,7 +14153,7 @@ TryReuseIndex(Oid oldId, IndexStmt *stmt)
* this constraint can be skipped.
*/
static void
-TryReuseForeignKey(Oid oldId, Constraint *con)
+TryReuseForeignKey(Oid oldId, Constraint *con, bool verify_new_collation)
{
HeapTuple tup;
Datum adatum;
@@ -14151,6 +14185,9 @@ TryReuseForeignKey(Oid oldId, Constraint *con)
con->old_conpfeqop = lappend_oid(con->old_conpfeqop, rawarr[i]);
ReleaseSysCache(tup);
+
+ /* verify_new_collation is computed at ATExecAlterColumnType */
+ con->collation_recheck = verify_new_collation;
}
/*
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index ddfed02d..29b484f5 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -2773,7 +2773,8 @@ typedef struct Constraint
List *old_conpfeqop; /* pg_constraint.conpfeqop of my former self */
Oid old_pktable_oid; /* pg_constraint.confrelid of my former
* self */
-
+ bool collation_recheck; /* does primary key columns collation changes need
+ * foreign key constraint recheck */
ParseLoc location; /* token location, or -1 if unknown */
} Constraint;
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 7d59fb44..ea28031b 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1933,6 +1933,24 @@ SELECT * FROM test11fk;
---
(0 rows)
+-- Test altering the collation of the referenced column.
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL: Key (x)=(A) is not present in table "pktable".
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accents;
+ERROR: insert or update on table "fktable" violates foreign key constraint "fktable_x_fkey"
+DETAIL: Key (x)=(A) is not present in table "pktable".
+-- should ok:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE case_insensitive;
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accent_case;
+DROP TABLE fktable, pktable;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97..dcdbbc48 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -721,6 +721,22 @@ DELETE FROM test11pk WHERE x = 'abc';
SELECT * FROM test11pk;
SELECT * FROM test11fk;
+-- Test altering the collation of the referenced column.
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE fktable (x text REFERENCES pktable);
+INSERT INTO pktable VALUES ('a');
+INSERT INTO fktable VALUES ('A');
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE "C";
+
+-- should fail:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accents;
+
+-- should ok:
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE case_insensitive;
+ALTER TABLE pktable ALTER COLUMN x TYPE text COLLATE ignore_accent_case;
+DROP TABLE fktable, pktable;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
--
2.34.1
jian he <jian.universality@gmail.com> writes:
* in TryReuseForeignKey, we can pass the information that our primary
key old collation is nondeterministic
and old collation != new collation to the foreign key constraint.
I have a basic question about this: why are we allowing FKs to be
based on nondeterministic collations at all? ISTM that that breaks
the assumption that there is exactly one referenced row for any
referencing row.
regards, tom lane
On Sat, Jun 8, 2024 at 4:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
jian he <jian.universality@gmail.com> writes:
* in TryReuseForeignKey, we can pass the information that our primary
key old collation is nondeterministic
and old collation != new collation to the foreign key constraint.I have a basic question about this: why are we allowing FKs to be
based on nondeterministic collations at all? ISTM that that breaks
the assumption that there is exactly one referenced row for any
referencing row.
for FKs nondeterministic,
I think that would require the PRIMARY KEY collation to not be
indeterministic also.
for example:
CREATE COLLATION ignore_accent_case (provider = icu, deterministic =
false, locale = 'und-u-ks-level1');
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE ignore_accent_case PRIMARY KEY);
CREATE TABLE fktable (x text REFERENCES pktable on update cascade on
delete cascade);
INSERT INTO pktable VALUES ('A');
INSERT INTO fktable VALUES ('a');
INSERT INTO fktable VALUES ('A');
update pktable set x = 'Å';
table fktable;
if FK is nondeterministic, then it looks PK more like FK.
the following example, one FK row is referenced by two PK rows.
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE ignore_accent_case REFERENCES
pktable on update cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');
begin; delete from pktable where x = 'Å'; TABLE fktable; rollback;
begin; delete from pktable where x = 'A'; TABLE fktable; rollback;
On 08.06.24 06:14, jian he wrote:
if FK is nondeterministic, then it looks PK more like FK.
the following example, one FK row is referenced by two PK rows.DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE ignore_accent_case REFERENCES
pktable on update cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');
Yes, this is a problem. The RI checks are done with the collation of
the primary key.
The comment at ri_GenerateQualCollation() says "the SQL standard
specifies that RI comparisons should use the referenced column's
collation". But this is not what it says in my current copy.
... [ digs around ISO archives ] ...
Yes, this was changed in SQL:2016 to require the collation on the PK
side and the FK side to match at constraint creation time. The argument
made is exactly the same we have here. This was actually already the
rule in SQL:1999 but was then relaxed in SQL:2003 and then changed back
because it was a mistake.
We probably don't need to enforce this for deterministic collations,
which would preserve some backward compatibility.
I'll think some more about what steps to take to solve this and what to
do about back branches etc.
On Tue, Jun 18, 2024 at 4:50 PM Peter Eisentraut <peter@eisentraut.org> wrote:
On 08.06.24 06:14, jian he wrote:
if FK is nondeterministic, then it looks PK more like FK.
the following example, one FK row is referenced by two PK rows.DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE ignore_accent_case REFERENCES
pktable on update cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');Yes, this is a problem. The RI checks are done with the collation of
the primary key.The comment at ri_GenerateQualCollation() says "the SQL standard
specifies that RI comparisons should use the referenced column's
collation". But this is not what it says in my current copy.... [ digs around ISO archives ] ...
Yes, this was changed in SQL:2016 to require the collation on the PK
side and the FK side to match at constraint creation time. The argument
made is exactly the same we have here. This was actually already the
rule in SQL:1999 but was then relaxed in SQL:2003 and then changed back
because it was a mistake.We probably don't need to enforce this for deterministic collations,
which would preserve some backward compatibility.I'll think some more about what steps to take to solve this and what to
do about back branches etc.
I have come up with 3 corner cases.
---case1. not ok. PK indeterministic, FK default
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE ignore_accent_case PRIMARY KEY);
CREATE TABLE fktable (x text REFERENCES pktable on update cascade on
delete cascade);
INSERT INTO pktable VALUES ('A');
INSERT INTO fktable VALUES ('a');
INSERT INTO fktable VALUES ('A');
RI_FKey_check (Check foreign key existence ) querybuf.data is
SELECT 1 FROM ONLY "public"."pktable" x WHERE "x"
OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x
in here, fktable doesn't have collation.
we cannot rewrite to
SELECT 1 FROM ONLY "public"."pktable" x WHERE "x"
OPERATOR(pg_catalog.=) $1 collate "C" FOR KEY SHARE OF x
so assumption (only one referenced row for any referencing row) will
break when inserting values to fktable.
RI_FKey_check already allows invalidate values to happen, not sure how
ri_GenerateQualCollation can help.
overall i don't know how to stop invalid value while inserting value
to fktable in this case.
---case2. not ok case PK deterministic, FK indeterministic
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE ignore_accent_case REFERENCES
pktable on update cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');
begin; update pktable set x = 'B' where x = 'Å'; table fktable; rollback;
begin; update pktable set x = 'B' where x = 'A'; table fktable; rollback;
when cascade update fktable, in RI_FKey_cascade_upd
we can use pktable's collation. but again, a query updating fktable
only, using pktable collation seems strange.
---case3. ---not ok case PK indeterministic, FK deterministic
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE ignore_accent_case PRIMARY KEY);
CREATE TABLE fktable (x text collate "C" REFERENCES pktable on update
cascade on delete cascade);
INSERT INTO pktable VALUES ('A');
INSERT INTO fktable VALUES ('A');
INSERT INTO fktable VALUES ('a');
begin; update pktable set x = 'Å'; table fktable; rollback;
when we "INSERT INTO fktable VALUES ('a');"
we can disallow this invalid query in RI_FKey_check by constructing
the following stop query
SELECT 1 FROM ONLY public.pktable x WHERE x OPERATOR(pg_catalog.=) 'a'
collate "C" FOR KEY SHARE OF x;
but this query associated PK table with fktable collation seems weird?
summary:
case1 seems hard to resolve
case2 can be resolved in ri_GenerateQualCollation, not 100% sure.
case3 can be resolved in RI_FKey_check
because case1 is hard to resolve;
so overall I feel like erroring out PK indeterministic and FK
indeterministic while creating foreign keys is easier.
We can mandate foreign keys and primary key columns be deterministic
in ATAddForeignKeyConstraint.
The attached patch does that.
That means src/test/regress/sql/collate.icu.utf8.sql table test10pk,
table test11pk will have a big change.
so only attach src/backend/commands/tablecmds.c changes.
Attachments:
make_pk_fk_tie_collation_deterministic.difftext/x-patch; charset=US-ASCII; name=make_pk_fk_tie_collation_deterministic.diffDownload
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 721d2478..491ba87c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -9757,6 +9757,39 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_FOREIGN_KEY),
errmsg("number of referencing and referenced columns for foreign key disagree")));
+ for (i = 0; i < numpks; i++)
+ {
+ int32 keycoltypmod;
+ Oid keycoltype;
+ Oid keycolcollation;
+
+ get_atttypetypmodcoll(RelationGetRelid(pkrel), pkattnum[i],
+ &keycoltype, &keycoltypmod,
+ &keycolcollation);
+ if (OidIsValid(keycolcollation) &&
+ !get_collation_isdeterministic(keycolcollation))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented",
+ fkconstraint->conname),
+ errdetail("Cannot have indeterministic collations in primary key for referenced table \"%s\"",
+ RelationGetRelationName(pkrel)),
+ errhint("You may make the primary key collations deterministic.")));
+
+ get_atttypetypmodcoll(RelationGetRelid(rel), fkattnum[i],
+ &keycoltype, &keycoltypmod,
+ &keycolcollation);
+ if (OidIsValid(keycolcollation) &&
+ !get_collation_isdeterministic(keycolcollation))
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented",
+ fkconstraint->conname),
+ errdetail("Foreign key cannot use indeterministic collations for referencing table \"%s\"",
+ RelationGetRelationName(rel)),
+ errhint("You may make the foreign key collations deterministic.")));
+ }
+
/*
* On the strength of a previous constraint, we might avoid scanning
* tables to validate this one. See below.
On 07.06.24 08:39, jian he wrote:
On Sat, Apr 13, 2024 at 9:13 PM jian he <jian.universality@gmail.com> wrote:
Here is a patch implementing this. It was a bit more fuss than I expected, so maybe someone has a
better way.I think I found a simple way.
the logic is:
* ATExecAlterColumnType changes one column once at a time.
* one column can only have one collation. so we don't need to store a
list of collation oid.
* ATExecAlterColumnType we can get the new collation (targetcollid)
and original collation info.
* RememberAllDependentForRebuilding will check the column dependent,
whether this column is referenced by a foreign key or not information
is recorded.
so AlteredTableInfo->changedConstraintOids have the primary key and
foreign key oids.
* ATRewriteCatalogs will call ATPostAlterTypeCleanup (see the comments
in ATRewriteCatalogs)
* for tab->changedConstraintOids (foreign key, primary key) will call
ATPostAlterTypeParse, so
for foreign key (con->contype == CONSTR_FOREIGN) will call TryReuseForeignKey.
* in TryReuseForeignKey, we can pass the information that our primary
key old collation is nondeterministic
and old collation != new collation to the foreign key constraint.
so foreign key can act accordingly at ATAddForeignKeyConstraint (old_check_ok).based on the above logic, I add one bool in struct AlteredTableInfo,
one bool in struct Constraint.
bool in AlteredTableInfo is for storing it, later passing it to struct
Constraint.
we need bool in Constraint because ATAddForeignKeyConstraint needs it.I refactored the comments.
also added some extra tests hoping to make it more bullet proof, maybe
it's redundant.
I like this patch version (v4). It's the simplest, probably also
easiest to backpatch.
It has a flaw: It will also trigger a FK recheck if you alter the
collation of the referencing column (foreign key column), even though
that is not necessary. (Note that your tests and the examples in this
thread only discuss altering the PK column collation, because that is
what is actually used during the foreign key checks.) Maybe there is an
easy way to avoid that, but I couldn't see one in that patch structure.
Maybe that is ok as a compromise. If, in the future, we make it a
requirement that the collations on the PK and FK side have to be the
same if either collation is nondeterministic, then this case can no
longer happen anyway. And so building more infrastructure for this
check might be wasted.
On Tue, Sep 3, 2024 at 5:41 PM Peter Eisentraut <peter@eisentraut.org> wrote:
I like this patch version (v4). It's the simplest, probably also
easiest to backpatch.
I am actually confused.
In this email thread [1]/messages/by-id/CACJufxEW6OMBqt8cbr=3Jt++Zd_SL-4YDjfk7Q7DhGKiSLcu4g@mail.gmail.com, I listed 3 corn cases.
I thought all these 3 corner cases should not be allowed.
but V4 didn't solve these corner case issues.
what do you think of their corner case, should it be allowed?
Anyway, I thought these corner cases should not be allowed to happen,
so I made sure PK, FK ties related collation were deterministic.
PK can have indeterministic collation as long as it does not interact with FK.
[1]: /messages/by-id/CACJufxEW6OMBqt8cbr=3Jt++Zd_SL-4YDjfk7Q7DhGKiSLcu4g@mail.gmail.com
Attachments:
v5-0001-ensure-PK-FK-ties-related-collation-indeterminist.patchtext/x-patch; charset=US-ASCII; name=v5-0001-ensure-PK-FK-ties-related-collation-indeterminist.patchDownload
From 024d818e0e286af10b74649bd3156836d6f28447 Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Wed, 4 Sep 2024 14:31:43 +0800
Subject: [PATCH v5 1/1] ensure PK-FK ties related collation indeterministic
PK can have indeterministic collation as long as it not interact with FK.
To make the PK-FK tie consistent, we need to ensure both side's collation is
deterministic, otherwise, the following two some anomalitys can happen.
1. PK side collation is indeterministic, multiple form FK values equal to one PK
value.
2. FK side collation is indeterministic, multiple form PK values equal to one FK
value.
discussion: https://postgr.es/m/78d824e0-b21e-480d-a252-e4b84bc2c24b%40illuminatedcomputing.com
---
src/backend/commands/tablecmds.c | 65 ++++++++
.../regress/expected/collate.icu.utf8.out | 142 +++++++-----------
src/test/regress/sql/collate.icu.utf8.sql | 60 ++++----
3 files changed, 151 insertions(+), 116 deletions(-)
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index b3cc6f8f69..6c951ae36e 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -9705,6 +9705,71 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
(errcode(ERRCODE_INVALID_FOREIGN_KEY),
errmsg("number of referencing and referenced columns for foreign key disagree")));
+ for (i = 0; i < numpks; i++)
+ {
+ int32 keycoltypmod;
+ Oid keycoltype;
+ Oid keycolcollation;
+ Oid typcollation;
+ bool col_coll_deterministic = true;
+
+ /*
+ * for both foreign key and primary key, we need make sure the specified
+ * keys collation (via COLLATE clause) is deterministic. we also need to
+ * ensure the key's typcollation is indeterministic (for domain type).
+ *
+ * PK and FK both are not indeterministic, then anomality can happen:
+ * 1. PK side collation is indeterministic, multiple form FK values
+ * equal to one PK value.
+ * 2. FK side collation is indeterministic, multiple form PK values
+ * equal to one FK value.
+ *
+ */
+
+ /* checking PK collation deterministic */
+ get_atttypetypmodcoll(RelationGetRelid(pkrel), pkattnum[i],
+ &keycoltype, &keycoltypmod,
+ &keycolcollation);
+
+ typcollation = get_typcollation(keycoltype);
+ if (OidIsValid(typcollation))
+ col_coll_deterministic = get_collation_isdeterministic(typcollation);
+
+ if (col_coll_deterministic && OidIsValid(keycolcollation))
+ col_coll_deterministic = get_collation_isdeterministic(keycolcollation);
+
+ if (!col_coll_deterministic)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented",
+ fkconstraint->conname),
+ errdetail("Cannot have indeterministic collations in primary key for referenced table \"%s\".",
+ RelationGetRelationName(pkrel)),
+ errhint("You may need to make primary key constraint's collation deterministic.")));
+
+
+ /* checking PK collation deterministic */
+ get_atttypetypmodcoll(RelationGetRelid(rel), fkattnum[i],
+ &keycoltype, &keycoltypmod,
+ &keycolcollation);
+
+ typcollation = get_typcollation(keycoltype);
+ if (OidIsValid(typcollation))
+ col_coll_deterministic = get_collation_isdeterministic(typcollation);
+
+ if (col_coll_deterministic && OidIsValid(keycolcollation))
+ col_coll_deterministic = get_collation_isdeterministic(keycolcollation);
+
+ if (!col_coll_deterministic)
+ ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented",
+ fkconstraint->conname),
+ errdetail("Foreign key constraint cannot use indeterministic collation for referencing table \"%s\".",
+ RelationGetRelationName(rel)),
+ errhint("You may need to make the foreign key constraint's collation deterministic.")));
+ }
+
/*
* On the strength of a previous constraint, we might avoid scanning
* tables to validate this one. See below.
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 31345295c1..5074812099 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1841,100 +1841,62 @@ SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
(1 row)
-- foreign keys (should use collation of primary key)
--- PK is case-sensitive, FK is case-insensitive
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE DOMAIN text_d_ignore_accent_case AS text COLLATE ignore_accent_case;
+-- PK is case-sensitive (deterministic), FK is case-insensitive (indeterministic).
+-- not OK, multiple PK-side value equal to one FK-side value not allowed
+-- e.g. not allowed case: PK values {'a', 'A"} both equal to FK value {'A'}
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-INSERT INTO test10fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test10pk".
-SELECT * FROM test10pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test10fk;
- x
----
-(0 rows)
-
--- PK is case-insensitive, FK is case-sensitive
+ERROR: foreign key constraint "test10fk_x_fkey" cannot be implemented
+DETAIL: Foreign key constraint cannot use indeterministic collation for referencing table "test10fk".
+HINT: You may need to make the foreign key constraint's collation deterministic.
+-- PK is case-insensitive (indeterministic), FK is case-sensitive (deterministic)
+-- more than one referencing (FK) row value equal to referenced (PK) row should not allowed
+-- e.g. not allowed case: FK values {'a', 'A"} both equal to PK value {'A'}
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test11fk" violates foreign key constraint "test11fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test11pk".
-SELECT * FROM test11pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test11fk;
- x
------
- abc
- ABC
-(2 rows)
-
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
- x
------
- ABC
- ABC
-(2 rows)
-
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test11fk;
- x
----
-(0 rows)
-
+ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
+DETAIL: Cannot have indeterministic collations in primary key for referenced table "test11pk".
+HINT: You may need to make primary key constraint's collation deterministic.
+drop table test11pk;
+CREATE TABLE test12pk (x text PRIMARY KEY);
+CREATE TABLE test12fk (x text REFERENCES test12pk on update cascade on delete cascade);
+--alter table change primary key's collation to indeterministic, should fail.
+alter table test12pk alter column x set data type text COLLATE ignore_accent_case;
+ERROR: foreign key constraint "test12fk_x_fkey" cannot be implemented
+DETAIL: Cannot have indeterministic collations in primary key for referenced table "test12pk".
+HINT: You may need to make primary key constraint's collation deterministic.
+alter table test12pk alter column x set data type text_d_ignore_accent_case;
+ERROR: foreign key constraint "test12fk_x_fkey" cannot be implemented
+DETAIL: Cannot have indeterministic collations in primary key for referenced table "test12pk".
+HINT: You may need to make primary key constraint's collation deterministic.
+---should also fail.
+alter table test12pk alter column x set data type text_d_ignore_accent_case COLLATE "C";
+ERROR: foreign key constraint "test12fk_x_fkey" cannot be implemented
+DETAIL: Cannot have indeterministic collations in primary key for referenced table "test12pk".
+HINT: You may need to make primary key constraint's collation deterministic.
+drop table test12pk,test12fk;
+--alter table change foreign key's collation, all these case should fail.
+CREATE TABLE test13pk (x text PRIMARY KEY);
+CREATE TABLE test13fk (x text REFERENCES test13pk on update cascade on delete cascade);
+alter table test13fk alter column x set data type text COLLATE ignore_accent_case;
+ERROR: foreign key constraint "test13fk_x_fkey" cannot be implemented
+DETAIL: Foreign key constraint cannot use indeterministic collation for referencing table "test13fk".
+HINT: You may need to make the foreign key constraint's collation deterministic.
+alter table test13fk alter column x set data type text_d_ignore_accent_case;
+ERROR: foreign key constraint "test13fk_x_fkey" cannot be implemented
+DETAIL: Foreign key constraint cannot use indeterministic collation for referencing table "test13fk".
+HINT: You may need to make the foreign key constraint's collation deterministic.
+alter table test13fk alter column x set data type text_d_ignore_accent_case COLLATE ignore_accent_case;
+ERROR: foreign key constraint "test13fk_x_fkey" cannot be implemented
+DETAIL: Foreign key constraint cannot use indeterministic collation for referencing table "test13fk".
+HINT: You may need to make the foreign key constraint's collation deterministic.
+alter table test13fk alter column x set data type text_d_ignore_accent_case COLLATE "C";
+ERROR: foreign key constraint "test13fk_x_fkey" cannot be implemented
+DETAIL: Foreign key constraint cannot use indeterministic collation for referencing table "test13fk".
+HINT: You may need to make the foreign key constraint's collation deterministic.
+drop table test13pk,test13fk;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97d7..ad418e87eb 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -688,38 +688,46 @@ SELECT * FROM test4 WHERE b = 'Cote' COLLATE ignore_accents; -- still case-sens
SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
-- foreign keys (should use collation of primary key)
+CREATE COLLATION ignore_accent_case (provider = icu, deterministic = false, locale = 'und-u-ks-level1');
+CREATE DOMAIN text_d_ignore_accent_case AS text COLLATE ignore_accent_case;
--- PK is case-sensitive, FK is case-insensitive
+-- PK is case-sensitive (deterministic), FK is case-insensitive (indeterministic).
+-- not OK, multiple PK-side value equal to one FK-side value not allowed
+-- e.g. not allowed case: PK values {'a', 'A"} both equal to FK value {'A'}
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-INSERT INTO test10fk VALUES ('xyz'); -- error
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-SELECT * FROM test10fk;
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
--- PK is case-insensitive, FK is case-sensitive
+
+-- PK is case-insensitive (indeterministic), FK is case-sensitive (deterministic)
+-- more than one referencing (FK) row value equal to referenced (PK) row should not allowed
+-- e.g. not allowed case: FK values {'a', 'A"} both equal to PK value {'A'}
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
+drop table test11pk;
+
+
+
+CREATE TABLE test12pk (x text PRIMARY KEY);
+CREATE TABLE test12fk (x text REFERENCES test12pk on update cascade on delete cascade);
+
+--alter table change primary key's collation to indeterministic, should fail.
+alter table test12pk alter column x set data type text COLLATE ignore_accent_case;
+alter table test12pk alter column x set data type text_d_ignore_accent_case;
+
+---should also fail.
+alter table test12pk alter column x set data type text_d_ignore_accent_case COLLATE "C";
+drop table test12pk,test12fk;
+
+
+
+--alter table change foreign key's collation, all these case should fail.
+CREATE TABLE test13pk (x text PRIMARY KEY);
+CREATE TABLE test13fk (x text REFERENCES test13pk on update cascade on delete cascade);
+alter table test13fk alter column x set data type text COLLATE ignore_accent_case;
+alter table test13fk alter column x set data type text_d_ignore_accent_case;
+alter table test13fk alter column x set data type text_d_ignore_accent_case COLLATE ignore_accent_case;
+alter table test13fk alter column x set data type text_d_ignore_accent_case COLLATE "C";
+drop table test13pk,test13fk;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
base-commit: 31a98934d1699007b50aefaedd68ab4d6b42e3b4
--
2.34.1
On 04.09.24 08:54, jian he wrote:
On Tue, Sep 3, 2024 at 5:41 PM Peter Eisentraut <peter@eisentraut.org> wrote:
I like this patch version (v4). It's the simplest, probably also
easiest to backpatch.
I am actually confused.
In this email thread [1], I listed 3 corn cases.
I thought all these 3 corner cases should not be allowed.
but V4 didn't solve these corner case issues.
what do you think of their corner case, should it be allowed?
Anyway, I thought these corner cases should not be allowed to happen,
so I made sure PK, FK ties related collation were deterministic.
PK can have indeterministic collation as long as it does not interact with FK.
I had thought we could at first do a limited patch that just prevents
the problematic collation changes that Paul pointed out initially, and
then backpatch that, and then develop a more complete solution for
master. But after thinking about this a bit more, such a limited patch
might just be some partial whack-a-mole that would still leave open
problems (as you have pointed out), and it also looks like such a patch
wouldn't be any simpler than the complete solution.
So I took the v5 patch you had posted and started working from there.
The rule that you had picked isn't quite what we want, I think. It's
okay to have nondeterministic collations on foreign keys, as long as the
collation is the same on both sides. That's what I have implemented.
See attached.
This approach also allows cleaning up a bunch of hackiness in
ri_triggers.c, which feels satisfying.
I don't know what to do about backpatching though. The patch itself
appears to backpatch ok. (There are some cosmetic issues that need
manual intervention, but the code structure is pretty consistent going
back.) But that kind of behavior change, I don't know. Also, for the
most part, the existing setup works, it's only if you do some particular
table alterations that you can construct problems.
We could make the error a warning instead, so people know that what they
are building is problematic and deprecated.
But in either case, or even with some of the other approaches discussed
in previous patch versions, such as v4, you'd only get that warning or
error if you do table DDL. If you'd just upgrade the minor release, you
don't get any info. So we'd also have to construct some query to check
for this and create some release note guidance.
So for the moment this is a master-only patch. I think once we have
tied down the behavior we want for the future, we can then see how we
can nudge older versions in that direction.
Attachments:
v6-0001-Fix-collation-handling-for-foreign-keys.patchtext/plain; charset=UTF-8; name=v6-0001-Fix-collation-handling-for-foreign-keys.patchDownload
From 0cdd04de9775adcae2dd6980d3161cf69076643a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 17 Oct 2024 12:25:08 +0200
Subject: [PATCH v6] Fix collation handling for foreign keys
Allowing foreign keys where the referenced and the referencing columns
have collations with different notions of equality is problematic.
This can only happen when using nondeterministic collations, for
example, if the referenced column is case-insensitive and the
referencing column is not, or vice versa. It does not happen if both
collations are deterministic.
XXX maybe some examples here?
When nondeterministic collations were implemented, the SQL standard
available to yours truly said that referential integrity checks should
be performed with the collation of the referenced column, and so
that's how we implemented it. But this turned out to be a mistake in
the SQL standard, for the same reasons as above, that was later
(SQL:2016) fixed to require both collations to be the same. So that's
what we are aiming for here.
We don't have to be quite so strict. We can allow different
collations if they are both deterministic. This is also good for
backward compatibility.
So the new rule is that the collations either have to be the same or
both deterministic. Or in other words, if one of them is
nondeterministic, then both have to be the same.
Some of the nice test cases for the previous situation in
collate.icu.utf8.sql are now obsolete. They are changed to just check
the error checking of the new rule. Note that collate.sql already
contained a test for foreign keys with different deterministic
colllations.
A bunch of code in ri_triggers.c that added a COLLATE clause to
enforce the referenced column's collation can be removed, because both
columns now have to have the same notion of equality, so it doesn't
matter which one to use.
Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com
---
doc/src/sgml/ref/create_table.sgml | 7 ++
src/backend/commands/tablecmds.c | 73 ++++++++++---
src/backend/utils/adt/ri_triggers.c | 55 +++-------
.../regress/expected/collate.icu.utf8.out | 100 ++----------------
src/test/regress/sql/collate.icu.utf8.sql | 35 +-----
5 files changed, 96 insertions(+), 174 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 83859bac76f..dd7446117bc 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1199,6 +1199,13 @@ <title>Parameters</title>
must have its final column marked <literal>WITHOUT OVERLAPS</literal>.
</para>
+ <para>
+ For each pair of referencing and referenced column, if they are of a
+ collatable data type, then the collations must either be both
+ deterministic or else both the same. This ensures that both columns
+ have a consistent notion of equality.
+ </para>
+
<para>
The user
must have <literal>REFERENCES</literal> permission on the referenced
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 1ccc80087c3..85cd2f0f170 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -389,10 +389,10 @@ static ObjectAddress ATExecValidateConstraint(List **wqueue,
Relation rel, char *constrName,
bool recurse, bool recursing, LOCKMODE lockmode);
static int transformColumnNameList(Oid relId, List *colList,
- int16 *attnums, Oid *atttypids);
+ int16 *attnums, Oid *atttypids, Oid *attcollids);
static int transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
List **attnamelist,
- int16 *attnums, Oid *atttypids,
+ int16 *attnums, Oid *atttypids, Oid *attcollids,
Oid *opclasses, bool *pk_has_without_overlaps);
static Oid transformFkeyCheckAttrs(Relation pkrel,
int numattrs, int16 *attnums,
@@ -9570,6 +9570,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
int16 fkattnum[INDEX_MAX_KEYS] = {0};
Oid pktypoid[INDEX_MAX_KEYS] = {0};
Oid fktypoid[INDEX_MAX_KEYS] = {0};
+ Oid pkcolloid[INDEX_MAX_KEYS] = {0};
+ Oid fkcolloid[INDEX_MAX_KEYS] = {0};
Oid opclasses[INDEX_MAX_KEYS] = {0};
Oid pfeqoperators[INDEX_MAX_KEYS] = {0};
Oid ppeqoperators[INDEX_MAX_KEYS] = {0};
@@ -9670,7 +9672,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
*/
numfks = transformColumnNameList(RelationGetRelid(rel),
fkconstraint->fk_attrs,
- fkattnum, fktypoid);
+ fkattnum, fktypoid, fkcolloid);
with_period = fkconstraint->fk_with_period || fkconstraint->pk_with_period;
if (with_period && !fkconstraint->fk_with_period)
ereport(ERROR,
@@ -9679,7 +9681,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
numfkdelsetcols = transformColumnNameList(RelationGetRelid(rel),
fkconstraint->fk_del_set_cols,
- fkdelsetcols, NULL);
+ fkdelsetcols, NULL, NULL);
validateFkOnDeleteSetColumns(numfks, fkattnum,
numfkdelsetcols, fkdelsetcols,
fkconstraint->fk_del_set_cols);
@@ -9694,7 +9696,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
{
numpks = transformFkeyGetPrimaryKey(pkrel, &indexOid,
&fkconstraint->pk_attrs,
- pkattnum, pktypoid,
+ pkattnum, pktypoid, pkcolloid,
opclasses, &pk_has_without_overlaps);
/* If the primary key uses WITHOUT OVERLAPS, the fk must use PERIOD */
@@ -9707,7 +9709,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
{
numpks = transformColumnNameList(RelationGetRelid(pkrel),
fkconstraint->pk_attrs,
- pkattnum, pktypoid);
+ pkattnum, pktypoid, pkcolloid);
/* Since we got pk_attrs, one should be a period. */
if (with_period && !fkconstraint->pk_with_period)
@@ -9809,6 +9811,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
Oid pktype = pktypoid[i];
Oid fktype = fktypoid[i];
Oid fktyped;
+ Oid pkcoll = pkcolloid[i];
+ Oid fkcoll = fkcolloid[i];
HeapTuple cla_ht;
Form_pg_opclass cla_tup;
Oid amid;
@@ -9951,6 +9955,40 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
format_type_be(fktype),
format_type_be(pktype))));
+ /*
+ * This shouldn't be possible, but better check to make sure we have a
+ * consistent state for the check below.
+ */
+ if ((pkcoll && !fkcoll) || (!pkcoll && fkcoll))
+ elog(ERROR, "key columns are not both collatable");
+
+ if (pkcoll && fkcoll)
+ {
+ bool pkcolldet;
+ bool fkcolldet;
+
+ pkcolldet = get_collation_isdeterministic(pkcoll);
+ fkcolldet = get_collation_isdeterministic(fkcoll);
+
+ /*
+ * SQL requires that both collations are the same. This is
+ * because we need a consistent notion of equality on both
+ * columns. We relax this by allowing different collations if
+ * they are both deterministic. (This is also for backward
+ * compatibility, because PostgreSQL has always allowed this.)
+ */
+ if ((!pkcolldet || !fkcolldet) && pkcoll != fkcoll)
+ ereport(ERROR,
+ (errcode(ERRCODE_COLLATION_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented", fkconstraint->conname),
+ errdetail("Key columns \"%s\" and \"%s\" have incompatible collations: \"%s\" and \"%s\". "
+ "If either collation is nondeterministic, then both collations have to be the same.",
+ strVal(list_nth(fkconstraint->fk_attrs, i)),
+ strVal(list_nth(fkconstraint->pk_attrs, i)),
+ get_collation_name(fkcoll),
+ get_collation_name(pkcoll))));
+ }
+
if (old_check_ok)
{
/*
@@ -9971,6 +10009,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
CoercionPathType new_pathtype;
Oid old_castfunc;
Oid new_castfunc;
+ Oid old_fkcoll;
+ Oid new_fkcoll;
Form_pg_attribute attr = TupleDescAttr(tab->oldDesc,
fkattnum[i] - 1);
@@ -9986,6 +10026,9 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
new_pathtype = findFkeyCast(pfeqop_right, new_fktype,
&new_castfunc);
+ old_fkcoll = attr->attcollation;
+ new_fkcoll = fkcoll;
+
/*
* Upon a change to the cast from the FK column to its pfeqop
* operand, revalidate the constraint. For this evaluation, a
@@ -10009,9 +10052,10 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* turn conform to the domain. Consequently, we need not treat
* domains specially here.
*
- * Since we require that all collations share the same notion of
- * equality (which they do, because texteq reduces to bitwise
- * equality), we don't compare collation here.
+ * If the collation changes, revalidation is required, unless both
+ * collations are deterministic, because those share the same
+ * notion of equality (because texteq reduces to bitwise
+ * equality).
*
* We need not directly consider the PK type. It's necessarily
* binary coercible to the opcintype of the unique index column,
@@ -10021,7 +10065,9 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
- new_fktype == old_fktype));
+ new_fktype == old_fktype) &&
+ (new_fkcoll == old_fkcoll ||
+ (get_collation_isdeterministic(old_fkcoll) && get_collation_isdeterministic(new_fkcoll))));
}
pfeqoperators[i] = pfeqop;
@@ -11883,7 +11929,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
*/
static int
transformColumnNameList(Oid relId, List *colList,
- int16 *attnums, Oid *atttypids)
+ int16 *attnums, Oid *atttypids, Oid *attcollids)
{
ListCell *l;
int attnum;
@@ -11914,6 +11960,8 @@ transformColumnNameList(Oid relId, List *colList,
attnums[attnum] = attform->attnum;
if (atttypids != NULL)
atttypids[attnum] = attform->atttypid;
+ if (attcollids != NULL)
+ attcollids[attnum] = attform->attcollation;
ReleaseSysCache(atttuple);
attnum++;
}
@@ -11937,7 +11985,7 @@ transformColumnNameList(Oid relId, List *colList,
static int
transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
List **attnamelist,
- int16 *attnums, Oid *atttypids,
+ int16 *attnums, Oid *atttypids, Oid *attcollids,
Oid *opclasses, bool *pk_has_without_overlaps)
{
List *indexoidlist;
@@ -12011,6 +12059,7 @@ transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
attnums[i] = pkattno;
atttypids[i] = attnumTypeId(pkrel, pkattno);
+ attcollids[i] = attnumCollationId(pkrel, pkattno);
opclasses[i] = indclass->values[i];
*attnamelist = lappend(*attnamelist,
makeString(pstrdup(NameStr(*attnumAttName(pkrel, pkattno)))));
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 6896e1ae638..1738cb3371b 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -207,7 +207,7 @@ static void ri_BuildQueryKey(RI_QueryKey *key,
int32 constr_queryno);
static bool ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
const RI_ConstraintInfo *riinfo, bool rel_is_pk);
-static bool ri_CompareWithCast(Oid eq_opr, Oid typeid,
+static bool ri_CompareWithCast(Oid eq_opr, Oid typeid, Oid collid,
Datum lhs, Datum rhs);
static void ri_InitHashTables(void);
@@ -776,8 +776,6 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -786,8 +784,6 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = "AND";
queryoids[i] = pk_type;
}
@@ -881,8 +877,6 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -891,8 +885,6 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = "AND";
queryoids[i] = pk_type;
}
@@ -996,8 +988,6 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -1009,8 +999,6 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = ",";
qualsep = "AND";
queryoids[i] = pk_type;
@@ -1232,8 +1220,6 @@ ri_set(TriggerData *trigdata, bool is_set_null, int tgkind)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -1243,8 +1229,6 @@ ri_set(TriggerData *trigdata, bool is_set_null, int tgkind)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
qualsep = "AND";
queryoids[i] = pk_type;
}
@@ -1998,19 +1982,16 @@ ri_GenerateQual(StringInfo buf,
/*
* ri_GenerateQualCollation --- add a COLLATE spec to a WHERE clause
*
- * At present, we intentionally do not use this function for RI queries that
- * compare a variable to a $n parameter. Since parameter symbols always have
- * default collation, the effect will be to use the variable's collation.
- * Now that is only strictly correct when testing the referenced column, since
- * the SQL standard specifies that RI comparisons should use the referenced
- * column's collation. However, so long as all collations have the same
- * notion of equality (which they do, because texteq reduces to bitwise
- * equality), there's no visible semantic impact from using the referencing
- * column's collation when testing it, and this is a good thing to do because
- * it lets us use a normal index on the referencing column. However, we do
- * have to use this function when directly comparing the referencing and
- * referenced columns, if they are of different collations; else the parser
- * will fail to resolve the collation to use.
+ * We only have to use this function when directly comparing the referencing
+ * and referenced columns, if they are of different collations; else the
+ * parser will fail to resolve the collation to use. We don't need to use
+ * this function for RI queries that compare a variable to a $n parameter.
+ * Since parameter symbols always have default collation, the effect will be
+ * to use the variable's collation.
+ *
+ * Note that we require that the collations of the referencing and the
+ * referenced column have the some notion of equality: Either they have to
+ * both be deterministic or else they both have to be the same.
*/
static void
ri_GenerateQualCollation(StringInfo buf, Oid collation)
@@ -2952,7 +2933,7 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
* operator. Changes that compare equal will still satisfy the
* constraint after the update.
*/
- if (!ri_CompareWithCast(eq_opr, RIAttType(rel, attnums[i]),
+ if (!ri_CompareWithCast(eq_opr, RIAttType(rel, attnums[i]), RIAttCollation(rel, attnums[i]),
newvalue, oldvalue))
return false;
}
@@ -2972,7 +2953,7 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
* NB: we have already checked that neither value is null.
*/
static bool
-ri_CompareWithCast(Oid eq_opr, Oid typeid,
+ri_CompareWithCast(Oid eq_opr, Oid typeid, Oid collid,
Datum lhs, Datum rhs)
{
RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, typeid);
@@ -2998,9 +2979,9 @@ ri_CompareWithCast(Oid eq_opr, Oid typeid,
* on the other side of a foreign-key constraint. Therefore, the
* comparison here would need to be done with the collation of the *other*
* table. For simplicity (e.g., we might not even have the other table
- * open), we'll just use the default collation here, which could lead to
- * some false negatives. All this would break if we ever allow
- * database-wide collations to be nondeterministic.
+ * open), we'll use our own collation. This is fine because we require
+ * that both collations have the same notion of equality (either they are
+ * both deterministic or else they are both the same).
*
* With range/multirangetypes, the collation of the base type is stored as
* part of the rangetype (pg_range.rngcollation), and always used, so
@@ -3008,9 +2989,7 @@ ri_CompareWithCast(Oid eq_opr, Oid typeid,
* But if we support arbitrary types with PERIOD, we should perhaps just
* always force a re-check.
*/
- return DatumGetBool(FunctionCall2Coll(&entry->eq_opr_finfo,
- DEFAULT_COLLATION_OID,
- lhs, rhs));
+ return DatumGetBool(FunctionCall2Coll(&entry->eq_opr_finfo, collid, lhs, rhs));
}
/*
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index faa376e060c..7bfdad5b099 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1840,101 +1840,15 @@ SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
1 | cote
(1 row)
--- foreign keys (should use collation of primary key)
--- PK is case-sensitive, FK is case-insensitive
+-- foreign keys (mixing different nondeterministic collations not allowed)
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-INSERT INTO test10fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test10pk".
-SELECT * FROM test10pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test10fk;
- x
----
-(0 rows)
-
--- PK is case-insensitive, FK is case-sensitive
+CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+ERROR: foreign key constraint "test10fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" and "x" have incompatible collations: "case_insensitive" and "case_sensitive". If either collation is nondeterministic, then both collations have to be the same.
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test11fk" violates foreign key constraint "test11fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test11pk".
-SELECT * FROM test11pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test11fk;
- x
------
- abc
- ABC
-(2 rows)
-
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
- x
------
- ABC
- ABC
-(2 rows)
-
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test11fk;
- x
----
-(0 rows)
-
+CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" and "x" have incompatible collations: "case_sensitive" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97d78..31f6aff0acd 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -687,39 +687,12 @@ CREATE TABLE test4 (a int, b text);
SELECT * FROM test4 WHERE b = 'Cote' COLLATE ignore_accents; -- still case-sensitive
SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
--- foreign keys (should use collation of primary key)
-
--- PK is case-sensitive, FK is case-insensitive
+-- foreign keys (mixing different nondeterministic collations not allowed)
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-INSERT INTO test10fk VALUES ('xyz'); -- error
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-SELECT * FROM test10fk;
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
-
--- PK is case-insensitive, FK is case-sensitive
+CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
+CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
base-commit: 665785d85f8b9a7250802d7fadd84841bb0aafa1
--
2.47.0
On Thu, Oct 17, 2024 at 7:14 PM Peter Eisentraut <peter@eisentraut.org> wrote:
So I took the v5 patch you had posted and started working from there.
The rule that you had picked isn't quite what we want, I think. It's
okay to have nondeterministic collations on foreign keys, as long as the
collation is the same on both sides. That's what I have implemented.
See attached.This approach also allows cleaning up a bunch of hackiness in
ri_triggers.c, which feels satisfying.
yech, i missed FK, PK both are nondeterministic but with same collation OID.
your work is more neat.
However FK, PK same nondeterministic collation OID have implications
for ri_KeysEqual.
ri_KeysEqual definitely deserves some comments.
for rel_is_pk, the equality is collation agnostic;
for rel_is_pk is false, the equality is collation aware.
for example:
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update restrict on delete restrict);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';
ERROR: update or delete on table "pktable" violates foreign key
constraint "fktable_x_fkey" on table "fktable"
DETAIL: Key (x)=(A) is still referenced from table "fktable".
this should not happen?
If so, the below change can solve the problem.
@@ -2930,6 +2915,16 @@ ri_KeysEqual(Relation rel, TupleTableSlot
*oldslot, TupleTableSlot *newslot,
*/
Form_pg_attribute att =
TupleDescAttr(oldslot->tts_tupleDescriptor, attnums[i] - 1);
+ Oid collation = RIAttCollation(rel, attnums[i]);
+ if (OidIsValid(collation) &&
!get_collation_isdeterministic(collation))
+ {
+ Oid eq_opr;
+ bool result;
+ eq_opr = riinfo->pp_eq_oprs[i];
+ result = ri_CompareWithCast(eq_opr, RIAttType(rel, attnums[i]),
+ collation, newvalue, oldvalue);
+ return result;
+ }
if (!datum_image_eq(oldvalue, newvalue, att->attbyval,
att->attlen))
return false;
The above change will make the ri_KeysEqual equality coalition aware
regardless rel_is_pk's value.
to see the effect, we can test it BEFORE and AFTER applying the above
ri_KeysEqual changes
with the attached sql script.
Attachments:
So for the moment this is a master-only patch. I think once we have
tied down the behavior we want for the future
/*
* ri_GenerateQualCollation --- add a COLLATE spec to a WHERE clause
*
- * At present, we intentionally do not use this function for RI queries that
- * compare a variable to a $n parameter. Since parameter symbols always have
- * default collation, the effect will be to use the variable's collation.
- * Now that is only strictly correct when testing the referenced column, since
- * the SQL standard specifies that RI comparisons should use the referenced
- * column's collation. However, so long as all collations have the same
- * notion of equality (which they do, because texteq reduces to bitwise
- * equality), there's no visible semantic impact from using the referencing
- * column's collation when testing it, and this is a good thing to do because
- * it lets us use a normal index on the referencing column. However, we do
- * have to use this function when directly comparing the referencing and
- * referenced columns, if they are of different collations; else the parser
- * will fail to resolve the collation to use.
+ * We only have to use this function when directly comparing the referencing
+ * and referenced columns, if they are of different collations; else the
+ * parser will fail to resolve the collation to use. We don't need to use
+ * this function for RI queries that compare a variable to a $n parameter.
+ * Since parameter symbols always have default collation, the effect will be
+ * to use the variable's collation.
+ *
+ * Note that we require that the collations of the referencing and the
+ * referenced column have the some notion of equality: Either they have to
+ * both be deterministic or else they both have to be the same.
*/
" some notion of equality" should it be "same notion of equality"?
maybe we can add "see ATAddForeignKeyConstraint also"
at the end of the whole comment.
/*
* transformColumnNameList - transform list of column names
*
* Lookup each name and return its attnum and, optionally, type OID
*
* Note: the name of this function suggests that it's general-purpose,
* but actually it's only used to look up names appearing in foreign-key
* clauses. The error messages would need work to use it in other cases,
* and perhaps the validity checks as well.
*/
static int
transformColumnNameList(Oid relId, List *colList,
int16 *attnums, Oid *atttypids, Oid *attcollids)
comments need slight adjustment?
comments in transformFkeyGetPrimaryKey also need slight change?
ri_CompareWithCast, we can aslo add comments saying the output is
collation aware.
+ /*
+ * This shouldn't be possible, but better check to make sure we have a
+ * consistent state for the check below.
+ */
+ if ((pkcoll && !fkcoll) || (!pkcoll && fkcoll))
+ elog(ERROR, "key columns are not both collatable");
+
+ if (pkcoll && fkcoll)
cosmetic. pkcoll can change to OidIsValid(pkcoll)
fkcoll change to OidIsValid(fkcoll).
ereport(ERROR,
(errcode(ERRCODE_COLLATION_MISMATCH),
errmsg("foreign key constraint \"%s\" cannot be implemented",
fkconstraint->conname),
errdetail("Key columns \"%s\" and \"%s\" have incompatible
collations: \"%s\" and \"%s\"."
"If either collation is nondeterministic, then both collations
have to be the same.",
strVal(list_nth(fkconstraint->fk_attrs, i)),
strVal(list_nth(fkconstraint->pk_attrs, i)),
get_collation_name(fkcoll),
get_collation_name(pkcoll))));
ERROR: foreign key constraint "a_fk_col1_fkey" cannot be implemented
DETAIL: Key columns "col1" and "col1" have incompatible collations:
"default" and "case_insensitive". If either collation is
nondeterministic, then both collations have to be the same.
"DETAIL: Key columns "col1" and "col1"" may be confusing.
we can be more verbose. like
DETAIL: Key columns "col1" in relation "X" and "col1" in relation "Y"
have incompatible collations......
(just a idea, don't have much preference)
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
new_fktype == old_fktype) &&
(new_fkcoll == old_fkcoll ||
(get_collation_isdeterministic(old_fkcoll) &&
get_collation_isdeterministic(new_fkcoll))));
I am wondering if one of the new_fkcoll, old_fkcoll is zero, the other is not.
turns out that would n't happen.
before "if (old_check_ok)" we already did extensionsive check that
new_fkcoll is ok
for the job.
in ATAddForeignKeyConstraint, we can leave the old_check_ok part as is.
what do you think of the following:
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
new_fktype == old_fktype));
if (OidIsValid(new_fkcoll) && OidIsValid(old_fkcoll) &&
new_fkcoll != old_fkcoll)
{
if(!get_collation_isdeterministic(new_fkcoll))
elog(ERROR,"new_fkcoll should be deterministic");
}
On Fri, Oct 25, 2024 at 2:27 PM jian he <jian.universality@gmail.com> wrote:
/* * ri_GenerateQualCollation --- add a COLLATE spec to a WHERE clause * - * At present, we intentionally do not use this function for RI queries that - * compare a variable to a $n parameter. Since parameter symbols always have - * default collation, the effect will be to use the variable's collation. - * Now that is only strictly correct when testing the referenced column, since - * the SQL standard specifies that RI comparisons should use the referenced - * column's collation. However, so long as all collations have the same - * notion of equality (which they do, because texteq reduces to bitwise - * equality), there's no visible semantic impact from using the referencing - * column's collation when testing it, and this is a good thing to do because - * it lets us use a normal index on the referencing column. However, we do - * have to use this function when directly comparing the referencing and - * referenced columns, if they are of different collations; else the parser - * will fail to resolve the collation to use. + * We only have to use this function when directly comparing the referencing + * and referenced columns, if they are of different collations; else the + * parser will fail to resolve the collation to use. We don't need to use + * this function for RI queries that compare a variable to a $n parameter. + * Since parameter symbols always have default collation, the effect will be + * to use the variable's collation. + * + * Note that we require that the collations of the referencing and the + * referenced column have the some notion of equality: Either they have to + * both be deterministic or else they both have to be the same. */
drop table if exists pktable, fktable;
CREATE TABLE pktable (x text COLLATE "POSIX" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE "C" REFERENCES pktable on update
cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');
update pktable set x = 'a' collate "C" where x = 'A' collate "POSIX";
the cascade update fktable query string is:
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
ideally it should be
UPDATE ONLY "public"."fktable" SET "x" = $1 collate "C" WHERE $2
collate "POSIX" OPERATOR(pg_catalog.=) "x"
as we already mentioned in several places: PK-FK tie either they have to
both be deterministic or else they both have to be the same collation
oid.
so the reduction to
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
is safe.
but you look at SPI_execute_snapshot, _SPI_convert_params. then we can see
the collation metadata is not keeped.
+ We don't need to use + * this function for RI queries that compare a variable to a $n parameter. + * Since parameter symbols always have default collation, the effect will be + * to use the variable's collation.
so I think a better description is
+ We don't need to use + * this function for RI queries that compare a variable to a $n parameter. + * Since parameter symbols don't have collation information, the effect will be + * to use the variable's collation.
you can see related discovery in
/messages/by-id/CACJufxEtPBWAk7nEn69ww2LKi9w1i4dLwd5gnjD1DQ2vaYoi2g@mail.gmail.com
Here is a new patch that addresses all of your review comments. I also
added an example of a problem in the commit message.
On 25.10.24 08:27, jian he wrote:
+ * Note that we require that the collations of the referencing and the + * referenced column have the some notion of equality: Either they have to + * both be deterministic or else they both have to be the same. */ " some notion of equality" should it be "same notion of equality"? maybe we can add "see ATAddForeignKeyConstraint also" at the end of the whole comment.
both fixed
/*
* transformColumnNameList - transform list of column names
*
* Lookup each name and return its attnum and, optionally, type OID
*
* Note: the name of this function suggests that it's general-purpose,
* but actually it's only used to look up names appearing in foreign-key
* clauses. The error messages would need work to use it in other cases,
* and perhaps the validity checks as well.
*/
static int
transformColumnNameList(Oid relId, List *colList,
int16 *attnums, Oid *atttypids, Oid *attcollids)
comments need slight adjustment?
comments in transformFkeyGetPrimaryKey also need slight change?
ri_CompareWithCast, we can aslo add comments saying the output is
collation aware.
all fixed
+ /* + * This shouldn't be possible, but better check to make sure we have a + * consistent state for the check below. + */ + if ((pkcoll && !fkcoll) || (!pkcoll && fkcoll)) + elog(ERROR, "key columns are not both collatable"); + + if (pkcoll && fkcoll)cosmetic. pkcoll can change to OidIsValid(pkcoll)
fkcoll change to OidIsValid(fkcoll).
done
ERROR: foreign key constraint "a_fk_col1_fkey" cannot be implemented
DETAIL: Key columns "col1" and "col1" have incompatible collations:
"default" and "case_insensitive". If either collation is
nondeterministic, then both collations have to be the same."DETAIL: Key columns "col1" and "col1"" may be confusing.
we can be more verbose. like
DETAIL: Key columns "col1" in relation "X" and "col1" in relation "Y"
have incompatible collations......
(just a idea, don't have much preference)
Done. I also changed the error message about incompatible types in the
same way, in a separate commit.
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
new_fktype == old_fktype) &&
(new_fkcoll == old_fkcoll ||
(get_collation_isdeterministic(old_fkcoll) &&
get_collation_isdeterministic(new_fkcoll))));I am wondering if one of the new_fkcoll, old_fkcoll is zero, the other is not.
turns out that would n't happen.before "if (old_check_ok)" we already did extensionsive check that
new_fkcoll is ok
for the job.
in ATAddForeignKeyConstraint, we can leave the old_check_ok part as is.
what do you think of the following:old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
new_fktype == old_fktype));
if (OidIsValid(new_fkcoll) && OidIsValid(old_fkcoll) &&
new_fkcoll != old_fkcoll)
{
if(!get_collation_isdeterministic(new_fkcoll))
elog(ERROR,"new_fkcoll should be deterministic");
}
I don't think this is right. The "old_check_ok" business is only used
when changing an existing foreign key, to see if the check needs to be
run again. The earlier code I add already ensures that if the
collations are nondeterministic, then they have to be the same between
PK and FK. So if this is the case, the only way to change the foreign
key collation is if you have a foreign key that references the same
table and you change the primary key column types in the same ALTER
TABLE statement. Then this conditional ensures that the recheck is run
under appropriate circumstances. But we don't want to error here; the
new situation is valid, we just need to determine whether we have to run
the check again.
Attachments:
v7-0001-Fix-collation-handling-for-foreign-keys.patchtext/plain; charset=UTF-8; name=v7-0001-Fix-collation-handling-for-foreign-keys.patchDownload
From f6e7a70f5a65ea5a683039bfcd782552b50a860d Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 7 Nov 2024 12:56:36 +0100
Subject: [PATCH v7] Fix collation handling for foreign keys
Allowing foreign keys where the referenced and the referencing columns
have collations with different notions of equality is problematic.
This can only happen when using nondeterministic collations, for
example, if the referencing column is case-insensitive and the
referenced column is not, or vice versa. It does not happen if both
collations are deterministic.
To show one example:
CREATE COLLATION case_insensitive (provider = icu, deterministic = false, locale = 'und-u-ks-level2');
CREATE TABLE pktable (x text COLLATE "C" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE case_insensitive REFERENCES pktable ON UPDATE CASCADE ON DELETE CASCADE);
INSERT INTO pktable VALUES ('A'), ('a');
INSERT INTO fktable VALUES ('A');
BEGIN; DELETE FROM pktable WHERE x = 'a'; TABLE fktable; ROLLBACK;
BEGIN; DELETE FROM pktable WHERE x = 'A'; TABLE fktable; ROLLBACK;
Both of these DELETE statements delete the one row from fktable. So
this means that one row from fktable references two rows in pktable,
which should not happen. (That's why a primary key or unique
constraint is required on pktable.)
When nondeterministic collations were implemented, the SQL standard
available to yours truly said that referential integrity checks should
be performed with the collation of the referenced column, and so
that's how we implemented it. But this turned out to be a mistake in
the SQL standard, for the same reasons as above, that was later
(SQL:2016) fixed to require both collations to be the same. So that's
what we are aiming for here.
We don't have to be quite so strict. We can allow different
collations if they are both deterministic. This is also good for
backward compatibility.
So the new rule is that the collations either have to be the same or
both deterministic. Or in other words, if one of them is
nondeterministic, then both have to be the same.
Some of the nice test cases for the previous situation in
collate.icu.utf8.sql are now obsolete. They are changed to just check
the error checking of the new rule. Note that collate.sql already
contained a test for foreign keys with different deterministic
colllations.
A bunch of code in ri_triggers.c that added a COLLATE clause to
enforce the referenced column's collation can be removed, because both
columns now have to have the same notion of equality, so it doesn't
matter which one to use.
Reported-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/78d824e0-b21e-480d-a252-e4b84bc2c24b@illuminatedcomputing.com
---
doc/src/sgml/ref/create_table.sgml | 7 ++
src/backend/commands/tablecmds.c | 84 ++++++++++++---
src/backend/utils/adt/ri_triggers.c | 57 ++++------
.../regress/expected/collate.icu.utf8.out | 100 ++----------------
src/test/regress/sql/collate.icu.utf8.sql | 35 +-----
5 files changed, 105 insertions(+), 178 deletions(-)
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 83859bac76f..dd7446117bc 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1199,6 +1199,13 @@ <title>Parameters</title>
must have its final column marked <literal>WITHOUT OVERLAPS</literal>.
</para>
+ <para>
+ For each pair of referencing and referenced column, if they are of a
+ collatable data type, then the collations must either be both
+ deterministic or else both the same. This ensures that both columns
+ have a consistent notion of equality.
+ </para>
+
<para>
The user
must have <literal>REFERENCES</literal> permission on the referenced
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index eaa81424270..7248e2d9bc2 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -397,10 +397,10 @@ static ObjectAddress ATExecValidateConstraint(List **wqueue,
Relation rel, char *constrName,
bool recurse, bool recursing, LOCKMODE lockmode);
static int transformColumnNameList(Oid relId, List *colList,
- int16 *attnums, Oid *atttypids);
+ int16 *attnums, Oid *atttypids, Oid *attcollids);
static int transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
List **attnamelist,
- int16 *attnums, Oid *atttypids,
+ int16 *attnums, Oid *atttypids, Oid *attcollids,
Oid *opclasses, bool *pk_has_without_overlaps);
static Oid transformFkeyCheckAttrs(Relation pkrel,
int numattrs, int16 *attnums,
@@ -9587,6 +9587,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
int16 fkattnum[INDEX_MAX_KEYS] = {0};
Oid pktypoid[INDEX_MAX_KEYS] = {0};
Oid fktypoid[INDEX_MAX_KEYS] = {0};
+ Oid pkcolloid[INDEX_MAX_KEYS] = {0};
+ Oid fkcolloid[INDEX_MAX_KEYS] = {0};
Oid opclasses[INDEX_MAX_KEYS] = {0};
Oid pfeqoperators[INDEX_MAX_KEYS] = {0};
Oid ppeqoperators[INDEX_MAX_KEYS] = {0};
@@ -9683,11 +9685,11 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
/*
* Look up the referencing attributes to make sure they exist, and record
- * their attnums and type OIDs.
+ * their attnums and type and collation OIDs.
*/
numfks = transformColumnNameList(RelationGetRelid(rel),
fkconstraint->fk_attrs,
- fkattnum, fktypoid);
+ fkattnum, fktypoid, fkcolloid);
with_period = fkconstraint->fk_with_period || fkconstraint->pk_with_period;
if (with_period && !fkconstraint->fk_with_period)
ereport(ERROR,
@@ -9696,7 +9698,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
numfkdelsetcols = transformColumnNameList(RelationGetRelid(rel),
fkconstraint->fk_del_set_cols,
- fkdelsetcols, NULL);
+ fkdelsetcols, NULL, NULL);
validateFkOnDeleteSetColumns(numfks, fkattnum,
numfkdelsetcols, fkdelsetcols,
fkconstraint->fk_del_set_cols);
@@ -9705,13 +9707,14 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* If the attribute list for the referenced table was omitted, lookup the
* definition of the primary key and use it. Otherwise, validate the
* supplied attribute list. In either case, discover the index OID and
- * index opclasses, and the attnums and type OIDs of the attributes.
+ * index opclasses, and the attnums and type and collation OIDs of the
+ * attributes.
*/
if (fkconstraint->pk_attrs == NIL)
{
numpks = transformFkeyGetPrimaryKey(pkrel, &indexOid,
&fkconstraint->pk_attrs,
- pkattnum, pktypoid,
+ pkattnum, pktypoid, pkcolloid,
opclasses, &pk_has_without_overlaps);
/* If the primary key uses WITHOUT OVERLAPS, the fk must use PERIOD */
@@ -9724,7 +9727,7 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
{
numpks = transformColumnNameList(RelationGetRelid(pkrel),
fkconstraint->pk_attrs,
- pkattnum, pktypoid);
+ pkattnum, pktypoid, pkcolloid);
/* Since we got pk_attrs, one should be a period. */
if (with_period && !fkconstraint->pk_with_period)
@@ -9826,6 +9829,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
Oid pktype = pktypoid[i];
Oid fktype = fktypoid[i];
Oid fktyped;
+ Oid pkcoll = pkcolloid[i];
+ Oid fkcoll = fkcolloid[i];
HeapTuple cla_ht;
Form_pg_opclass cla_tup;
Oid amid;
@@ -9968,6 +9973,41 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
format_type_be(fktype),
format_type_be(pktype))));
+ /*
+ * This shouldn't be possible, but better check to make sure we have a
+ * consistent state for the check below.
+ */
+ if ((OidIsValid(pkcoll) && !OidIsValid(fkcoll)) || (!OidIsValid(pkcoll) && OidIsValid(fkcoll)))
+ elog(ERROR, "key columns are not both collatable");
+
+ if (OidIsValid(pkcoll) && OidIsValid(fkcoll))
+ {
+ bool pkcolldet;
+ bool fkcolldet;
+
+ pkcolldet = get_collation_isdeterministic(pkcoll);
+ fkcolldet = get_collation_isdeterministic(fkcoll);
+
+ /*
+ * SQL requires that both collations are the same. This is
+ * because we need a consistent notion of equality on both
+ * columns. We relax this by allowing different collations if
+ * they are both deterministic. (This is also for backward
+ * compatibility, because PostgreSQL has always allowed this.)
+ */
+ if ((!pkcolldet || !fkcolldet) && pkcoll != fkcoll)
+ ereport(ERROR,
+ (errcode(ERRCODE_COLLATION_MISMATCH),
+ errmsg("foreign key constraint \"%s\" cannot be implemented", fkconstraint->conname),
+ errdetail("Key columns \"%s\" of the referencing table and \"%s\" of the referenced table "
+ "have incompatible collations: \"%s\" and \"%s\". "
+ "If either collation is nondeterministic, then both collations have to be the same.",
+ strVal(list_nth(fkconstraint->fk_attrs, i)),
+ strVal(list_nth(fkconstraint->pk_attrs, i)),
+ get_collation_name(fkcoll),
+ get_collation_name(pkcoll))));
+ }
+
if (old_check_ok)
{
/*
@@ -9988,6 +10028,8 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
CoercionPathType new_pathtype;
Oid old_castfunc;
Oid new_castfunc;
+ Oid old_fkcoll;
+ Oid new_fkcoll;
Form_pg_attribute attr = TupleDescAttr(tab->oldDesc,
fkattnum[i] - 1);
@@ -10003,6 +10045,9 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
new_pathtype = findFkeyCast(pfeqop_right, new_fktype,
&new_castfunc);
+ old_fkcoll = attr->attcollation;
+ new_fkcoll = fkcoll;
+
/*
* Upon a change to the cast from the FK column to its pfeqop
* operand, revalidate the constraint. For this evaluation, a
@@ -10026,9 +10071,10 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
* turn conform to the domain. Consequently, we need not treat
* domains specially here.
*
- * Since we require that all collations share the same notion of
- * equality (which they do, because texteq reduces to bitwise
- * equality), we don't compare collation here.
+ * If the collation changes, revalidation is required, unless both
+ * collations are deterministic, because those share the same
+ * notion of equality (because texteq reduces to bitwise
+ * equality).
*
* We need not directly consider the PK type. It's necessarily
* binary coercible to the opcintype of the unique index column,
@@ -10038,7 +10084,9 @@ ATAddForeignKeyConstraint(List **wqueue, AlteredTableInfo *tab, Relation rel,
old_check_ok = (new_pathtype == old_pathtype &&
new_castfunc == old_castfunc &&
(!IsPolymorphicType(pfeqop_right) ||
- new_fktype == old_fktype));
+ new_fktype == old_fktype) &&
+ (new_fkcoll == old_fkcoll ||
+ (get_collation_isdeterministic(old_fkcoll) && get_collation_isdeterministic(new_fkcoll))));
}
pfeqoperators[i] = pfeqop;
@@ -11974,7 +12022,8 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
/*
* transformColumnNameList - transform list of column names
*
- * Lookup each name and return its attnum and, optionally, type OID
+ * Lookup each name and return its attnum and, optionally, type and collation
+ * OIDs
*
* Note: the name of this function suggests that it's general-purpose,
* but actually it's only used to look up names appearing in foreign-key
@@ -11983,7 +12032,7 @@ ATExecValidateConstraint(List **wqueue, Relation rel, char *constrName,
*/
static int
transformColumnNameList(Oid relId, List *colList,
- int16 *attnums, Oid *atttypids)
+ int16 *attnums, Oid *atttypids, Oid *attcollids)
{
ListCell *l;
int attnum;
@@ -12014,6 +12063,8 @@ transformColumnNameList(Oid relId, List *colList,
attnums[attnum] = attform->attnum;
if (atttypids != NULL)
atttypids[attnum] = attform->atttypid;
+ if (attcollids != NULL)
+ attcollids[attnum] = attform->attcollation;
ReleaseSysCache(atttuple);
attnum++;
}
@@ -12024,7 +12075,7 @@ transformColumnNameList(Oid relId, List *colList,
/*
* transformFkeyGetPrimaryKey -
*
- * Look up the names, attnums, and types of the primary key attributes
+ * Look up the names, attnums, types, and collations of the primary key attributes
* for the pkrel. Also return the index OID and index opclasses of the
* index supporting the primary key. Also return whether the index has
* WITHOUT OVERLAPS.
@@ -12037,7 +12088,7 @@ transformColumnNameList(Oid relId, List *colList,
static int
transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
List **attnamelist,
- int16 *attnums, Oid *atttypids,
+ int16 *attnums, Oid *atttypids, Oid *attcollids,
Oid *opclasses, bool *pk_has_without_overlaps)
{
List *indexoidlist;
@@ -12111,6 +12162,7 @@ transformFkeyGetPrimaryKey(Relation pkrel, Oid *indexOid,
attnums[i] = pkattno;
atttypids[i] = attnumTypeId(pkrel, pkattno);
+ attcollids[i] = attnumCollationId(pkrel, pkattno);
opclasses[i] = indclass->values[i];
*attnamelist = lappend(*attnamelist,
makeString(pstrdup(NameStr(*attnumAttName(pkrel, pkattno)))));
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 6896e1ae638..91792cb2a47 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -207,7 +207,7 @@ static void ri_BuildQueryKey(RI_QueryKey *key,
int32 constr_queryno);
static bool ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
const RI_ConstraintInfo *riinfo, bool rel_is_pk);
-static bool ri_CompareWithCast(Oid eq_opr, Oid typeid,
+static bool ri_CompareWithCast(Oid eq_opr, Oid typeid, Oid collid,
Datum lhs, Datum rhs);
static void ri_InitHashTables(void);
@@ -776,8 +776,6 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -786,8 +784,6 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = "AND";
queryoids[i] = pk_type;
}
@@ -881,8 +877,6 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -891,8 +885,6 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = "AND";
queryoids[i] = pk_type;
}
@@ -996,8 +988,6 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -1009,8 +999,6 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
querysep = ",";
qualsep = "AND";
queryoids[i] = pk_type;
@@ -1232,8 +1220,6 @@ ri_set(TriggerData *trigdata, bool is_set_null, int tgkind)
{
Oid pk_type = RIAttType(pk_rel, riinfo->pk_attnums[i]);
Oid fk_type = RIAttType(fk_rel, riinfo->fk_attnums[i]);
- Oid pk_coll = RIAttCollation(pk_rel, riinfo->pk_attnums[i]);
- Oid fk_coll = RIAttCollation(fk_rel, riinfo->fk_attnums[i]);
quoteOneName(attname,
RIAttName(fk_rel, riinfo->fk_attnums[i]));
@@ -1243,8 +1229,6 @@ ri_set(TriggerData *trigdata, bool is_set_null, int tgkind)
paramname, pk_type,
riinfo->pf_eq_oprs[i],
attname, fk_type);
- if (pk_coll != fk_coll && !get_collation_isdeterministic(pk_coll))
- ri_GenerateQualCollation(&querybuf, pk_coll);
qualsep = "AND";
queryoids[i] = pk_type;
}
@@ -1998,19 +1982,17 @@ ri_GenerateQual(StringInfo buf,
/*
* ri_GenerateQualCollation --- add a COLLATE spec to a WHERE clause
*
- * At present, we intentionally do not use this function for RI queries that
- * compare a variable to a $n parameter. Since parameter symbols always have
- * default collation, the effect will be to use the variable's collation.
- * Now that is only strictly correct when testing the referenced column, since
- * the SQL standard specifies that RI comparisons should use the referenced
- * column's collation. However, so long as all collations have the same
- * notion of equality (which they do, because texteq reduces to bitwise
- * equality), there's no visible semantic impact from using the referencing
- * column's collation when testing it, and this is a good thing to do because
- * it lets us use a normal index on the referencing column. However, we do
- * have to use this function when directly comparing the referencing and
- * referenced columns, if they are of different collations; else the parser
- * will fail to resolve the collation to use.
+ * We only have to use this function when directly comparing the referencing
+ * and referenced columns, if they are of different collations; else the
+ * parser will fail to resolve the collation to use. We don't need to use
+ * this function for RI queries that compare a variable to a $n parameter.
+ * Since parameter symbols always have default collation, the effect will be
+ * to use the variable's collation.
+ *
+ * Note that we require that the collations of the referencing and the
+ * referenced column have the same notion of equality: Either they have to
+ * both be deterministic or else they both have to be the same. (See also
+ * ATAddForeignKeyConstraint().)
*/
static void
ri_GenerateQualCollation(StringInfo buf, Oid collation)
@@ -2952,7 +2934,7 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
* operator. Changes that compare equal will still satisfy the
* constraint after the update.
*/
- if (!ri_CompareWithCast(eq_opr, RIAttType(rel, attnums[i]),
+ if (!ri_CompareWithCast(eq_opr, RIAttType(rel, attnums[i]), RIAttCollation(rel, attnums[i]),
newvalue, oldvalue))
return false;
}
@@ -2968,11 +2950,12 @@ ri_KeysEqual(Relation rel, TupleTableSlot *oldslot, TupleTableSlot *newslot,
* Call the appropriate comparison operator for two values.
* Normally this is equality, but for the PERIOD part of foreign keys
* it is ContainedBy, so the order of lhs vs rhs is significant.
+ * See below for how the collation is applied.
*
* NB: we have already checked that neither value is null.
*/
static bool
-ri_CompareWithCast(Oid eq_opr, Oid typeid,
+ri_CompareWithCast(Oid eq_opr, Oid typeid, Oid collid,
Datum lhs, Datum rhs)
{
RI_CompareHashEntry *entry = ri_HashCompareOp(eq_opr, typeid);
@@ -2998,9 +2981,9 @@ ri_CompareWithCast(Oid eq_opr, Oid typeid,
* on the other side of a foreign-key constraint. Therefore, the
* comparison here would need to be done with the collation of the *other*
* table. For simplicity (e.g., we might not even have the other table
- * open), we'll just use the default collation here, which could lead to
- * some false negatives. All this would break if we ever allow
- * database-wide collations to be nondeterministic.
+ * open), we'll use our own collation. This is fine because we require
+ * that both collations have the same notion of equality (either they are
+ * both deterministic or else they are both the same).
*
* With range/multirangetypes, the collation of the base type is stored as
* part of the rangetype (pg_range.rngcollation), and always used, so
@@ -3008,9 +2991,7 @@ ri_CompareWithCast(Oid eq_opr, Oid typeid,
* But if we support arbitrary types with PERIOD, we should perhaps just
* always force a re-check.
*/
- return DatumGetBool(FunctionCall2Coll(&entry->eq_opr_finfo,
- DEFAULT_COLLATION_OID,
- lhs, rhs));
+ return DatumGetBool(FunctionCall2Coll(&entry->eq_opr_finfo, collid, lhs, rhs));
}
/*
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index faa376e060c..7b478879177 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1840,101 +1840,15 @@ SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
1 | cote
(1 row)
--- foreign keys (should use collation of primary key)
--- PK is case-sensitive, FK is case-insensitive
+-- foreign keys (mixing different nondeterministic collations not allowed)
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-INSERT INTO test10fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test10pk".
-SELECT * FROM test10pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-ERROR: insert or update on table "test10fk" violates foreign key constraint "test10fk_x_fkey"
-DETAIL: Key (x)=(ABC) is not present in table "test10pk".
-SELECT * FROM test10fk;
- x
------
- abc
-(1 row)
-
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test10fk;
- x
----
-(0 rows)
-
--- PK is case-insensitive, FK is case-sensitive
+CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+ERROR: foreign key constraint "test10fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_insensitive" and "case_sensitive". If either collation is nondeterministic, then both collations have to be the same.
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-ERROR: insert or update on table "test11fk" violates foreign key constraint "test11fk_x_fkey"
-DETAIL: Key (x)=(xyz) is not present in table "test11pk".
-SELECT * FROM test11pk;
- x
------
- abc
- def
- ghi
-(3 rows)
-
-SELECT * FROM test11fk;
- x
------
- abc
- ABC
-(2 rows)
-
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
- x
------
- ABC
- ABC
-(2 rows)
-
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
- x
------
- def
- ghi
-(2 rows)
-
-SELECT * FROM test11fk;
- x
----
-(0 rows)
-
+CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_sensitive" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 80f28a97d78..31f6aff0acd 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -687,39 +687,12 @@ CREATE TABLE test4 (a int, b text);
SELECT * FROM test4 WHERE b = 'Cote' COLLATE ignore_accents; -- still case-sensitive
SELECT * FROM test4 WHERE b = 'Cote' COLLATE case_insensitive;
--- foreign keys (should use collation of primary key)
-
--- PK is case-sensitive, FK is case-insensitive
+-- foreign keys (mixing different nondeterministic collations not allowed)
CREATE TABLE test10pk (x text COLLATE case_sensitive PRIMARY KEY);
-INSERT INTO test10pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test10fk VALUES ('abc'); -- ok
-INSERT INTO test10fk VALUES ('ABC'); -- error
-INSERT INTO test10fk VALUES ('xyz'); -- error
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
--- restrict update even though the values are "equal" in the FK table
-UPDATE test10fk SET x = 'ABC' WHERE x = 'abc'; -- error
-SELECT * FROM test10fk;
-DELETE FROM test10pk WHERE x = 'abc';
-SELECT * FROM test10pk;
-SELECT * FROM test10fk;
-
--- PK is case-insensitive, FK is case-sensitive
+CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
-INSERT INTO test11pk VALUES ('abc'), ('def'), ('ghi');
-CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE);
-INSERT INTO test11fk VALUES ('abc'); -- ok
-INSERT INTO test11fk VALUES ('ABC'); -- ok
-INSERT INTO test11fk VALUES ('xyz'); -- error
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
--- cascade update even though the values are "equal" in the PK table
-UPDATE test11pk SET x = 'ABC' WHERE x = 'abc';
-SELECT * FROM test11fk;
-DELETE FROM test11pk WHERE x = 'abc';
-SELECT * FROM test11pk;
-SELECT * FROM test11fk;
+CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
base-commit: d7a2b5bd8718a6207fb364318d7f7cccdf6219c3
--
2.47.0
On 25.10.24 16:26, jian he wrote:
drop table if exists pktable, fktable;
CREATE TABLE pktable (x text COLLATE "POSIX" PRIMARY KEY);
CREATE TABLE fktable (x text COLLATE "C" REFERENCES pktable on update
cascade on delete cascade);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('A');update pktable set x = 'a' collate "C" where x = 'A' collate "POSIX";
the cascade update fktable query string is:
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
ideally it should be
UPDATE ONLY "public"."fktable" SET "x" = $1 collate "C" WHERE $2
collate "POSIX" OPERATOR(pg_catalog.=) "x"as we already mentioned in several places: PK-FK tie either they have to
both be deterministic or else they both have to be the same collation
oid.
so the reduction to
UPDATE ONLY "public"."fktable" SET "x" = $1 WHERE $2 OPERATOR(pg_catalog.=) "x"
is safe.
but you look at SPI_execute_snapshot, _SPI_convert_params. then we can see
the collation metadata is not keeped.+ We don't need to use + * this function for RI queries that compare a variable to a $n parameter. + * Since parameter symbols always have default collation, the effect will be + * to use the variable's collation.so I think a better description is
+ We don't need to use + * this function for RI queries that compare a variable to a $n parameter. + * Since parameter symbols don't have collation information, the effect will be + * to use the variable's collation.you can see related discovery in
/messages/by-id/CACJufxEtPBWAk7nEn69ww2LKi9w1i4dLwd5gnjD1DQ2vaYoi2g@mail.gmail.com
I don't know. I don't think there is anything wrong with the existing
code in this respect. Notably a parameter gets assigned its type's
default collation (see src/backend/parser/parse_param.c), so the change
of the comment as you suggest it is not correct.
Also, I don't think this is actually related to the patch discussed in
this thread.
On 25.10.24 08:23, jian he wrote:
ri_KeysEqual definitely deserves some comments.
for rel_is_pk, the equality is collation agnostic;
for rel_is_pk is false, the equality is collation aware.for example:
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update restrict on delete restrict);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';
ERROR: update or delete on table "pktable" violates foreign key
constraint "fktable_x_fkey" on table "fktable"
DETAIL: Key (x)=(A) is still referenced from table "fktable".
this should not happen?
Apparently this is intentional. It's the difference between RESTRICT
and NO ACTION. In ri_restrict(), there is a comment:
/*
* If another PK row now exists providing the old key values, we should
* not do anything. However, this check should only be made in the NO
* ACTION case; in RESTRICT cases we don't wish to allow another
row to be
* substituted.
*/
In any case, this patch does not change this behavior. It exists in old
versions as well.
On Thu, Nov 7, 2024 at 8:15 PM Peter Eisentraut <peter@eisentraut.org> wrote:
Apparently this is intentional. It's the difference between RESTRICT
and NO ACTION. In ri_restrict(), there is a comment:/*
* If another PK row now exists providing the old key values, we should
* not do anything. However, this check should only be made in the NO
* ACTION case; in RESTRICT cases we don't wish to allow another
row to be
* substituted.
*/In any case, this patch does not change this behavior. It exists in old
versions as well.
https://stackoverflow.com/questions/14921668/difference-between-restrict-and-no-action
mentioned about the difference between "no action" and "restrict".
RI_FKey_restrict_upd comments also says:
* The SQL standard intends that this referential action occur exactly when
* the update is performed, rather than after. This appears to be
* the only difference between "NO ACTION" and "RESTRICT". In Postgres
* we still implement this as an AFTER trigger, but it's non-deferrable.
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update restrict on delete restrict);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';
----------------------------------------------------------
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update no action on delete no action);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';
In the above two cases, the last queries behavior difference does not
look like "no action" vs "restrict" mentioned in the doc (create_table.sgml),
or the above stackoverflow link.
so this part, i am still confused.
-----------------------------<<>>>-----------------------------
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
INSERT INTO pktable VALUES ('A'), ('Å'), ('H');
INSERT INTO fktable VALUES ('a');
fktable foreign key variants:
collate case_insensitive REFERENCES pktable on update set default on
delete set default
collate case_insensitive REFERENCES pktable on update set null on
delete set null
collate case_insensitive REFERENCES pktable on update cascade on delete cascade
`update pktable set x = 'a' where x = 'A';`
will act as if the column "x" value has changed.
so it will do the action to fktable.
following the same logic,
maybe we should let "on update no action on delete no action"
fail for the following case:
DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update no action on delete no action);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
---expect it to fail. since column "x " value changed and is still
being referenced.
update pktable set x = 'a' where x = 'A';
-----------------------------<<>>>-----------------------------
I added a "on update cascade, on delete cascade" tests on collate.icu.utf8.sql
for both foreign key and primary key are nondeterministic collation.
Attachments:
v8-0001-add-more-tests-for-pk-fk-tie-with-nondetermini.no-cfbotapplication/octet-stream; name=v8-0001-add-more-tests-for-pk-fk-tie-with-nondetermini.no-cfbotDownload
From 61e185ba7e2444471ae733e941e3ea728d7a6bab Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Wed, 13 Nov 2024 19:38:42 +0800
Subject: [PATCH v8 1/1] add more tests for pk-fk tie with nondeterministic
collation
add tests for referential_action as on delete casade, on update cascade.
---
.../regress/expected/collate.icu.utf8.out | 38 +++++++++++++++++++
src/test/regress/sql/collate.icu.utf8.sql | 38 +++++++++++++++++++
2 files changed, 76 insertions(+)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index de17f7db6c..648a693247 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1849,6 +1849,44 @@ CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_sensitive" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
+CREATE OR REPLACE FUNCTION trigger_func() RETURNS trigger language plpgsql AS $$
+BEGIN
+ IF tg_op IN ('DELETE', 'UPDATE') THEN
+ RAISE INFO '%: %: old = %', TG_NAME, TG_WHEN, OLD;
+ END IF;
+
+ IF tg_op IN ('UPDATE', 'INSERT') THEN
+ RAISE INFO '%: %: new = %', TG_NAME, TG_WHEN, NEW;
+ END IF;
+
+ IF tg_op = 'DELETE' THEN
+ RETURN OLD;
+ ELSE
+ RETURN NEW;
+ END IF;
+END
+$$;
+-- foreign keys (same nondeterministic collations, test on update casade, on delete cascade)
+CREATE TABLE test12pk (x text collate case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (x text collate case_insensitive REFERENCES test12pk on update cascade on delete cascade);
+CREATE TRIGGER tg_fkside BEFORE UPDATE OR DELETE ON test12fk
+FOR EACH ROW
+EXECUTE PROCEDURE trigger_func();
+INSERT INTO test12pk VALUES ('A'), ('Å');
+INSERT INTO test12fk VALUES ('a'), ('A');
+update test12pk set x = 'A' where x = 'a'; --won't cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --do cascade to test12fk
+INFO: tg_fkside: BEFORE: old = (a)
+INFO: tg_fkside: BEFORE: new = (a)
+INFO: tg_fkside: BEFORE: old = (A)
+INFO: tg_fkside: BEFORE: new = (a)
+update test12pk set x = 'a' where x = 'a'; --won't cascade to test12fk
+delete from test12pk where x = 'A'; --do cascade to test12fk
+INFO: tg_fkside: BEFORE: old = (a)
+INFO: tg_fkside: BEFORE: old = (a)
+delete from test12pk where x = 'Å'; --won't cascade to test12fk
+DROP TABLE test12pk, test12fk;
+DROP FUNCTION trigger_func;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0c9491c260..ae7ede2ea7 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -694,6 +694,44 @@ CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) O
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+CREATE OR REPLACE FUNCTION trigger_func() RETURNS trigger language plpgsql AS $$
+BEGIN
+ IF tg_op IN ('DELETE', 'UPDATE') THEN
+ RAISE INFO '%: %: old = %', TG_NAME, TG_WHEN, OLD;
+ END IF;
+
+ IF tg_op IN ('UPDATE', 'INSERT') THEN
+ RAISE INFO '%: %: new = %', TG_NAME, TG_WHEN, NEW;
+ END IF;
+
+ IF tg_op = 'DELETE' THEN
+ RETURN OLD;
+ ELSE
+ RETURN NEW;
+ END IF;
+END
+$$;
+
+-- foreign keys (same nondeterministic collations, test on update casade, on delete cascade)
+CREATE TABLE test12pk (x text collate case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (x text collate case_insensitive REFERENCES test12pk on update cascade on delete cascade);
+
+CREATE TRIGGER tg_fkside BEFORE UPDATE OR DELETE ON test12fk
+FOR EACH ROW
+EXECUTE PROCEDURE trigger_func();
+
+INSERT INTO test12pk VALUES ('A'), ('Å');
+INSERT INTO test12fk VALUES ('a'), ('A');
+
+update test12pk set x = 'A' where x = 'a'; --won't cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --do cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --won't cascade to test12fk
+
+delete from test12pk where x = 'A'; --do cascade to test12fk
+delete from test12pk where x = 'Å'; --won't cascade to test12fk
+DROP TABLE test12pk, test12fk;
+DROP FUNCTION trigger_func;
+
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
--
2.34.1
On Wed, Nov 13, 2024 at 8:56 PM jian he <jian.universality@gmail.com> wrote:
https://stackoverflow.com/questions/14921668/difference-between-restrict-and-no-action
mentioned about the difference between "no action" and "restrict".RI_FKey_restrict_upd comments also says:
* The SQL standard intends that this referential action occur exactly when
* the update is performed, rather than after. This appears to be
* the only difference between "NO ACTION" and "RESTRICT". In Postgres
* we still implement this as an AFTER trigger, but it's non-deferrable.DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update restrict on delete restrict);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';
my previous email was too verbose.
my point is this: given PK, FK both are case_insensitive. and
PK side values are all being referenced.
case like `update pktable set x = 'a' where x = 'A';`
Do we consider PK side value changed?
it seems not mentioned anywhere.
I added a "on update cascade, on delete cascade" tests on collate.icu.utf8.sql
for both foreign key and primary key are nondeterministic collation.
I found out that we don't have tests for
ALTER TABLE ADD FOREIGN KEY
ALTER TABLE ALTER COLUMN SET DATA TYPE.
so I added these. Please ignore previous email attachments.
minor issue on commit message:
"colllations", typo?
Attachments:
v8-0001-add-more-tests-for-pk-fk-tie-with-nondetermin.no-cfbot1application/octet-stream; name=v8-0001-add-more-tests-for-pk-fk-tie-with-nondetermin.no-cfbot1Download
From 0b70579b695c878addf4d99bd93051b3667fce1a Mon Sep 17 00:00:00 2001
From: jian he <jian.universality@gmail.com>
Date: Thu, 14 Nov 2024 10:02:30 +0800
Subject: [PATCH v8 1/1] add more tests for pk-fk tie with nondeterministic
collation
add tests for referential_action as on delete casade, on update cascade.
add tests: alter table add foreign key
add tests: alter table alter column set data type
---
.../regress/expected/collate.icu.utf8.out | 50 +++++++++++++++++++
src/test/regress/sql/collate.icu.utf8.sql | 46 +++++++++++++++++
2 files changed, 96 insertions(+)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index de17f7db6c..2d79bbdf42 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1849,6 +1849,56 @@ CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_sensitive" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
+CREATE TABLE test11fail (x text);
+ALTER TABLE test11fail add foreign key (x) REFERENCES test11pk ON UPDATE CASCADE ON DELETE CASCADE; --error
+ERROR: foreign key constraint "test11fail_x_fkey" cannot be implemented
+DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "default" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
+CREATE TABLE test11fk (x text COLLATE case_insensitive );
+ALTER TABLE test11fk add foreign key (x) REFERENCES test11pk ON UPDATE CASCADE ON DELETE CASCADE; --ok
+ALTER TABLE test11pk alter column x set data type text; --error
+ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_insensitive" and "default". If either collation is nondeterministic, then both collations have to be the same.
+ALTER TABLE test11fk alter column x set data type text; --error
+ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
+DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "default" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
+CREATE OR REPLACE FUNCTION trigger_func() RETURNS trigger language plpgsql AS $$
+BEGIN
+ IF tg_op IN ('DELETE', 'UPDATE') THEN
+ RAISE INFO '%: %: old = %', TG_NAME, TG_WHEN, OLD;
+ END IF;
+
+ IF tg_op IN ('UPDATE', 'INSERT') THEN
+ RAISE INFO '%: %: new = %', TG_NAME, TG_WHEN, NEW;
+ END IF;
+
+ IF tg_op = 'DELETE' THEN
+ RETURN OLD;
+ ELSE
+ RETURN NEW;
+ END IF;
+END
+$$;
+-- foreign keys (same nondeterministic collations, test on update casade, on delete cascade)
+CREATE TABLE test12pk (x text collate case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (x text collate case_insensitive REFERENCES test12pk on update cascade on delete cascade);
+CREATE TRIGGER tg_fkside BEFORE UPDATE OR DELETE ON test12fk
+FOR EACH ROW
+EXECUTE PROCEDURE trigger_func();
+INSERT INTO test12pk VALUES ('A'), ('Å');
+INSERT INTO test12fk VALUES ('a'), ('A');
+update test12pk set x = 'A' where x = 'a'; --won't cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --do cascade to test12fk
+INFO: tg_fkside: BEFORE: old = (a)
+INFO: tg_fkside: BEFORE: new = (a)
+INFO: tg_fkside: BEFORE: old = (A)
+INFO: tg_fkside: BEFORE: new = (a)
+update test12pk set x = 'a' where x = 'a'; --won't cascade to test12fk
+delete from test12pk where x = 'A'; --do cascade to test12fk
+INFO: tg_fkside: BEFORE: old = (a)
+INFO: tg_fkside: BEFORE: old = (a)
+delete from test12pk where x = 'Å'; --won't cascade to test12fk
+DROP TABLE test12pk, test12fk;
+DROP FUNCTION trigger_func;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0c9491c260..b8823c81ac 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -694,6 +694,52 @@ CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) O
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+CREATE TABLE test11fail (x text);
+ALTER TABLE test11fail add foreign key (x) REFERENCES test11pk ON UPDATE CASCADE ON DELETE CASCADE; --error
+
+CREATE TABLE test11fk (x text COLLATE case_insensitive );
+ALTER TABLE test11fk add foreign key (x) REFERENCES test11pk ON UPDATE CASCADE ON DELETE CASCADE; --ok
+ALTER TABLE test11pk alter column x set data type text; --error
+ALTER TABLE test11fk alter column x set data type text; --error
+
+CREATE OR REPLACE FUNCTION trigger_func() RETURNS trigger language plpgsql AS $$
+BEGIN
+ IF tg_op IN ('DELETE', 'UPDATE') THEN
+ RAISE INFO '%: %: old = %', TG_NAME, TG_WHEN, OLD;
+ END IF;
+
+ IF tg_op IN ('UPDATE', 'INSERT') THEN
+ RAISE INFO '%: %: new = %', TG_NAME, TG_WHEN, NEW;
+ END IF;
+
+ IF tg_op = 'DELETE' THEN
+ RETURN OLD;
+ ELSE
+ RETURN NEW;
+ END IF;
+END
+$$;
+
+-- foreign keys (same nondeterministic collations, test on update casade, on delete cascade)
+CREATE TABLE test12pk (x text collate case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (x text collate case_insensitive REFERENCES test12pk on update cascade on delete cascade);
+
+CREATE TRIGGER tg_fkside BEFORE UPDATE OR DELETE ON test12fk
+FOR EACH ROW
+EXECUTE PROCEDURE trigger_func();
+
+INSERT INTO test12pk VALUES ('A'), ('Å');
+INSERT INTO test12fk VALUES ('a'), ('A');
+
+update test12pk set x = 'A' where x = 'a'; --won't cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --do cascade to test12fk
+update test12pk set x = 'a' where x = 'a'; --won't cascade to test12fk
+
+delete from test12pk where x = 'A'; --do cascade to test12fk
+delete from test12pk where x = 'Å'; --won't cascade to test12fk
+DROP TABLE test12pk, test12fk;
+DROP FUNCTION trigger_func;
+
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
--
2.34.1
On 14.11.24 03:21, jian he wrote:
On Wed, Nov 13, 2024 at 8:56 PM jian he <jian.universality@gmail.com> wrote:
https://stackoverflow.com/questions/14921668/difference-between-restrict-and-no-action
mentioned about the difference between "no action" and "restrict".RI_FKey_restrict_upd comments also says:
* The SQL standard intends that this referential action occur exactly when
* the update is performed, rather than after. This appears to be
* the only difference between "NO ACTION" and "RESTRICT". In Postgres
* we still implement this as an AFTER trigger, but it's non-deferrable.DROP TABLE IF EXISTS fktable, pktable;
CREATE TABLE pktable (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE fktable (x text collate case_insensitive REFERENCES
pktable on update restrict on delete restrict);
INSERT INTO pktable VALUES ('A'), ('Å');
INSERT INTO fktable VALUES ('a');
update pktable set x = 'a' where x = 'A';my previous email was too verbose.
my point is this: given PK, FK both are case_insensitive. and
PK side values are all being referenced.case like `update pktable set x = 'a' where x = 'A';`
Do we consider PK side value changed?
it seems not mentioned anywhere.
I think the PostgreSQL documentation is selling the difference between
NO ACTION and RESTRICT a bit short. The SQL standard says this:
ON UPDATE RESTRICT: any change to a referenced column in the
referenced table is prohibited if there is a matching row.
and
If a non-null value of a referenced column RC in the referenced table
is updated to a value that is distinct from the current value of RC,
then, [...] If UR specifies RESTRICT and there exists some matching
row, then an exception con- dition is raised [...]
So this is exactly what is happening here.
You can also reproduce this with things that are not strings with
collations. You just need to find a type that has values that are
"equal" but "distinct", which is not common, but it exists, for example
0.0 and -0.0 in floats. Example:
create table parent (val float8 primary key);
insert into parent values ('0.0');
create table child (id int, val float8 references parent (val));
insert into child values (1, '0.0');
insert into child values (2, '-0.0');
update parent set val = '-0.0'; -- ok with NO ACTION
but
create table child (id int, val float8 references parent (val) on
update restrict);
insert into child values (1, '0.0');
insert into child values (2, '-0.0');
update parent set val = '-0.0'; -- error with RESTRICT
So this is a meaningful difference.
There is also a bug here in that the update in the case of NO ACTION
doesn't actually run, because it thinks the values are the same and the
update can be skipped.
I think there is room for improvement here, in the documentation, the
tests, and maybe in the code. And while these are thematically related
to this thread, they are actually separate issues.
I propose that I go ahead with committing the v7 patch (with your typo
fixes) and then we continue discussing these other issues afterwards in
a separate thread.
Given this overall picture, I'm also less and less inclined to do any
backpatching bug fixing here. The status of this feature combination in
the backbranches is just "don't push it too hard". Maybe someone has
some feasible mitigation ideas, but I haven't seen any yet.
On Thu, Nov 14, 2024 at 4:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
I propose that I go ahead with committing the v7 patch (with your typo
fixes) and then we continue discussing these other issues afterwards in
a separate thread.
looks good to me.
but in create_table.sgml
<para>
In addition, when the data in the referenced columns is changed,
certain actions are performed on the data in this table's
columns.
<para/>
I actually want to add a sentence like:
If the references columns collation is indeterministic, we use
bitwise comparison to
check if the referenced columns data is being changed.
searching the internet found out:
— ON UPDATE RESTRICT: any change to a referenced column in the
referenced table is prohibited if there
is a matching row.
— ON UPDATE NO ACTION (the default): there is no referential update
action; the referential constraint
only specifies a constraint check.
NOTE 53 — Even if constraint checking is not deferred, ON UPDATE
RESTRICT is a stricter condition than ON UPDATE NO
ACTION. ON UPDATE RESTRICT prohibits an update to a particular row if
there are any matching rows; ON UPDATE NO
ACTION does not perform its constraint check until the entire set of
rows to be updated has been processed.
looking at NOTE 53, overall i think i get it.
On 14.11.24 12:35, jian he wrote:
On Thu, Nov 14, 2024 at 4:04 PM Peter Eisentraut <peter@eisentraut.org> wrote:
I propose that I go ahead with committing the v7 patch (with your typo
fixes) and then we continue discussing these other issues afterwards in
a separate thread.looks good to me.
done
but in create_table.sgml
<para>
In addition, when the data in the referenced columns is changed,
certain actions are performed on the data in this table's
columns.
<para/>
I actually want to add a sentence like:If the references columns collation is indeterministic, we use
bitwise comparison to
check if the referenced columns data is being changed.
I think this is also part of the foreign key action behavior that is a
separate topic. We can discuss documentation improvements as part of that.
On 14.11.24 09:04, Peter Eisentraut wrote:
You can also reproduce this with things that are not strings with
collations. You just need to find a type that has values that are
"equal" but "distinct", which is not common, but it exists, for example
0.0 and -0.0 in floats. Example:create table parent (val float8 primary key);
insert into parent values ('0.0');create table child (id int, val float8 references parent (val));
insert into child values (1, '0.0');
insert into child values (2, '-0.0');update parent set val = '-0.0'; -- ok with NO ACTION
but
create table child (id int, val float8 references parent (val) on
update restrict);insert into child values (1, '0.0');
insert into child values (2, '-0.0');update parent set val = '-0.0'; -- error with RESTRICT
So this is a meaningful difference.
There is also a bug here in that the update in the case of NO ACTION
doesn't actually run, because it thinks the values are the same and the
update can be skipped.I think there is room for improvement here, in the documentation, the
tests, and maybe in the code. And while these are thematically related
to this thread, they are actually separate issues.
Back to this. First, there is no bug above. This is all working
correctly, I was just confused.
I made a few patches to clarify this:
1. We were using the wrong error code for RESTRICT. A RESTRICT
violation is not the same as a foreign-key violation. (The foreign key
might in theory still be satisfied, but RESTRICT prevents the action
anyway.) I fixed that.
2. Added some tests to illustrate all of this (similar to above). I
used case-insensitive collations, which I think is easiest to
understand, but there is nothing special about that.
3. Some documentation updates to explain some of the differences between
NO ACTION and RESTRICT better.
Attachments:
0001-Fix-error-code-for-referential-action-RESTRICT.patchtext/plain; charset=UTF-8; name=0001-Fix-error-code-for-referential-action-RESTRICT.patchDownload
From 1e6034fb09b1aab0eddf24b720ff040758db59fb Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 19 Nov 2024 17:11:50 +0100
Subject: [PATCH 1/3] Fix error code for referential action RESTRICT
According to the SQL standard, if the referential action RESTRICT is
triggered, it has its own error code. We previously didn't use that,
we just used the error code for foreign key violation. But RESTRICT
is not necessarily an actual foreign key violation. The foreign key
might still be satisfied in theory afterwards, but the RESTRICT
setting prevents the action even then. So it's a separate kind of
error condition.
---
src/backend/utils/adt/ri_triggers.c | 32 ++++++++++++---
src/test/regress/expected/foreign_key.out | 16 ++++----
.../regress/expected/without_overlaps.out | 40 +++++++++----------
3 files changed, 55 insertions(+), 33 deletions(-)
diff --git a/src/backend/utils/adt/ri_triggers.c b/src/backend/utils/adt/ri_triggers.c
index 91792cb2a47..3185f48afa6 100644
--- a/src/backend/utils/adt/ri_triggers.c
+++ b/src/backend/utils/adt/ri_triggers.c
@@ -228,6 +228,7 @@ static bool ri_PerformCheck(const RI_ConstraintInfo *riinfo,
RI_QueryKey *qkey, SPIPlanPtr qplan,
Relation fk_rel, Relation pk_rel,
TupleTableSlot *oldslot, TupleTableSlot *newslot,
+ bool is_restrict,
bool detectNewRows, int expect_OK);
static void ri_ExtractValues(Relation rel, TupleTableSlot *slot,
const RI_ConstraintInfo *riinfo, bool rel_is_pk,
@@ -235,7 +236,7 @@ static void ri_ExtractValues(Relation rel, TupleTableSlot *slot,
static void ri_ReportViolation(const RI_ConstraintInfo *riinfo,
Relation pk_rel, Relation fk_rel,
TupleTableSlot *violatorslot, TupleDesc tupdesc,
- int queryno, bool partgone) pg_attribute_noreturn();
+ int queryno, bool is_restrict, bool partgone) pg_attribute_noreturn();
/*
@@ -449,6 +450,7 @@ RI_FKey_check(TriggerData *trigdata)
ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
NULL, newslot,
+ false,
pk_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE,
SPI_OK_SELECT);
@@ -613,6 +615,7 @@ ri_Check_Pk_Match(Relation pk_rel, Relation fk_rel,
result = ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
oldslot, NULL,
+ false,
true, /* treat like update */
SPI_OK_SELECT);
@@ -800,6 +803,7 @@ ri_restrict(TriggerData *trigdata, bool is_no_action)
ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
oldslot, NULL,
+ !is_no_action,
true, /* must detect new rows */
SPI_OK_SELECT);
@@ -901,6 +905,7 @@ RI_FKey_cascade_del(PG_FUNCTION_ARGS)
ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
oldslot, NULL,
+ false,
true, /* must detect new rows */
SPI_OK_DELETE);
@@ -1017,6 +1022,7 @@ RI_FKey_cascade_upd(PG_FUNCTION_ARGS)
ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
oldslot, newslot,
+ false,
true, /* must detect new rows */
SPI_OK_UPDATE);
@@ -1244,6 +1250,7 @@ ri_set(TriggerData *trigdata, bool is_set_null, int tgkind)
ri_PerformCheck(riinfo, &qkey, qplan,
fk_rel, pk_rel,
oldslot, NULL,
+ false,
true, /* must detect new rows */
SPI_OK_UPDATE);
@@ -1690,7 +1697,7 @@ RI_Initial_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
ri_ReportViolation(&fake_riinfo,
pk_rel, fk_rel,
slot, tupdesc,
- RI_PLAN_CHECK_LOOKUPPK, false);
+ RI_PLAN_CHECK_LOOKUPPK, false, false);
ExecDropSingleTupleTableSlot(slot);
}
@@ -1906,7 +1913,7 @@ RI_PartitionRemove_Check(Trigger *trigger, Relation fk_rel, Relation pk_rel)
fake_riinfo.pk_attnums[i] = i + 1;
ri_ReportViolation(&fake_riinfo, pk_rel, fk_rel,
- slot, tupdesc, 0, true);
+ slot, tupdesc, 0, false, true);
}
if (SPI_finish() != SPI_OK_FINISH)
@@ -2387,6 +2394,7 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
RI_QueryKey *qkey, SPIPlanPtr qplan,
Relation fk_rel, Relation pk_rel,
TupleTableSlot *oldslot, TupleTableSlot *newslot,
+ bool is_restrict,
bool detectNewRows, int expect_OK)
{
Relation query_rel,
@@ -2511,7 +2519,7 @@ ri_PerformCheck(const RI_ConstraintInfo *riinfo,
pk_rel, fk_rel,
newslot ? newslot : oldslot,
NULL,
- qkey->constr_queryno, false);
+ qkey->constr_queryno, is_restrict, false);
return SPI_processed != 0;
}
@@ -2552,7 +2560,7 @@ static void
ri_ReportViolation(const RI_ConstraintInfo *riinfo,
Relation pk_rel, Relation fk_rel,
TupleTableSlot *violatorslot, TupleDesc tupdesc,
- int queryno, bool partgone)
+ int queryno, bool is_restrict, bool partgone)
{
StringInfoData key_names;
StringInfoData key_values;
@@ -2682,6 +2690,20 @@ ri_ReportViolation(const RI_ConstraintInfo *riinfo,
errdetail("Key is not present in table \"%s\".",
RelationGetRelationName(pk_rel)),
errtableconstraint(fk_rel, NameStr(riinfo->conname))));
+ else if (is_restrict)
+ ereport(ERROR,
+ (errcode(ERRCODE_RESTRICT_VIOLATION),
+ errmsg("update or delete on table \"%s\" violates RESTRICT setting of foreign key constraint \"%s\" on table \"%s\"",
+ RelationGetRelationName(pk_rel),
+ NameStr(riinfo->conname),
+ RelationGetRelationName(fk_rel)),
+ has_perm ?
+ errdetail("Key (%s)=(%s) is referenced from table \"%s\".",
+ key_names.data, key_values.data,
+ RelationGetRelationName(fk_rel)) :
+ errdetail("Key is referenced from table \"%s\".",
+ RelationGetRelationName(fk_rel)),
+ errtableconstraint(fk_rel, NameStr(riinfo->conname))));
else
ereport(ERROR,
(errcode(ERRCODE_FOREIGN_KEY_VIOLATION),
diff --git a/src/test/regress/expected/foreign_key.out b/src/test/regress/expected/foreign_key.out
index a5165270c2d..3f459f70ac1 100644
--- a/src/test/regress/expected/foreign_key.out
+++ b/src/test/regress/expected/foreign_key.out
@@ -1439,11 +1439,11 @@ insert into pp values(11);
update pp set f1=f1+1;
insert into cc values(13);
update pp set f1=f1+1; -- fail
-ERROR: update or delete on table "pp" violates foreign key constraint "cc_f1_fkey" on table "cc"
-DETAIL: Key (f1)=(13) is still referenced from table "cc".
+ERROR: update or delete on table "pp" violates RESTRICT setting of foreign key constraint "cc_f1_fkey" on table "cc"
+DETAIL: Key (f1)=(13) is referenced from table "cc".
delete from pp where f1 = 13; -- fail
-ERROR: update or delete on table "pp" violates foreign key constraint "cc_f1_fkey" on table "cc"
-DETAIL: Key (f1)=(13) is still referenced from table "cc".
+ERROR: update or delete on table "pp" violates RESTRICT setting of foreign key constraint "cc_f1_fkey" on table "cc"
+DETAIL: Key (f1)=(13) is referenced from table "cc".
drop table pp, cc;
--
-- Test interaction of foreign-key optimization with rules (bug #14219)
@@ -2664,11 +2664,11 @@ ALTER TABLE fk ADD FOREIGN KEY (a) REFERENCES pk ON UPDATE RESTRICT ON DELETE RE
CREATE TABLE fk_d PARTITION OF fk DEFAULT;
INSERT INTO fk VALUES (20), (30);
DELETE FROM pk WHERE a = 20;
-ERROR: update or delete on table "pk11" violates foreign key constraint "fk_a_fkey2" on table "fk"
-DETAIL: Key (a)=(20) is still referenced from table "fk".
+ERROR: update or delete on table "pk11" violates RESTRICT setting of foreign key constraint "fk_a_fkey2" on table "fk"
+DETAIL: Key (a)=(20) is referenced from table "fk".
UPDATE pk SET a = 90 WHERE a = 30;
-ERROR: update or delete on table "pk" violates foreign key constraint "fk_a_fkey" on table "fk"
-DETAIL: Key (a)=(30) is still referenced from table "fk".
+ERROR: update or delete on table "pk" violates RESTRICT setting of foreign key constraint "fk_a_fkey" on table "fk"
+DETAIL: Key (a)=(30) is referenced from table "fk".
SELECT tableoid::regclass, * FROM fk;
tableoid | a
----------+----
diff --git a/src/test/regress/expected/without_overlaps.out b/src/test/regress/expected/without_overlaps.out
index d6cb65e9a63..a621c2e9bbc 100644
--- a/src/test/regress/expected/without_overlaps.out
+++ b/src/test/regress/expected/without_overlaps.out
@@ -1743,8 +1743,8 @@ UPDATE temporal_rng
SET valid_at = CASE WHEN lower(valid_at) = '2018-01-01' THEN daterange('2018-01-01', '2018-01-05')
WHEN lower(valid_at) = '2018-02-01' THEN daterange('2018-01-05', '2018-03-01') END
WHERE id = '[6,7)';
-ERROR: update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-DETAIL: Key (id, valid_at)=([6,7), [2018-01-01,2018-02-01)) is still referenced from table "temporal_fk_rng2rng".
+ERROR: update or delete on table "temporal_rng" violates RESTRICT setting of foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
+DETAIL: Key (id, valid_at)=([6,7), [2018-01-01,2018-02-01)) is referenced from table "temporal_fk_rng2rng".
-- a PK update that fails because both are referenced (even before commit):
BEGIN;
ALTER TABLE temporal_fk_rng2rng
@@ -1752,14 +1752,14 @@ BEGIN;
DEFERRABLE INITIALLY DEFERRED;
UPDATE temporal_rng SET valid_at = daterange('2016-01-01', '2016-02-01')
WHERE id = '[5,6)' AND valid_at = daterange('2018-01-01', '2018-02-01');
-ERROR: update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is still referenced from table "temporal_fk_rng2rng".
+ERROR: update or delete on table "temporal_rng" violates RESTRICT setting of foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
+DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is referenced from table "temporal_fk_rng2rng".
ROLLBACK;
-- changing the scalar part fails:
UPDATE temporal_rng SET id = '[7,8)'
WHERE id = '[5,6)' AND valid_at = daterange('2018-01-01', '2018-02-01');
-ERROR: update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is still referenced from table "temporal_fk_rng2rng".
+ERROR: update or delete on table "temporal_rng" violates RESTRICT setting of foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
+DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is referenced from table "temporal_fk_rng2rng".
-- then delete the objecting FK record and the same PK update succeeds:
DELETE FROM temporal_fk_rng2rng WHERE id = '[3,4)';
UPDATE temporal_rng SET valid_at = daterange('2016-01-01', '2016-02-01')
@@ -1824,8 +1824,8 @@ BEGIN;
ALTER CONSTRAINT temporal_fk_rng2rng_fk
DEFERRABLE INITIALLY DEFERRED;
DELETE FROM temporal_rng WHERE id = '[5,6)' AND valid_at = daterange('2018-01-01', '2018-02-01');
-ERROR: update or delete on table "temporal_rng" violates foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
-DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is still referenced from table "temporal_fk_rng2rng".
+ERROR: update or delete on table "temporal_rng" violates RESTRICT setting of foreign key constraint "temporal_fk_rng2rng_fk" on table "temporal_fk_rng2rng"
+DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is referenced from table "temporal_fk_rng2rng".
ROLLBACK;
-- then delete the objecting FK record and the same PK delete succeeds:
DELETE FROM temporal_fk_rng2rng WHERE id = '[3,4)';
@@ -2227,8 +2227,8 @@ UPDATE temporal_mltrng
SET valid_at = CASE WHEN lower(valid_at) = '2018-01-01' THEN datemultirange(daterange('2018-01-01', '2018-01-05'))
WHEN lower(valid_at) = '2018-02-01' THEN datemultirange(daterange('2018-01-05', '2018-03-01')) END
WHERE id = '[6,7)';
-ERROR: update or delete on table "temporal_mltrng" violates foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
-DETAIL: Key (id, valid_at)=([6,7), {[2018-01-01,2018-02-01)}) is still referenced from table "temporal_fk_mltrng2mltrng".
+ERROR: update or delete on table "temporal_mltrng" violates RESTRICT setting of foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
+DETAIL: Key (id, valid_at)=([6,7), {[2018-01-01,2018-02-01)}) is referenced from table "temporal_fk_mltrng2mltrng".
-- a PK update that fails because both are referenced (even before commit):
BEGIN;
ALTER TABLE temporal_fk_mltrng2mltrng
@@ -2236,14 +2236,14 @@ BEGIN;
DEFERRABLE INITIALLY DEFERRED;
UPDATE temporal_mltrng SET valid_at = datemultirange(daterange('2016-01-01', '2016-02-01'))
WHERE id = '[5,6)' AND valid_at = datemultirange(daterange('2018-01-01', '2018-02-01'));
-ERROR: update or delete on table "temporal_mltrng" violates foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
-DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is still referenced from table "temporal_fk_mltrng2mltrng".
+ERROR: update or delete on table "temporal_mltrng" violates RESTRICT setting of foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
+DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is referenced from table "temporal_fk_mltrng2mltrng".
ROLLBACK;
-- changing the scalar part fails:
UPDATE temporal_mltrng SET id = '[7,8)'
WHERE id = '[5,6)' AND valid_at = datemultirange(daterange('2018-01-01', '2018-02-01'));
-ERROR: update or delete on table "temporal_mltrng" violates foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
-DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is still referenced from table "temporal_fk_mltrng2mltrng".
+ERROR: update or delete on table "temporal_mltrng" violates RESTRICT setting of foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
+DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is referenced from table "temporal_fk_mltrng2mltrng".
--
-- test FK referenced deletes NO ACTION
--
@@ -2301,8 +2301,8 @@ BEGIN;
ALTER CONSTRAINT temporal_fk_mltrng2mltrng_fk
DEFERRABLE INITIALLY DEFERRED;
DELETE FROM temporal_mltrng WHERE id = '[5,6)' AND valid_at = datemultirange(daterange('2018-01-01', '2018-02-01'));
-ERROR: update or delete on table "temporal_mltrng" violates foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
-DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is still referenced from table "temporal_fk_mltrng2mltrng".
+ERROR: update or delete on table "temporal_mltrng" violates RESTRICT setting of foreign key constraint "temporal_fk_mltrng2mltrng_fk" on table "temporal_fk_mltrng2mltrng"
+DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is referenced from table "temporal_fk_mltrng2mltrng".
ROLLBACK;
--
-- FK between partitioned tables: ranges
@@ -2416,8 +2416,8 @@ INSERT INTO temporal_partitioned_fk_rng2rng (id, valid_at, parent_id) VALUES ('[
DELETE FROM temporal_partitioned_rng WHERE id = '[5,6)' AND valid_at = daterange('2018-02-01', '2018-03-01');
-- should fail:
DELETE FROM temporal_partitioned_rng WHERE id = '[5,6)' AND valid_at = daterange('2018-01-01', '2018-02-01');
-ERROR: update or delete on table "tp1" violates foreign key constraint "temporal_partitioned_fk_rng2rng_parent_id_valid_at_fkey" on table "temporal_partitioned_fk_rng2rng"
-DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is still referenced from table "temporal_partitioned_fk_rng2rng".
+ERROR: update or delete on table "tp1" violates RESTRICT setting of foreign key constraint "temporal_partitioned_fk_rng2rng_parent_id_valid_at_fkey" on table "temporal_partitioned_fk_rng2rng"
+DETAIL: Key (id, valid_at)=([5,6), [2018-01-01,2018-02-01)) is referenced from table "temporal_partitioned_fk_rng2rng".
--
-- partitioned FK referenced updates CASCADE
--
@@ -2572,8 +2572,8 @@ INSERT INTO temporal_partitioned_fk_mltrng2mltrng (id, valid_at, parent_id) VALU
DELETE FROM temporal_partitioned_mltrng WHERE id = '[5,6)' AND valid_at = datemultirange(daterange('2018-02-01', '2018-03-01'));
-- should fail:
DELETE FROM temporal_partitioned_mltrng WHERE id = '[5,6)' AND valid_at = datemultirange(daterange('2018-01-01', '2018-02-01'));
-ERROR: update or delete on table "tp1" violates foreign key constraint "temporal_partitioned_fk_mltrng2mltrng_parent_id_valid_at_fkey1" on table "temporal_partitioned_fk_mltrng2mltrng"
-DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is still referenced from table "temporal_partitioned_fk_mltrng2mltrng".
+ERROR: update or delete on table "tp1" violates RESTRICT setting of foreign key constraint "temporal_partitioned_fk_mltrng2mltrng_parent_id_valid_at_fkey1" on table "temporal_partitioned_fk_mltrng2mltrng"
+DETAIL: Key (id, valid_at)=([5,6), {[2018-01-01,2018-02-01)}) is referenced from table "temporal_partitioned_fk_mltrng2mltrng".
--
-- partitioned FK referenced updates CASCADE
--
base-commit: c1c09007e219ae68d1f8428a54baf68ccc1f8683
--
2.47.0
0002-Add-tests-for-foreign-keys-with-case-insensitive-col.patchtext/plain; charset=UTF-8; name=0002-Add-tests-for-foreign-keys-with-case-insensitive-col.patchDownload
From 9156d35ede18d63264a9629ebeec409d635d8178 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 19 Nov 2024 17:15:28 +0100
Subject: [PATCH 2/3] Add tests for foreign keys with case-insensitive
collations
Some of the behaviors of the different referential actions, such as
the difference between NO ACTION and RESTRICT are best illustrated
using a case-insensitive collation. So add some tests for that.
(What is actually being tested here is the behavior with values that
are "distinct" (binary different) but compare as equal. Another way
to do that would be with positive and negative zeroes with float
types. But this way seems nicer and more flexible.)
---
.../regress/expected/collate.icu.utf8.out | 62 +++++++++++++++++++
src/test/regress/sql/collate.icu.utf8.sql | 30 +++++++++
2 files changed, 92 insertions(+)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index de17f7db6ce..fa08a454944 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1849,6 +1849,68 @@ CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
ERROR: foreign key constraint "test11fk_x_fkey" cannot be implemented
DETAIL: Key columns "x" of the referencing table and "x" of the referenced table have incompatible collations: "case_sensitive" and "case_insensitive". If either collation is nondeterministic, then both collations have to be the same.
+-- foreign key actions
+-- Some of the behaviors are most easily visible with a
+-- case-insensitive collation.
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE NO ACTION);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- ok
+SELECT * FROM test12pk;
+ x
+-----
+ ABC
+(1 row)
+
+SELECT * FROM test12fk; -- no updates here
+ a | b
+---+-----
+ 1 | abc
+ 2 | ABC
+(2 rows)
+
+DROP TABLE test12pk, test12fk;
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE RESTRICT);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- restrict violation
+ERROR: update or delete on table "test12pk" violates RESTRICT setting of foreign key constraint "test12fk_b_fkey" on table "test12fk"
+DETAIL: Key (x)=(abc) is referenced from table "test12fk".
+SELECT * FROM test12pk;
+ x
+-----
+ abc
+(1 row)
+
+SELECT * FROM test12fk;
+ a | b
+---+-----
+ 1 | abc
+ 2 | ABC
+(2 rows)
+
+DROP TABLE test12pk, test12fk;
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE CASCADE);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- ok
+SELECT * FROM test12pk;
+ x
+-----
+ ABC
+(1 row)
+
+SELECT * FROM test12fk; -- was updated
+ a | b
+---+-----
+ 1 | ABC
+ 2 | ABC
+(2 rows)
+
+DROP TABLE test12pk, test12fk;
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0c9491c260e..512b57bddc9 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -694,6 +694,36 @@ CREATE TABLE test10fk (x text COLLATE case_insensitive REFERENCES test10pk (x) O
CREATE TABLE test11pk (x text COLLATE case_insensitive PRIMARY KEY);
CREATE TABLE test11fk (x text COLLATE case_sensitive REFERENCES test11pk (x) ON UPDATE CASCADE ON DELETE CASCADE); -- error
+-- foreign key actions
+-- Some of the behaviors are most easily visible with a
+-- case-insensitive collation.
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE NO ACTION);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- ok
+SELECT * FROM test12pk;
+SELECT * FROM test12fk; -- no updates here
+DROP TABLE test12pk, test12fk;
+
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE RESTRICT);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- restrict violation
+SELECT * FROM test12pk;
+SELECT * FROM test12fk;
+DROP TABLE test12pk, test12fk;
+
+CREATE TABLE test12pk (x text COLLATE case_insensitive PRIMARY KEY);
+CREATE TABLE test12fk (a int, b text COLLATE case_insensitive REFERENCES test12pk (x) ON UPDATE CASCADE);
+INSERT INTO test12pk VALUES ('abc');
+INSERT INTO test12fk VALUES (1, 'abc'), (2, 'ABC');
+UPDATE test12pk SET x = 'ABC' WHERE x = 'abc'; -- ok
+SELECT * FROM test12pk;
+SELECT * FROM test12fk; -- was updated
+DROP TABLE test12pk, test12fk;
+
-- partitioning
CREATE TABLE test20 (a int, b text COLLATE case_insensitive) PARTITION BY LIST (b);
CREATE TABLE test20_1 PARTITION OF test20 FOR VALUES IN ('abc');
--
2.47.0
0003-doc-Improve-description-of-referential-actions.patchtext/plain; charset=UTF-8; name=0003-doc-Improve-description-of-referential-actions.patchDownload
From 64476388bb806352b96b64d13df5718af6ac8415 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 19 Nov 2024 17:18:39 +0100
Subject: [PATCH 3/3] doc: Improve description of referential actions
Some of the differences between NO ACTION and RESTRICT were not
explained fully.
---
doc/src/sgml/ddl.sgml | 41 ++++++++++++++++++++++++------
doc/src/sgml/ref/create_table.sgml | 20 ++++++++-------
2 files changed, 44 insertions(+), 17 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3c56610d2ac..203d6ce0c30 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1233,16 +1233,32 @@ <title>Foreign Keys</title>
</para>
<para>
- Restricting and cascading deletes are the two most common options.
- <literal>RESTRICT</literal> prevents deletion of a
- referenced row. <literal>NO ACTION</literal> means that if any
- referencing rows still exist when the constraint is checked, an error
- is raised; this is the default behavior if you do not specify anything.
- (The essential difference between these two choices is that
- <literal>NO ACTION</literal> allows the check to be deferred until
- later in the transaction, whereas <literal>RESTRICT</literal> does not.)
+ The default <literal>ON DELETE</literal> action is <literal>ON DELETE NO
+ ACTION</literal>; this does not need to be specified. This means that the
+ deletion in the referenced table is allowed to proceed. But the
+ foreign-key constraint is still required to be satisfied, so this
+ operation will usually result in an error. But checking of foreign-key
+ constraints can also be deferred to later in the transaction (not covered
+ in this chapter). In that case, the <literal>NO ACTION</literal> setting
+ would allow other commands to <quote>fix</quote> the situation before the
+ constraint is checked, for example by inserting another suitable row into
+ the referenced table or by deleting the now-dangling rows from the
+ referencing table.
+ </para>
+
+ <para>
+ <literal>RESTRICT</literal> is a stricter setting than <literal>NO
+ ACTION</literal>. It prevents deletion of a referenced row.
+ <literal>RESTRICT</literal> does not allow the check to be deferred until
+ later in the transaction.
+ </para>
+
+ <para>
<literal>CASCADE</literal> specifies that when a referenced row is deleted,
row(s) referencing it should be automatically deleted as well.
+ </para>
+
+ <para>
There are two other options:
<literal>SET NULL</literal> and <literal>SET DEFAULT</literal>.
These cause the referencing column(s) in the referencing row(s)
@@ -1312,6 +1328,15 @@ <title>Foreign Keys</title>
NULL</literal> and <literal>SET DEFAULT</literal>.
In this case, <literal>CASCADE</literal> means that the updated values of the
referenced column(s) should be copied into the referencing row(s).
+ There is also a noticeable difference between <literal>ON UPDATE NO
+ ACTION</literal> (the default) and <literal>NO UPDATE RESTRICT</literal>.
+ The former will allow the update to proceed and the foreign-key constraint
+ will be checked against the state after the update. The latter will
+ prevent the update to run even if the state after the update would still
+ satisfy the constraint. This prevents updating a referenced row to a
+ value that is distinct but compares as equal (for example, a character
+ string with a different case variant, if a character string type with a
+ case-insensitive collation is used).
</para>
<para>
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index 1a1adc5ae87..70fa929caa4 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1248,17 +1248,16 @@ <title>Parameters</title>
clause specifies the action to perform when a referenced column
in the referenced table is being updated to a new value. If the
row is updated, but the referenced column is not actually
- changed, no action is done. Referential actions other than the
- <literal>NO ACTION</literal> check cannot be deferred, even if
- the constraint is declared deferrable. There are the following possible
- actions for each clause:
+ changed, no action is done. Referential actions are executed as part of
+ the data changing command, even if the constraint is deferred. There
+ are the following possible actions for each clause:
<variablelist>
<varlistentry id="sql-createtable-parms-references-refact-no-action">
<term><literal>NO ACTION</literal></term>
<listitem>
<para>
- Produce an error indicating that the deletion or update
+ Produce an error if the deletion or update
would create a foreign key constraint violation.
If the constraint is deferred, this
error will be produced at constraint check time if there still
@@ -1271,10 +1270,13 @@ <title>Parameters</title>
<term><literal>RESTRICT</literal></term>
<listitem>
<para>
- Produce an error indicating that the deletion or update
- would create a foreign key constraint violation.
- This is the same as <literal>NO ACTION</literal> except that
- the check is not deferrable.
+ Produce an error if a row to be deleted or updated matches a row in
+ the referencing table. This prevents the action even if the state
+ after the action would not violate the foreign key constraint. In
+ particular, it prevents updates of referenced rows to values that
+ are distinct but compare as equal. (But it does not prevent
+ <quote>no-op</quote> updates that update a column to the same
+ value.)
</para>
</listitem>
</varlistentry>
--
2.47.0
Em ter., 19 de nov. de 2024 às 13:27, Peter Eisentraut <peter@eisentraut.org>
escreveu:
3. Some documentation updates to explain some of the differences between
NO ACTION and RESTRICT better.
There is a typo on your commit "doc: Improve description of referential
actions"
There is also a noticeable difference between <literal>ON UPDATE NO
+ ACTION</literal> (the default) and <literal>NO UPDATE
RESTRICT</literal>.
Should be ON UPDATE RESTRICT, right ?
On 29.11.24 15:25, Marcos Pegoraro wrote:
Em ter., 19 de nov. de 2024 às 13:27, Peter Eisentraut
<peter@eisentraut.org <mailto:peter@eisentraut.org>> escreveu:3. Some documentation updates to explain some of the differences
between
NO ACTION and RESTRICT better.There is a typo on your commit "doc: Improve description of referential
actions"There is also a noticeable difference between <literal>ON UPDATE NO
+ ACTION</literal> (the default) and <literal>NO UPDATE RESTRICT</
literal>.Should be ON UPDATE RESTRICT, right ?
Fixed, thanks!
On 19.11.24 17:27, Peter Eisentraut wrote:
On 14.11.24 09:04, Peter Eisentraut wrote:
You can also reproduce this with things that are not strings with
collations. You just need to find a type that has values that are
"equal" but "distinct", which is not common, but it exists, for
example 0.0 and -0.0 in floats. Example:create table parent (val float8 primary key);
insert into parent values ('0.0');create table child (id int, val float8 references parent (val));
insert into child values (1, '0.0');
insert into child values (2, '-0.0');update parent set val = '-0.0'; -- ok with NO ACTION
but
create table child (id int, val float8 references parent (val) on
update restrict);insert into child values (1, '0.0');
insert into child values (2, '-0.0');update parent set val = '-0.0'; -- error with RESTRICT
So this is a meaningful difference.
There is also a bug here in that the update in the case of NO ACTION
doesn't actually run, because it thinks the values are the same and
the update can be skipped.I think there is room for improvement here, in the documentation, the
tests, and maybe in the code. And while these are thematically
related to this thread, they are actually separate issues.Back to this. First, there is no bug above. This is all working
correctly, I was just confused.I made a few patches to clarify this:
1. We were using the wrong error code for RESTRICT. A RESTRICT
violation is not the same as a foreign-key violation. (The foreign key
might in theory still be satisfied, but RESTRICT prevents the action
anyway.) I fixed that.2. Added some tests to illustrate all of this (similar to above). I
used case-insensitive collations, which I think is easiest to
understand, but there is nothing special about that.3. Some documentation updates to explain some of the differences between
NO ACTION and RESTRICT better.
I have committed these patches. I think that concludes this thread.