Order of enforcement of CHECK constraints?
My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.
There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.
In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.
Do people think this is worth fixing?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.
What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?
--
Peter Geoghegan
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.Do people think this is worth fixing?
Yes... I had thought they were sorted and enforced in alphabetical
order similar to how triggers are fired. Having non-deterministic
check constraint firing seems bad to me.
Thanks,
Stephen
On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?
I think they should be executed in alphabetical order like triggers.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com> writes:
On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.
What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?
I think they should be executed in alphabetical order like triggers.
Yeah. We already have a comparable, and documented, behavior for
triggers, so if we're going to do anything about this I'd vote for
sorting by name (or more specifically, by strcmp()).
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 21/03/15 08:15, Tom Lane wrote:
My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.Do people think this is worth fixing?
regards, tom lane
I think that this is a good idea, I would have implicitly assumed that
it was deterministic.
Additionally, people could then name CHECK constraints in a way to get
whatever order they wanted.
The documentation of CREATE TRIGGER says (reading Fabrizio's post
inspired me to look it up):
"If multiple triggers of the same kind are defined for the same event,
they will be fired in alphabetical order by name."
So I would have implicitly assumed the same for CHECK constraints, had I
recently read that.
So I think the current situation is a violation of POLA.
Cheers,
Gavin
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 3/20/15 3:37 PM, Tom Lane wrote:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com> writes:
On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?I think they should be executed in alphabetical order like triggers.
Yeah. We already have a comparable, and documented, behavior for
triggers, so if we're going to do anything about this I'd vote for
sorting by name (or more specifically, by strcmp()).regards, tom lane
+1 for strcmp() ordering. Not only is this logical and consistent with
the way triggers are fired, names can be manipulated by the user to
force order when desired. Not sure if that's as important for check
constraints as it is for triggers but it might be useful, even if only
for things like unit tests.
--
- David Steele
david@pgmasters.net
On Sunday, March 22, 2015, David Steele <david@pgmasters.net> wrote:
On 3/20/15 3:37 PM, Tom Lane wrote:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com
<javascript:;>> writes:
On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com
<javascript:;>> wrote:
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us
<javascript:;>> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?I think they should be executed in alphabetical order like triggers.
Yeah. We already have a comparable, and documented, behavior for
triggers, so if we're going to do anything about this I'd vote for
sorting by name (or more specifically, by strcmp()).regards, tom lane
+1 for strcmp() ordering. Not only is this logical and consistent with
the way triggers are fired, names can be manipulated by the user to
force order when desired. Not sure if that's as important for check
constraints as it is for triggers but it might be useful, even if only
for things like unit tests.--
- David Steele
david@pgmasters.net <javascript:;>
It would be nice to know that, at scale, the added comparisons are
negligible since it almost all cases all checks are run and pass and the
order is irrelevant. Would it be possible for all check constraints to
always be run and the resultant error outputs stored in an array which
could then be sorted before it is printed? You get better and stable
output for errors while minimally impacting good inserts.
David J.
I might be only one objecting here but ...
On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.
Why is it important to report in deterministic manner? If it really
matters, we should probably report all the failing constraints. A
comparable example would be compiler throwing errors.
There's at least one regression test case where this can happen, so we've
been lucky so far that this hasn't caused buildfarm noise.We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.
In principle the same problem could occur for domain CHECK constraints,
though the odds of multiple CHECKs failing are probably a lot lower.Do people think this is worth fixing?
Downthread, parallels are being drawn with triggers. The order in which
triggers are fired matters since the triggers can affect each other's
results but constraint don't do that. Do they? So, why should we waste some
cycles in sorting the constraints?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
I might be only one objecting here but ...
On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:My Salesforce colleagues noticed some tests flapping as a result of table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up the constraints
in whatever order it happens to find them in pg_constraint.
Why is it important to report in deterministic manner?
If nothing else, so as not to have regression-test failures.
If it really
matters, we should probably report all the failing constraints.
That wouldn't in itself make the output deterministic (you'd still have to
sort); and in any case that's not going to happen because it would require
running each CHECK constraint in its own subtransaction. Catching errors
that way is *expensive*. And there's been zero field demand for such a
behavior, so I don't see us adding cycles for something no one's asked
for. Sorting the check constraints during relcache load, on the other
hand, is a negligible burden compared to the cost of reading
pg_constraint.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com> writes:
On Fri, Mar 20, 2015 at 4:19 PM, Peter Geoghegan <pg@heroku.com> wrote:
On Fri, Mar 20, 2015 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.What not by OID, as with indexes? Are you suggesting that this would
become documented behavior?I think they should be executed in alphabetical order like triggers.
Yeah. We already have a comparable, and documented, behavior for
triggers, so if we're going to do anything about this I'd vote for
sorting by name (or more specifically, by strcmp()).
Isn't better do this to read pg_constraint in name order?
- conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true,
+ conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com> writes:
On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.
Isn't better do this to read pg_constraint in name order?
- conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true, + conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,
Surely not. That would end up having to read *all* of pg_constraint, not
only the rows applicable to the current relation.
We could get the index to do the work for us if we changed it from an
index on conrelid to one on conrelid, conname. However, seeing that that
would bloat the index by a factor of sixteen, it hardly sounds like a
free fix either.
I really think that a quick application of qsort is the best-performing
way to do this.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 23, 2015 at 7:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> writes:
I might be only one objecting here but ...
On Sat, Mar 21, 2015 at 12:45 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:My Salesforce colleagues noticed some tests flapping as a result of
table
CHECK constraints not always being enforced in the same order; ie, if a
tuple insertion/update violates more than one CHECK constraint, it's not
deterministic which one is reported. This is evidently because
relcache.c's CheckConstraintFetch() just happily loads up theconstraints
in whatever order it happens to find them in pg_constraint.
Why is it important to report in deterministic manner?
If nothing else, so as not to have regression-test failures.
May be then we are writing tests which are not doing the intended thing.
Unless the test is explicitly testing the behaviour in case of more than
one failing constraint, it looks like the test is not doing the right
thing. If it's testing that case explicitly, by imposing an order, we are
still testing a single constraint failure case.
In case of million inserts or bulk load with constraints on, these few
cycles spent in ordering the constraints might be problematic, unless the
ordering happens only once for a series of inserts.
If it really
matters, we should probably report all the failing constraints.That wouldn't in itself make the output deterministic (you'd still have to
sort); and in any case that's not going to happen because it would require
running each CHECK constraint in its own subtransaction. Catching errors
that way is *expensive*. And there's been zero field demand for such a
behavior, so I don't see us adding cycles for something no one's asked
for. Sorting the check constraints during relcache load, on the other
hand, is a negligible burden compared to the cost of reading
pg_constraint.regards, tom lane
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
On 24 March 2015 at 17:51, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
wrote:
In case of million inserts or bulk load with constraints on, these few
cycles spent in ordering the constraints might be problematic, unless the
ordering happens only once for a series of inserts.
As far as I can see this sort only occurs when a new relation is loaded
into the relcache. Relations are cached in the relcache when they're
accessed for the first time in each backend, and the cache will be
invalidated when the current backend or another backend perform DDL on that
relation.
Without relcache we'd need to lookup the system catalogs for just about
every SQL statement... This would be very slow indeed!
Regards
David Rowley
On Mon, Mar 23, 2015 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com> writes:
On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.Isn't better do this to read pg_constraint in name order?
- conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true, + conscan = systable_beginscan(conrel, ConstraintNameNspIndexId, true,Surely not. That would end up having to read *all* of pg_constraint, not
only the rows applicable to the current relation.
Yeah... you're correct... we need the oid in the index.
We could get the index to do the work for us if we changed it from an
index on conrelid to one on conrelid, conname. However, seeing that that
would bloat the index by a factor of sixteen, it hardly sounds like a
free fix either.
But in this way we can save some cicles as Ashutosh complains... or am I
missing something?
I really think that a quick application of qsort is the best-performing
way to do this.
Something like the attached?
With current master:
fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "bb"
DETAIL: Failing row contains (0, 0).
With the attached patch:
fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "aa"
DETAIL: Failing row contains (0, 0).
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Attachments:
order-of-check-constraints-v1.patchtext/x-diff; charset=US-ASCII; name=order-of-check-constraints-v1.patchDownload
diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 1db4ba84..d072031 100644
--- a/src/backend/utils/cache/relcache.c
+++ b/src/backend/utils/cache/relcache.c
@@ -282,6 +282,7 @@ static OpClassCacheEnt *LookupOpclassInfo(Oid operatorClassOid,
StrategyNumber numSupport);
static void RelationCacheInitFileRemoveInDir(const char *tblspcpath);
static void unlink_initfile(const char *initfilename);
+static int constrcheck_cmp(const void *p1, const void *p2);
/*
@@ -3734,6 +3735,20 @@ CheckConstraintFetch(Relation relation)
if (found != ncheck)
elog(ERROR, "%d constraint record(s) missing for rel %s",
ncheck - found, RelationGetRelationName(relation));
+
+ qsort((void *) check, ncheck, sizeof(ConstrCheck), constrcheck_cmp);
+}
+
+/*
+ * ConstrCheck qsort/bsearch comparator.
+ */
+static int
+constrcheck_cmp(const void *p1, const void *p2)
+{
+ ConstrCheck c1 = *(ConstrCheck *) p1;
+ ConstrCheck c2 = *(ConstrCheck *) p2;
+
+ return strcmp(c1.ccname, c2.ccname);
}
/*
On Tue, Mar 24, 2015 at 4:28 PM, Fabrízio de Royes Mello <
fabriziomello@gmail.com> wrote:
On Mon, Mar 23, 2015 at 12:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= <fabriziomello@gmail.com>
writes:
On Fri, Mar 20, 2015 at 4:37 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
We could fix it by, say, having CheckConstraintFetch() sort the
constraints by name after loading them.Isn't better do this to read pg_constraint in name order?
- conscan = systable_beginscan(conrel, ConstraintRelidIndexId, true, + conscan = systable_beginscan(conrel, ConstraintNameNspIndexId,
true,
Surely not. That would end up having to read *all* of pg_constraint,
not
only the rows applicable to the current relation.
Yeah... you're correct... we need the oid in the index.
We could get the index to do the work for us if we changed it from an
index on conrelid to one on conrelid, conname. However, seeing that
that
would bloat the index by a factor of sixteen, it hardly sounds like a
free fix either.But in this way we can save some cicles as Ashutosh complains... or am I
missing something?
I really think that a quick application of qsort is the best-performing
way to do this.Something like the attached?
With current master:
fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "bb"
DETAIL: Failing row contains (0, 0).With the attached patch:
fabrizio=# create table foo(a integer, b integer);
CREATE TABLE
fabrizio=# alter table foo add constraint aa check(a>0);
ALTER TABLE
fabrizio=# alter table foo add constraint bb check(b>0);
ALTER TABLE
fabrizio=# insert into foo values (0,0);
ERROR: new row for relation "foo" violates check constraint "aa"
DETAIL: Failing row contains (0, 0).
Forgot this patch... you've already pushed to the master the qsort for
check constraints [1]http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e5f455f59fed0632371cddacddd79895b148dc07. Really nice!!
Regards,
[1]: http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e5f455f59fed0632371cddacddd79895b148dc07
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=e5f455f59fed0632371cddacddd79895b148dc07
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello