Re: [PROPOSAL] New feature "... VALIDATE CONSTRAINT ... USING INDEX ..."

Started by Vitaly Burovoyabout 10 years ago1 messages
#1Vitaly Burovoy
vitaly.burovoy@gmail.com

On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:

On 8 January 2016 at 13:13, Vitaly Burovoy <vitaly.burovoy@gmail.com>
wrote:

On 1/8/16, Simon Riggs <simon@2ndquadrant.com> wrote:

On 8 January 2016 at 12:49, Vitaly Burovoy <vitaly.burovoy@gmail.com>
wrote:

In Postgres9.1 a new feature was implemented [1] for adding PK and
UNIQUE constraints using indexes created concurrently, but constraints
NOT NULL and CHECK still require full seqscan of a table. New CHECK
constraint allows "NOT VALID" option but VALIDATE CONSTRAINT still
does seqscan (with RowExclusiveLock, but for big and constantly
updatable table it is still awful).

It is possible to find wrong rows in a table without seqscan if there
is an index with a predicate allows to find such rows. There is no
sense what columns it has since it is enough to check whether
index_getnext for it returns NULL (table is OK) or any tuple (table
has wrong rows).

You avoid a full seqscan by creating an index which also does a full
seq
scan.

How does this help? The lock and scan times are the same.

I avoid not a full seqscan, but a time when table is under
ExclusiveLock: index can be build concurrently without locking table.

That is exactly what ADD ...NOT VALID and VALIDATE already does, as of
9.4.

--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/&gt;
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

I'm so sorry, I was wrong. It is a result of my old experience with
Postgres 9.2. There tables were locked by an ACCESS EXCLUSIVE lock...
=(
I missed p. E.6.3.5 in the release notes[1]http://www.postgresql.org/docs/9.4/static/release-9-4.html#AEN120302 for 9.4.

Nevertheless, let me ask why do you reject an ability to use indexes
at a validation process?

Let's imagine a user has to add a CHECK constraint.
He tries to command:
ALTER TABLE tablename ADD CONSTRAINT tablename_expr_chk CHECK
(check_expr) NOT VALID;

It is ok. Then the command:
ALTER TABLE tablename VALIDATE CONSTRAINT tablename_expr_chk;

after some time gives an error:
ERROR: check constraint "tablename_expr_chk" is violated by some row

Hmm... It must be fixed, but which row is wrong? How many wrong rows are there?
The best way is to create an index to find rows (there can be
thousands or more...) and understand how it turns out they violate the
constraint (the user was absolutely sure there's all OK before sending
"VALIDATE CONSTRAINT").
Then he deals with it (using the index for a fast access to wrong
rows), it is time to revalidate the constraint. Hmm... The user has
already had the actual index with a special predicate for being sure
there table has no wrong rows! Why he must wait for the third(!)
seqscan (the first two were validating and indexing) instead of just
using already present index with no entries?

Moreover the most often case of SET NOT NULL constraint is setting
default value without locking a table and set a constraint after all
rows have at least default values as I wrote in the initial letter.
Index there is important and always present at the end of the UPDATE
process (before applying the constraint). Why (even when NOT NULL
moves to the "pg_constraint" table) don't use the index but do seqscan
instead?

It is possible to use another syntax (currently for CHECK constraints
and for NOT NULLs when they appear in the pg_catalog):
ALTER TABLE tablename VALIDATE CONSTRAINT tablename_expr_chk USING
INDEX indexname;

which will use the predicate as it was described in the initial letter.

[1]: http://www.postgresql.org/docs/9.4/static/release-9-4.html#AEN120302

--
Best regards,
Vitaly Burovoy

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers