Forcefully adding a CHECK constrained

Started by Catalin(ux) M. Boiealmost 14 years ago7 messagesgeneral
Jump to latest
#1Catalin(ux) M. Boie
catab@embedromix.ro

Hello.

When I discovered the wonders of partitioning I quickly jumped on-board.
I had big tables used for statistics and a split was needed.

I created the parent, I linked the big table with this new parent and I
added other childs, per month.

Example:
new_stats - parent (empty)
old_stats - child, no CHECK
stats_2012_04 - child, with CHECK
stats_2012_05 - child, with CHECK

The old_stats is so big that I cannot afford to add a check constraint.
But, I know that all values of the itime field are before 2012_04, so,
would be great if I could run something like:

ALTER TABLE old_stats ADD CONSTRAINT xxx CHECK (itime < 2012_04_timestamp)
FORCE;

Of course I can create an index concurrently and then add constraint using
"USING INDEX", but this means a scan of the big table.

I never looked at PostgreSQL sources, but the commit
Enable CHECK constraints to be declared NOT VALID
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=897795240cfaaed724af2f53ed2c50c9862f951f
inspired me to dive.
Is PostgreSQL's team willing to accept such a feature?

Thank you for your time!

(Please keep me on cc)

--
Catalin(ux) M. BOIE
http://kernel.embedromix.ro/

#2Jeff Davis
pgsql@j-davis.com
In reply to: Catalin(ux) M. Boie (#1)
Re: Forcefully adding a CHECK constrained

On Tue, 2012-05-15 at 12:52 +0300, Catalin(ux) M. BOIE wrote:

The old_stats is so big that I cannot afford to add a check constraint.
But, I know that all values of the itime field are before 2012_04, so,
would be great if I could run something like:

ALTER TABLE old_stats ADD CONSTRAINT xxx CHECK (itime < 2012_04_timestamp)
FORCE;

I never looked at PostgreSQL sources, but the commit
Enable CHECK constraints to be declared NOT VALID
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=897795240cfaaed724af2f53ed2c50c9862f951f
inspired me to dive.
Is PostgreSQL's team willing to accept such a feature?

It looks like you already found the answer! Create the constraint using
NOT VALID, and then sometime later (when you can afford the full scan)
do a VALIDATE CONSTRAINT.

Unfortunately, this is only available in 9.2, which is still in beta.

http://www.postgresql.org/docs/9.2/static/sql-altertable.html

CHECK constraints don't use indexes, so CREATE INDEX CONCURRENTLY
doesn't help you.

Regards,
Jeff Davis

#3Catalin(ux) M. Boie
catab@embedromix.ro
In reply to: Jeff Davis (#2)
Re: [GENERAL] Forcefully adding a CHECK constrained

Hello.
Thanks for the answer.

I really want to avoid reading the whole table. It is too expensive, and with the proposed feature will be not needed. I think is much faster to forcefully add the check if you know the range of data.

What do you think?

--
Catalin(ux) M. BOIE
http://kernel.embedromix.ro

----- Reply message -----
From: "Jeff Davis" <pgsql@j-davis.com>
To: "Catalin(ux) M. BOIE" <catab@embedromix.ro>
Cc: <pgsql-general@postgresql.org>
Subject: [GENERAL] Forcefully adding a CHECK constrained
Date: Sat, May 26, 2012 20:48

On Tue, 2012-05-15 at 12:52 +0300, Catalin(ux) M. BOIE wrote:

The old_stats is so big that I cannot afford to add a check constraint.
But, I know that all values of the itime field are before 2012_04, so,
would be great if I could run something like:

ALTER TABLE old_stats ADD CONSTRAINT xxx CHECK (itime < 2012_04_timestamp)
FORCE;

I never looked at PostgreSQL sources, but the commit
Enable CHECK constraints to be declared NOT VALID
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=897795240cfaaed724af2f53ed2c50c9862f951f
inspired me to dive.
Is PostgreSQL's team willing to accept such a feature?

It looks like you already found the answer! Create the constraint using
NOT VALID, and then sometime later (when you can afford the full scan)
do a VALIDATE CONSTRAINT.

Unfortunately, this is only available in 9.2, which is still in beta.

http://www.postgresql.org/docs/9.2/static/sql-altertable.html

CHECK constraints don't use indexes, so CREATE INDEX CONCURRENTLY
doesn't help you.

Regards,
Jeff Davis

#4Jeff Davis
pgsql@j-davis.com
In reply to: Catalin(ux) M. Boie (#3)
Re: Forcefully adding a CHECK constrained

On Sat, 2012-05-26 at 22:06 +0300, Catalin(ux) M. Boie wrote:

Hello.
Thanks for the answer.

I really want to avoid reading the whole table. It is too expensive,
and with the proposed feature will be not needed. I think is much
faster to forcefully add the check if you know the range of data.

What do you think?

Why not just create the CHECK constraint as NOT VALID, and never
validate it? It will still enforce the constraint, it just won't
validate it against your old data, which sounds like what you want.

Regards,
Jeff Davis

#5Catalin(ux) M. Boie
catab@embedromix.ro
In reply to: Jeff Davis (#4)
Re: [GENERAL] Forcefully adding a CHECK constrained

Hello.

Now I understand why I was not clear.

From what I understood, NOT VALID feature will not allow for the CHECK to be used in queries. So, for partitioning, my goal, is critical that the CHECK condition to be used.

I hope I make myself clear now: I want a possibility to add a CHECK that will be used for partitioning without having to read all data for validation.

of course, as I said, I will implement the future if PostgreSQL developers think that is useful.

Thank you for your time.

--
Catalin(ux) M. BOIE
http://kernel.embedromix.ro

----- Reply message -----
From: "Jeff Davis" <pgsql@j-davis.com>
To: "Catalin(ux) M. Boie" <catab@embedromix.ro>
Cc: <pgsql-general@postgresql.org>
Subject: [GENERAL] Forcefully adding a CHECK constrained
Date: Sun, May 27, 2012 19:46

On Sat, 2012-05-26 at 22:06 +0300, Catalin(ux) M. Boie wrote:

Hello.
Thanks for the answer.

I really want to avoid reading the whole table. It is too expensive,
and with the proposed feature will be not needed. I think is much
faster to forcefully add the check if you know the range of data.

What do you think?

Why not just create the CHECK constraint as NOT VALID, and never
validate it? It will still enforce the constraint, it just won't
validate it against your old data, which sounds like what you want.

Regards,
Jeff Davis

#6Marti Raudsepp
marti@juffo.org
In reply to: Catalin(ux) M. Boie (#1)
Re: Forcefully adding a CHECK constrained

On Tue, May 15, 2012 at 12:52 PM, Catalin(ux) M. BOIE
<catab@embedromix.ro> wrote:

The old_stats is so big that I cannot afford to add a check constraint.
But, I know that all values of the itime field are before 2012_04, so, would
be great if I could run something like:

If you Really Really need it and if you're feeling adventurous, you
can attempt to create the constraint on another table and then
"rename" the constraint definition in pg_constraint and pg_depend
catalogs. But doing this WILL VOID YOUR WARRANTY! :)

It's fairly complicated, you have to take care to get every bit of
information right. In particular, if there are any deleted columns in
the old_stats table, the attribute numbers will not match, which can
cause errors or segfaults. Double check from pg_attribute.

And certainly do it in a test environment first; make sure with
pg_dump and restore that it understands the constraint correctly.

Regards,
Marti

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Catalin(ux) M. Boie (#5)
Re: Re: [GENERAL] Forcefully adding a CHECK constrained

"=?utf-8?B?Q2F0YWxpbih1eCkgTS4gQm9pZQ==?=" <catab@embedromix.ro> writes:

I hope I make myself clear now: I want a possibility to add a CHECK that will be used for partitioning without having to read all data for validation.

Basically, this is unlikely to be accepted, because it does not conform
with the project's worldview.

If you are intent on being smarter than the database is, you could
probably create a NOT VALID constraint and then manually poke the system
catalogs to mark it as valid. Then, when (not if) you make a mistake,
you will have only yourself to blame.

A patch that would stand some chance of getting accepted would be one
that would act like CREATE INDEX CONCURRENTLY: create the constraint as
NOT VALID (so it's getting enforced against new rows), then in
background scan all the existing rows to make sure they meet the
constraint too, then finally mark the constraint VALID. But we don't
consider that second step to be optional.

regards, tom lane