Teaching pg_dump to use NOT VALID constraints

Started by Simon Riggsover 11 years ago7 messageshackers
Jump to latest
#1Simon Riggs
simon@2ndQuadrant.com

Magnus and I discussed the need for pg_dump to offer the use of NOT
VALID constraints.
Here's the patch.

pg_dump --no-revalidaton

will add "NOT VALID" onto the recreation SQL for any FKs, but only for
ones that were already known to be valid.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

pg_dump_revalidation.v1.patchapplication/octet-stream; name=pg_dump_revalidation.v1.patchDownload+37-9
#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Simon Riggs (#1)
Re: Teaching pg_dump to use NOT VALID constraints

Simon Riggs wrote:

Magnus and I discussed the need for pg_dump to offer the use of NOT
VALID constraints.
Here's the patch.

pg_dump --no-revalidaton

will add "NOT VALID" onto the recreation SQL for any FKs, but only for
ones that were already known to be valid.

Well. Constraints that haven't been validated already have a NOT VALID
emitted by ruleutils.c, yes? So what this patch does is add such a
clause for all *other* constraints. Right? In other words what it aims
to do is speed up loading of data by skipping the validation step on
restore. Is that right?

ISTM we could have the default pg_dump behavior emit NOT VALID
constraints, and add VALIDATE CONSTRAINT commands at the end; that way
the database is usable sooner but the constraints end up marked as
validated by the time the dump is finished.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Simon Riggs
simon@2ndQuadrant.com
In reply to: Alvaro Herrera (#2)
Re: Teaching pg_dump to use NOT VALID constraints

On 10 November 2014 17:33, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

pg_dump --no-revalidaton

will add "NOT VALID" onto the recreation SQL for any FKs, but only for
ones that were already known to be valid.

Well. Constraints that haven't been validated already have a NOT VALID
emitted by ruleutils.c, yes? So what this patch does is add such a
clause for all *other* constraints. Right? In other words what it aims
to do is speed up loading of data by skipping the validation step on
restore. Is that right?

Correct. CHECK constraints are added onto main table so they validate at load.

ISTM we could have the default pg_dump behavior emit NOT VALID
constraints, and add VALIDATE CONSTRAINT commands at the end; that way
the database is usable sooner but the constraints end up marked as
validated by the time the dump is finished.

Yes, may be an even better idea. We'd still want the --no-revalidation
option, AFAICS.

FKs are already "at the end". Perhaps we should add another
"validation" section?

I like the idea, just not sure how long it would take.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Simon Riggs (#3)
Re: Teaching pg_dump to use NOT VALID constraints

On 11/10/14, 12:00 PM, Simon Riggs wrote:

On 10 November 2014 17:33, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

pg_dump --no-revalidaton

will add "NOT VALID" onto the recreation SQL for any FKs, but only for
ones that were already known to be valid.

Well. Constraints that haven't been validated already have a NOT VALID
emitted by ruleutils.c, yes? So what this patch does is add such a
clause for all *other* constraints. Right? In other words what it aims
to do is speed up loading of data by skipping the validation step on
restore. Is that right?

Correct. CHECK constraints are added onto main table so they validate at load.

ISTM we could have the default pg_dump behavior emit NOT VALID
constraints, and add VALIDATE CONSTRAINT commands at the end; that way
the database is usable sooner but the constraints end up marked as
validated by the time the dump is finished.

Yes, may be an even better idea. We'd still want the --no-revalidation
option, AFAICS.

FKs are already "at the end". Perhaps we should add another
"validation" section?

I like the idea, just not sure how long it would take.

Isn't the real use-case here that if constraints were valid when you dumped then we shouldn't have to *any* re-validate when we load? (Though, we'd have to be careful of that with CHECK because that can call user code...)
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Jim Nasby (#4)
Re: Teaching pg_dump to use NOT VALID constraints

On 13 November 2014 00:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Isn't the real use-case here that if constraints were valid when you dumped
then we shouldn't have to *any* re-validate when we load? (Though, we'd have
to be careful of that with CHECK because that can call user code...)

Yes, that is the use case this patch would improve considerably.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Simon Riggs (#5)
Re: Teaching pg_dump to use NOT VALID constraints

On 11/13/14 5:07 AM, Simon Riggs wrote:

On 13 November 2014 00:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Isn't the real use-case here that if constraints were valid when you dumped
then we shouldn't have to *any* re-validate when we load? (Though, we'd have
to be careful of that with CHECK because that can call user code...)

Yes, that is the use case this patch would improve considerably.

But your patch doesn't really address that. It just leaves the
constraints as invalid, and someone will have to revalidate them later
(how?). What Jim was describing was a mode that creates the constraints
as valid but doesn't actually validate them. I can see both sides of
that kind of feature.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#6)
Re: Teaching pg_dump to use NOT VALID constraints

Peter Eisentraut wrote:

On 11/13/14 5:07 AM, Simon Riggs wrote:

On 13 November 2014 00:20, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:

Isn't the real use-case here that if constraints were valid when you dumped
then we shouldn't have to *any* re-validate when we load? (Though, we'd have
to be careful of that with CHECK because that can call user code...)

Yes, that is the use case this patch would improve considerably.

But your patch doesn't really address that. It just leaves the
constraints as invalid, and someone will have to revalidate them later
(how?). What Jim was describing was a mode that creates the constraints
as valid but doesn't actually validate them. I can see both sides of
that kind of feature.

This might lead to users introducing invalid data by way of declaring
constants as valid but not checked by the system; if they turn out to be
invalid, the final state is a mess. I would only buy such a feature if
we had some way to pass down the knowledge of the constraint being valid
in the original system through some other means; say emit a CRC of the
copy data in the pg_dump output that can be verified while loading, and
only allow unvalidated constraints to be marked VALID if the sum
matches.

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers