BUG #2308: pg_dump -a does not respect referential dependencies

Started by Matthew Georgeabout 20 years ago4 messagesbugs
Jump to latest
#1Matthew George
georgema@corp.earthlink.net

The following bug has been logged online:

Bug reference: 2308
Logged by: Matthew George
Email address: georgema@corp.earthlink.net
PostgreSQL version: 8.0.3
Operating system: Darwin Kernel Version 8.5.0: Sun Jan 22 10:38:46 PST
2006; root:xnu-792.6.61.obj~1/RELEASE_PPC Power Macintosh powerpc
Description: pg_dump -a does not respect referential dependencies
Details:

When using pg_dump to dump a database, the schema and data are arranged
within the dump so that it may be imported without violating referential
integrity.

When using the -a option to get a data-only dump, the data is ordered in the
dump alphabetically by table.

If a new schema is loaded into a fresh database, the output from the
data-only dump cannot be imported via \i in psql without manually editing
the dump file and reordering the inserts / copies such that dependent tables
have their data loaded before the tables that depend on them. This is
inconvenient at best.

Since the logic obviously exists within pg_dump already to arrange the data
in the correct order of reference dependencies, can this be added to the
code path for `pg_dump -a` as well? Or can another option be added that
specifies alphabetic vs. dependency order?

This would really cut down the time it takes to do schema upgrades on
projects that I work with.

Thanks

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Matthew George (#1)
Re: BUG #2308: pg_dump -a does not respect referential dependencies

"Matthew George" <georgema@corp.earthlink.net> writes:

Since the logic obviously exists within pg_dump already to arrange the data
in the correct order of reference dependencies, can this be added to the
code path for `pg_dump -a` as well?

No.  In a data-only restore there may not be *any* ordering that works
--- consider circular dependencies.

The best practice is to do standard schema+data dumps, wherein the
ordering problem can be handled properly by not creating the FK
constraints until after the data is loaded. If you really want to
do a data-only restore, I'd suggest dropping the FK constraints
and re-adding them afterwards (which will be a lot faster than
row-by-row retail checks would be, anyway).

Another possibility is the --disable-triggers option, but I can't
really recommend that, because if there are any referential problems
in the data you load, that way will fail to catch it.

regards, tom lane

#3Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Tom Lane (#2)
Re: BUG #2308: pg_dump -a does not respect referential dependencies

On Mar 8, 2006, at 3:01 PM, Tom Lane wrote:

"Matthew George" <georgema@corp.earthlink.net> writes:
The best practice is to do standard schema+data dumps, wherein the
ordering problem can be handled properly by not creating the FK
constraints until after the data is loaded. If you really want to
do a data-only restore, I'd suggest dropping the FK constraints
and re-adding them afterwards (which will be a lot faster than
row-by-row retail checks would be, anyway).

BTW, it would be really nice if we provided a better way to do this
than manually dropping all the FK constraints and adding them back in
later. Would it be difficult to allow deferring all constraints in
the database during a specified transaction? That would allow for
loading the data in a transaction and doing the constraint checking
later...
--
Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#3)
Re: BUG #2308: pg_dump -a does not respect referential dependencies

Jim Nasby <decibel@decibel.org> writes:

BTW, it would be really nice if we provided a better way to do this
than manually dropping all the FK constraints and adding them back in
later. Would it be difficult to allow deferring all constraints in
the database during a specified transaction? That would allow for
loading the data in a transaction and doing the constraint checking
later...

You can try SET CONSTRAINTS ALL DEFERRED, but that only works for
constraints that are declared deferrable, which by default FK
constraints are not (stupid but that's what the spec requires).
In any case this would still have performance issues because the
behavior is tuned for transactions that update relatively small
numbers of rows. Drop/add constraint is a lot better choice in
the context of a bulk load.

I was toying just now with the idea of a pg_dump mode that would issue
the drop and re-add constraint commands for you. This would only help
for constraints that pg_dump knows of (ie were in the source database),
not any random new FK constraints that might be in the DB you are
loading into, but it'd sure beat doing it manually.

regards, tom lane