Proposed patch: make pg_dump --data-only consider FK constraints
Okay, I got tired of seeing people complain about foreign-key constraint
violations in data-only dumps. While it's true that the problem can't
be solved in the general case (because of potentially circular
references), we could certainly make pg_dump at least *try* to order the
tables according to foreign key relationships. It turns out not to even
require a whole lot of new code. Accordingly I propose the attached
patch. It will order the tables safely if it can, and otherwise
complain like this:
pg_dump: WARNING: circular foreign-key constraints among these table(s):
pg_dump: master
pg_dump: child
pg_dump: You may not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
Comments?
regards, tom lane
Tom Lane wrote:
Okay, I got tired of seeing people complain about foreign-key constraint
violations in data-only dumps. While it's true that the problem can't
be solved in the general case (because of potentially circular
references), we could certainly make pg_dump at least *try* to order the
tables according to foreign key relationships. It turns out not to even
require a whole lot of new code. Accordingly I propose the attached
patch. It will order the tables safely if it can, and otherwise
complain like this:
+1
pg_dump: WARNING: circular foreign-key constraints among these table(s):
pg_dump: master
pg_dump: child
pg_dump: You may not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
WARNING feels a bit too strong. I realize that that message isn't going
to the postmaster's log, bloating it, but if a user does that regularly,
always disabling triggers as instructed, or there is in fact never
circular references in the data with a particular schema, seeing that
big fat warning every time is going to become a bit tiresome. Perhaps
"NOTE: ..." ?
How about printing that notice at the top of the dump file as well? Most
people probably don't look at the dump files, but if someone needs to
deal with a data-only dumps that contain circular constraints, and also
those that don't, it would be invaluable information.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Sun, Sep 07, 2008 at 02:06:40PM -0400, Tom Lane wrote:
Okay, I got tired of seeing people complain about foreign-key
constraint violations in data-only dumps.
Isn't this something solved in the more general case by having
pre-data, data, and post-data dump options?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
Heikki Linnakangas wrote:
Tom Lane wrote:
pg_dump: WARNING: circular foreign-key constraints among these table(s):
pg_dump: master
pg_dump: child
pg_dump: You may not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.WARNING feels a bit too strong. I realize that that message isn't going
to the postmaster's log, bloating it, but if a user does that regularly,
always disabling triggers as instructed, or there is in fact never
circular references in the data with a particular schema, seeing that
big fat warning every time is going to become a bit tiresome. Perhaps
"NOTE: ..." ?
But the warning is only going to be emitted if there are actual circular
FK constraints, so it seems OK.
How about printing that notice at the top of the dump file as well? Most
people probably don't look at the dump files, but if someone needs to
deal with a data-only dumps that contain circular constraints, and also
those that don't, it would be invaluable information.
I assume that this trick will only work at restore time only for custom
or tar dumps. A text-only dump would produce the warning to stderr at
dump time, no?
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
David Fetter <david@fetter.org> writes:
On Sun, Sep 07, 2008 at 02:06:40PM -0400, Tom Lane wrote:
Okay, I got tired of seeing people complain about foreign-key
constraint violations in data-only dumps.
Isn't this something solved in the more general case by having
pre-data, data, and post-data dump options?
No, not unless you expect that that patch will somehow forbid people
from using data-only dumps.
regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
pg_dump: WARNING: circular foreign-key constraints among these table(s):
pg_dump: master
pg_dump: child
pg_dump: You may not be able to restore the dump without using --disable-triggers or temporarily dropping the constraints.
WARNING feels a bit too strong. I realize that that message isn't going
to the postmaster's log, bloating it, but if a user does that regularly,
always disabling triggers as instructed, or there is in fact never
circular references in the data with a particular schema, seeing that
big fat warning every time is going to become a bit tiresome. Perhaps
"NOTE: ..." ?
I doubt that very many people will ever see it at all, actually --- how
common are circular FK relationships? And it does seem appropriate to
me for pg_dump to be noisy about the possibility of trouble at restore
time. (Maybe the message should also suggest using a schema+data dump,
since that would be a solution at dump time?)
How about printing that notice at the top of the dump file as well?
Hmm ... that might be feasible in plain text output, but I don't see
any easy way to get a similar effect in archive modes.
regards, tom lane
Alvaro Herrera <alvherre@commandprompt.com> writes:
Heikki Linnakangas wrote:
How about printing that notice at the top of the dump file as well?
I assume that this trick will only work at restore time only for custom
or tar dumps. A text-only dump would produce the warning to stderr at
dump time, no?
Yes, the warning (and the re-sorting) must happen at dump time. Given a
data-only dump, pg_restore wouldn't even have the information needed to
do anything about this.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
I doubt that very many people will ever see it at all, actually --- how
common are circular FK relationships? And it does seem appropriate to
me for pg_dump to be noisy about the possibility of trouble at restore
time. (Maybe the message should also suggest using a schema+data dump,
since that would be a solution at dump time?)
I think they're surprisingly common actually. Most complex databases end up
with them one way or another. Either through a parent-child relationship or
from two different types of relationships (such as "user which owns this
directory" and "home directory of this user").
The other reason to think NOTICE might be better is that it's something which,
if it occurs once, will always occur for that database. So a sysadmin will
become inured to seeing WARNING on his backups. Are there any other warning
conditions which could occur spontaneously that this would mask?
One minor thought -- surely the main use case for data-only dumps is for
importing into another brand of database. In which case the message seems a
bit awkward -- it could talk generically about disabling or dropping the
constraints and then have a hint to indicate how to do that with Postgres.
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
Gregory Stark <stark@enterprisedb.com> writes:
The other reason to think NOTICE might be better is that it's something which,
if it occurs once, will always occur for that database. So a sysadmin will
become inured to seeing WARNING on his backups. Are there any other warning
conditions which could occur spontaneously that this would mask?
[ shrug ... ] Seems I'm outvoted, so NOTICE it will be.
One minor thought -- surely the main use case for data-only dumps is for
importing into another brand of database. In which case the message seems a
bit awkward -- it could talk generically about disabling or dropping the
constraints and then have a hint to indicate how to do that with Postgres.
I'm not convinced that data-only dumps are used mainly for that purpose.
In any case I don't want to turn this message into a paragraph;
mentioning all the things you might do about it in a Postgres context is
already making it longer than I would like...
regards, tom lane
* Gregory Stark (stark@enterprisedb.com) wrote:
The other reason to think NOTICE might be better is that it's something which,
if it occurs once, will always occur for that database. So a sysadmin will
become inured to seeing WARNING on his backups. Are there any other warning
conditions which could occur spontaneously that this would mask?
Impartial on NOTICE vs. WARNING, it could go either way for me.
One minor thought -- surely the main use case for data-only dumps is for
importing into another brand of database. In which case the message seems a
bit awkward -- it could talk generically about disabling or dropping the
constraints and then have a hint to indicate how to do that with Postgres.
I have to disagree strongly with this. We have multiple PG instances
and often have cause to copy between them using data-only pg_dump. On
the other side, I've never used pg_dump (data-only or not) to generate
something to load data into a different database.
Thanks,
Stephen
Tom Lane wrote:
How about printing that notice at the top of the dump file as well?
Hmm ... that might be feasible in plain text output, but I don't see
any easy way to get a similar effect in archive modes.
Just saw this, obviously very late, but from memory there is a TOC entry
type for comments or warnings that get output when the dump is used.