pg_dump bug fixing
Hi everyone,
I've decided to attempt to nail all known bugs in pg_dump for 7.5 :)
So, please send me ALL your known bugs/issues with pg_dump, pg_dumpall
and pg_restore. Note that I am NOT interested in feature requests, ONLY
bugs.
A bug is considered to be an issue in pg_dump that means that when a
legally arrived at state in your PostgreSQL database, running pg_dump
and then restoring that dump does not result in an identical state.
Legally means 'obtained without manual catalog hacking' and 'identical'
means except for object OIDs.
The current list of known issues (for which I haven't submitted a fix
yet) that I have are as follows:
* Circular view dependencies (a pretty minor/rare issue...can only be
"fixed" by banning it in the backend)
* Alter object owner, privileges get a bit messed up. This is really a
backend bug, but there might be a pg_dump workaround for it.
* Tablespace that primary key and unique constraint indexes are in are
not dumped
* If you drop your public schema, a drop command is not issued for it in
the dump, so when you restore your public schema is back
Does anyone have any others?
Chris
On Sun, Jul 18, 2004 at 14:33:09 +0800,
Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:
* If you drop your public schema, a drop command is not issued for it in
the dump, so when you restore your public schema is back
I am not sure that is really a bug. If someone really wants less than
what is in template1, they should be dropping stuff from template1
before recreating the database.
I am not sure that is really a bug. If someone really wants less than
what is in template1, they should be dropping stuff from template1
before recreating the database.
No, because pg_dump itself dumps template1's contents...
Chris
On Sun, Jul 18, 2004 at 19:42:09 +0800,
Christopher Kings-Lynne <chriskl@familyhealth.com.au> wrote:
I am not sure that is really a bug. If someone really wants less than
what is in template1, they should be dropping stuff from template1
before recreating the database.No, because pg_dump itself dumps template1's contents...
Are you planning on extending this line of thought to other things in
template1 such as operators?
On Sun, 2004-07-18 at 07:42, Christopher Kings-Lynne wrote:
I am not sure that is really a bug. If someone really wants less than
what is in template1, they should be dropping stuff from template1
before recreating the database.No, because pg_dump itself dumps template1's contents...
Hmm.
1. Add language handler (say plpgsql) to template1.
2. Create new database
3. Dump database
4. Restore dump, having pg_dump create a new database
Doh.. errors because language handlers are there twice.
You can play this game with tables, sequences, and all sorts of other
things that might be useful in the default template. pg_dump just skips
it (not necessarily bad unless they did an alter table afterward) but
pg_restore doesn't like this at all.
I think what we want is a clean template without all of the extras that
template1 has.
If we dump & restore the public schema and other items that we might be
interested in having, we can have pg_dump use template_clean as a
template for new databases. Template_clean contains none of the things
that pg_dump can dump/restore (like language handlers) but is NOT the
default template.
I've done this myself, removed lots of stuff from template1 after
creating a template_<companyname> that is used for creating new DBs
with. This works great when we remember to specify WITH TEMPLATE.
No, because pg_dump itself dumps template1's contents...
Are you planning on extending this line of thought to other things in
template1 such as operators?
Ah, I see where you are going with this....
I think that we should treat the public schema specially :) That's
because it is much more likely to be dropped and messed with than the
other system objects, in fact you're encouraged to drop it in the docs.
Chris
Rod Taylor <pg@rbt.ca> writes:
I think what we want is a clean template without all of the extras that
template1 has.
Sounds like a job for ... template0 !
Seriously, this thread would be more convincing if anyone in it betrayed
any knowledge that pg_dump wants you to start from template0 rather than
template1.
regards, tom lane
Sounds like a job for ... template0 !
Seriously, this thread would be more convincing if anyone in it betrayed
any knowledge that pg_dump wants you to start from template0 rather than
template1.
What if we made it so that template1 is always restored last? Won't
that be an improvement?
Chris
On Sun, 2004-07-18 at 23:55, Tom Lane wrote:
Rod Taylor <pg@rbt.ca> writes:
I think what we want is a clean template without all of the extras that
template1 has.Sounds like a job for ... template0 !
It doesn't quite work in my case as I've removed items included in
template0 (public schema).
Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
I think that we should treat the public schema specially :)
We already are to some extent, since pg_dump will dump its comment and
privileges, which it would not do for any other predefined object.
I think this is actually an implementation artifact rather than
something that was explicitly intended at the time, but since no one
has complained about it, it's probably a good thing ;-)
Also, if we're going to apply Fabien's proposed patch to alter the
ownership of the public schema, that's still another way in which the
public schema becomes less like a system-defined object ...
regards, tom lane
Rod,
I think what we want is a clean template without all of the extras that
template1 has.
We have this, it's called Template0.
Actually, KL, that would solve a lot of these duplicate object problems. What
if pg_restore used Template0 and not Template1? It wouldn't fix the "drop
public schema" issue but it would solve the others.
--
Josh Berkus
Aglio Database Solutions
San Francisco
Import Notes
Reply to msg id not found: 20040719034640.CE568D1B16E@svr1.postgresql.orgReference msg id not found: 20040719034640.CE568D1B16E@svr1.postgresql.org | Resolved by subject fallback
On Mon, 2004-07-19 at 12:36, Josh Berkus wrote:
Rod,
I think what we want is a clean template without all of the extras that
template1 has.We have this, it's called Template0.
Doesn't work for me. I remove a number of things that are included by
default in template0, but yes, it's close.
Actually, KL, that would solve a lot of these duplicate object problems. What
if pg_restore used Template0 and not Template1? It wouldn't fix the "drop
public schema" issue but it would solve the others.
Remove the public schema from template0, but leave it in template1. Have
pg_dump treat the public schema the same as all of the other ones.
Rod,
Remove the public schema from template0, but leave it in template1. Have
pg_dump treat the public schema the same as all of the other ones.
Hmmm. No good; it wipes out the primary purpose of Template0, which is to
restore a corrupted Template1.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
Rod Taylor <pg@rbt.ca> writes:
Remove the public schema from template0, but leave it in template1.
Does not sound very workable. One of the functions of template0 is to
be a backup for the virgin state of template1, and you'd lose that.
regards, tom lane
On Mon, 2004-07-19 at 13:30, Josh Berkus wrote:
Rod,
Remove the public schema from template0, but leave it in template1. Have
pg_dump treat the public schema the same as all of the other ones.Hmmm. No good; it wipes out the primary purpose of Template0, which is to
restore a corrupted Template1.
Yeah.. This is why my first suggestion was to create a new template for
this purpose -- of which most responses told me about template0.
We already are to some extent, since pg_dump will dump its comment and
privileges, which it would not do for any other predefined object.
I think this is actually an implementation artifact rather than
something that was explicitly intended at the time, but since no one
has complained about it, it's probably a good thing ;-)
Does that mean your in favour of dumping a DROP SCHEMA public; command
if they have dropped their public schema? It's definitely not worth
doing it for any other "system" object due to upwards compatibility of
the dump files...
Chris
Actually, KL, that would solve a lot of these duplicate object problems. What
if pg_restore used Template0 and not Template1? It wouldn't fix the "drop
public schema" issue but it would solve the others.
Not sure what you mean here, but CVS pg_dump dumps like this:
CREATE DATABASE phppgadmin WITH TEMPLATE = template0 OWNER = chriskl
ENCODING = 'LATIN1';
CREATE DATABASE usa WITH TEMPLATE = template0 OWNER = usadmin ENCODING =
'LATIN1';
Chris
On Mon, 2004-07-19 at 21:20, Christopher Kings-Lynne wrote:
We already are to some extent, since pg_dump will dump its comment and
privileges, which it would not do for any other predefined object.
I think this is actually an implementation artifact rather than
something that was explicitly intended at the time, but since no one
has complained about it, it's probably a good thing ;-)Does that mean your in favour of dumping a DROP SCHEMA public; command
if they have dropped their public schema? It's definitely not worth
doing it for any other "system" object due to upwards compatibility of
the dump files...
Please don't. It would be rather surprising to have stuff disappear from
a database while doing a restore -- especially if it's a CASCADE
operation.
Creating the public schema when needed and leaving it out by default is
preferable.
In article <1090256502.414.17.camel@jester>,
Rod Taylor <pg@rbt.ca> writes:
On Mon, 2004-07-19 at 12:36, Josh Berkus wrote:
Rod,
I think what we want is a clean template without all of the extras that
template1 has.We have this, it's called Template0.
Doesn't work for me. I remove a number of things that are included by
default in template0, but yes, it's close.
I think pg_dump should do a kind of "diff" between template1 and the
database in question and include the necessary statements in the dump
to allow pg_restore to "replay" the diff after it has created the
database from template1.
Import Notes
Reference msg id not found: 20040719034640.CE568D1B16E@svr1.postgresql.org
KL,
CREATE DATABASE phppgadmin WITH TEMPLATE = template0 OWNER = chriskl
ENCODING = 'LATIN1';
CREATE DATABASE usa WITH TEMPLATE = template0 OWNER = usadmin ENCODING =
'LATIN1';
Ok, so that would come under the heading of "already fixed". Great.
--
Josh Berkus
Aglio Database Solutions
San Francisco
Chris,
I've just found a bit of undesirable functionality which I would call a bug in
pg_dump. I'm not sure everyone would, but we'll see.
Problem: the script which dumps globals such as users (pg_dumpall -g)
involves deleting *all* users from the pg_shadow table via a direct update to
that table. What this means in effect is that, should you attempt to use
"pg_dumpall -g" to *move* a set of users from one active server to another
(such as for transferring a database) the resulting pg_dump file will delete
all of the users which previously existed on that server.
This is a non-trivial accident to have happen on a shared machine; once users
are dumped, all of their ownerships and permissions go with them. If you
have a complex permissions system, better hope you backed up first!
I find this behavior highly undesirable, and consider it a bug. The globals
dump should just add users, and not delete any.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
Josh Berkus <josh@agliodbs.com> writes:
Problem: the script which dumps globals such as users (pg_dumpall -g)
involves deleting *all* users from the pg_shadow table via a direct update to
that table.
AFAICS that happens only if you've specified the -c (--clean) option.
Hence, I don't think it's a bug.
regards, tom lane
This is a non-trivial accident to have happen on a shared machine; once users
are dumped, all of their ownerships and permissions go with them. If you
have a complex permissions system, better hope you backed up first!I find this behavior highly undesirable, and consider it a bug. The globals
dump should just add users, and not delete any.
Unless the --clean option is passed, yes I agree with you. The other
issue is that it is silly to have to use pg_dumpall to get the globals.
A person should be able to pull a pg_dump on a particular database and
get everything that is required to run that database. Including users.
Joshua D. Drake
--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
Mammoth PostgreSQL Replicator. Integrated Replication for PostgreSQL
Tom,
AFAICS that happens only if you've specified the -c (--clean) option.
Hence, I don't think it's a bug.
Nope, happens even if you don't pass --clean.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
On Mon, 2 Aug 2004, Tom Lane wrote:
Josh Berkus <josh@agliodbs.com> writes:
Problem: the script which dumps globals such as users (pg_dumpall -g)
involves deleting *all* users from the pg_shadow table via a direct update to
that table.AFAICS that happens only if you've specified the -c (--clean) option.
Hence, I don't think it's a bug.
Is --clean a new option? Cause I've had the same thing happen to us also
... thank god for backups :)
----
Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org Yahoo!: yscrappy ICQ: 7615664
Josh Berkus <josh@agliodbs.com> writes:
AFAICS that happens only if you've specified the -c (--clean) option.
Hence, I don't think it's a bug.
Nope, happens even if you don't pass --clean.
Not in CVS tip ... but you're right, older versions did act that way.
Looks like someone addressed this already.
regards, tom lane
I've just found a bit of undesirable functionality which I would call a bug in
pg_dump. I'm not sure everyone would, but we'll see.Problem: the script which dumps globals such as users (pg_dumpall -g)
involves deleting *all* users from the pg_shadow table via a direct update to
that table. What this means in effect is that, should you attempt to use
"pg_dumpall -g" to *move* a set of users from one active server to another
(such as for transferring a database) the resulting pg_dump file will delete
all of the users which previously existed on that server.This is a non-trivial accident to have happen on a shared machine; once users
are dumped, all of their ownerships and permissions go with them. If you
have a complex permissions system, better hope you backed up first!I find this behavior highly undesirable, and consider it a bug. The globals
dump should just add users, and not delete any.
Yeah, it's nasty. One of the fixes that's already in from me is to make
the DELETE FROM pg_shadow and DELETE FROM pg_group only appear when -c
mode is set.
Maybe even when -c mode is set we should use DROP USER commands?
Do others agree?
Chris
AFAICS that happens only if you've specified the -c (--clean) option.
Hence, I don't think it's a bug.Nope, happens even if you don't pass --clean.
Not in CVS tip ... but you're right, older versions did act that way.
Looks like someone addressed this already.
Yeah, was one of my fixes.
Should we change it to use DROP USER commands anyway?
Chris
I find this behavior highly undesirable, and consider it a bug. The
globals dump should just add users, and not delete any.Unless the --clean option is passed, yes I agree with you. The other
issue is that it is silly to have to use pg_dumpall to get the globals.
A person should be able to pull a pg_dump on a particular database and
get everything that is required to run that database. Including users.
Another reason to combine pg_dumpall into pg_dump...
Chris
Chris,
Another reason to combine pg_dumpall into pg_dump...
No argument here. Are you thinking of that?
--
-Josh Berkus
Aglio Database Solutions
San Francisco
Another reason to combine pg_dumpall into pg_dump...
No argument here. Are you thinking of that?
Yeah. Would be a bit of work though.
Chris
At 01:17 PM 3/08/2004, Christopher Kings-Lynne wrote:
Would be a bit of work though.
I've been looking at this for a while now, and will probably give it a go
for 7.6/8.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 03 5330 3172 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp.mit.edu:11371 |/
I've been looking at this for a while now, and will probably give it a
go for 7.6/8.
Let me know when you do, I'd be interested in collaborating.
Chris
At 02:00 PM 3/08/2004, Christopher Kings-Lynne wrote:
I'd be interested in collaborating.
Sounds good.
----------------------------------------------------------------
Philip Warner | __---_____
Albatross Consulting Pty. Ltd. |----/ - \
(A.B.N. 75 008 659 498) | /(@) ______---_
Tel: (+61) 0500 83 82 81 | _________ \
Fax: (+61) 03 5330 3172 | ___________ |
Http://www.rhyme.com.au | / \|
| --________--
PGP key available upon request, | /
and from pgp.mit.edu:11371 |/
Christopher Kings-Lynne wrote:
I've been looking at this for a while now, and will probably give it
a go for 7.6/8.Let me know when you do, I'd be interested in collaborating.
Command Prompt, if would help could help sponsor this project.
Sincerely,
Joshua D. Drake
Chris
--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - jd@commandprompt.com - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
Chris,
I was thinking a pg_export utility that can output to a range of other
databases SQL formats would also be a good idea. It would share about
90% of the pg_dump code, but I'm trying to think of how to avoid
duplicating the code.
I'm not really keen on this idea unless you're eager to make a 5-year
commitment to maintain the code. The load formats of other RDBMSes change
all the time -- MySQL is a particularly egregious example, with 2
incompatible changes in the last year -- and it would become a pain to keep
track.
More to the point, there are a number of 3rd-party OSS and proprietary
utilities which can do this kind of format conversion. For example, Perl
DB::Interpolator will cover PG, MySQL, Oracle and MSSQL once that
functionality is out of beta.
I can see, though, having a --strict-sql switch for pg_dump which would dump
all database objects in strict SQL92, which should be pretty compatible with
other systems. This should also be easier to implement and trivial to
maintain. Though it would mean not dumping functions and doing a few data
type conversions.
--
Josh Berkus
Aglio Database Solutions
San Francisco
I'm not really keen on this idea unless you're eager to make a 5-year
commitment to maintain the code. The load formats of other RDBMSes change
all the time -- MySQL is a particularly egregious example, with 2
incompatible changes in the last year -- and it would become a pain to keep
track.
Well, I could do it on pgfoundry, but it would really suck to have to
dupe all the pg_dump code. Maybe I will have to.
More to the point, there are a number of 3rd-party OSS and proprietary
utilities which can do this kind of format conversion. For example, Perl
DB::Interpolator will cover PG, MySQL, Oracle and MSSQL once that
functionality is out of beta.
Do they convert the sql dumps or dump from the backend? I really,
really want to make a mysql2pgsql converter that doesn't really on text
file parsing. Modifying mysqldump would be easiest, but the problem is
licensing I think...
I can see, though, having a --strict-sql switch for pg_dump which would dump
all database objects in strict SQL92, which should be pretty compatible with
other systems. This should also be easier to implement and trivial to
maintain. Though it would mean not dumping functions and doing a few data
type conversions.
Yeah, perhaps. And issuing a log of warnings so you can see what
information you've lost.
Chris