BUG #5488: pg_dump does not quote column names -> pg_restore may fail when upgrading
The following bug has been logged online:
Bug reference: 5488
Logged by: Hartmut Goebel
Email address: h.goebel@goebel-consult.de
PostgreSQL version: 8.3 / 8.4
Operating system: all
Description: pg_dump does not quote column names -> pg_restore may
fail when upgrading
Details:
If a 8.3 table contains a column named "window", the dump can not be
restored into a 8.4 database. Reasons: a) "window" is a new keyword in 8.4
b) pg_dump does not quote column names.
This is a generic problem with pg_dump. Since it does not quote all column
names, pg_restore may fail whenever migrating to a new version.
Solution: pg_dump should quote *all* column-names, no matter if they are
keywords or not.
"Hartmut Goebel" <h.goebel@goebel-consult.de> wrote:
Description: pg_dump does not quote column names ->
pg_restore may fail when upgrading
If a 8.3 table contains a column named "window", the dump can not
be restored into a 8.4 database. Reasons: a) "window" is a new
keyword in 8.4 b) pg_dump does not quote column names.
Note that the documentation recommends always running pg_dump using
the executable from the target version, not the source version. Are
you using the pg_dump executable from 8.4?
-Kevin
Am 03.06.2010 15:43, schrieb Kevin Grittner:
Note that the documentation recommends always running pg_dump using
the executable from the target version, not the source version. Are
you using the pg_dump executable from 8.4?
I dumped with the executable form 8.3.
8.4 did not allow accessing the 8.3 database, thus I needed to dump
using the 8.3 executable.
--
Schönen Gruß - Regards
Hartmut Goebel
Dipl.-Informatiker (univ.), CISSP, CSSLP
Goebel Consult
Spezialist für IT-Sicherheit in komplexen Umgebungen
http://www.goebel-consult.de
Monatliche Kolumne: http://www.cissp-gefluester.de/
Goebel Consult mit Mitglied bei http://www.7-it.de
Attachments:
"Hartmut Goebel" <h.goebel@goebel-consult.de> writes:
If a 8.3 table contains a column named "window", the dump can not be
restored into a 8.4 database. Reasons: a) "window" is a new keyword in 8.4
b) pg_dump does not quote column names.
This is one of the cases where it's helpful to use the newer version's
pg_dump.
Solution: pg_dump should quote *all* column-names, no matter if they are
keywords or not.
That was considered and rejected long ago. Readability of the dump
script is something that we put a nonzero value on.
regards, tom lane
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
I dumped with the executable form 8.3.
That's not expected to work for an upgrade to 8.4.
8.4 did not allow accessing the 8.3 database
What do you mean? (What did you try and what happened?)
-Kevin
Am 03.06.2010 16:16, schrieb Kevin Grittner:
8.4 did not allow accessing the 8.3 database
What do you mean? (What did you try and what happened?)
If upgraded the rpm-packages from 8.3 to 8.4. Then postgres failed
starting (something like "Database version mismatch"). So I downgraded
to 8.3, pg_dump'ed there, upgraded and pg_restore'd.
Since 8.4 was not willed to work in the 8.3 database files, I expected
this being a correct upgrade path.
--
Schönen Gruß - Regards
Hartmut Goebel
Attachments:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
If upgraded the rpm-packages from 8.3 to 8.4. Then postgres failed
starting (something like "Database version mismatch").
You need to be running the old server using 8.3 software and while
using pg_dump from 8.4 software. Does your packager provide some
way to install the new version at a different location? If not, is
there a separate machine on which you could install 8.4?
-Kevin
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
If upgraded the rpm-packages from 8.3 to 8.4. Then postgres failed
starting (something like "Database version mismatch").
You need to be running the old server using 8.3 software and while
using pg_dump from 8.4 software. Does your packager provide some
way to install the new version at a different location? If not, is
there a separate machine on which you could install 8.4?
In practice, if he has to redo the dump, the easiest fix is really
going to be to rename the column beforehand. He's likely to end up
doing that anyway rather than quoting its name forever ...
regards, tom lane
On Thu, Jun 03, 2010 at 06:04:16PM +0200, Hartmut Goebel wrote:
If upgraded the rpm-packages from 8.3 to 8.4. Then postgres failed
starting (something like "Database version mismatch"). So I downgraded
to 8.3, pg_dump'ed there, upgraded and pg_restore'd.
pg_dump will complain if its version doesn't match the server version, but
that's neither a bug nor, in this case, a bad thing. You have pg_dump's -i
option which will squelch this message and allow pg_dump to continue.
--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Am 03.06.2010 16:15, schrieb Tom Lane:
Solution: pg_dump should quote *all* column-names, no matter if they are
keywords or not.That was considered and rejected long ago. Readability of the dump
script is something that we put a nonzero value on.
Sorry, I do not understand this.
I assume you mean readability for humans?!
So if readability is not important, what speaks against always quoting
the column names?
--
Schönen Gruß - Regards
Hartmut Goebel
Dipl.-Informatiker (univ.), CISSP, CSSLP
Goebel Consult
Spezialist für IT-Sicherheit in komplexen Umgebungen
http://www.goebel-consult.de
Monatliche Kolumne: http://www.cissp-gefluester.de/
Goebel Consult mit Mitglied bei http://www.7-it.de
Attachments:
Am 03.06.2010 20:07, schrieb Tom Lane:
"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
If upgraded the rpm-packages from 8.3 to 8.4. Then postgres failed
starting (something like "Database version mismatch").You need to be running the old server using 8.3 software and while
using pg_dump from 8.4 software. Does your packager provide some
way to install the new version at a different location? If not, is
there a separate machine on which you could install 8.4?In practice, if he has to redo the dump, the easiest fix is really
going to be to rename the column beforehand. He's likely to end up
doing that anyway rather than quoting its name forever ...
Both solutions are quite complicated and require a lot of work and
knowledge. Esp. since there seams to be no upgrade or migration guide
available.
(NB: I personally solved the problem using pg_restore | sed | pqsl. But
this bug realy is about a generic problem.)
Given the fact that postgres is not only used in "high end" environments
which have a professional database admin (see below), I strongly suggest
finding a solution which is easier to handle for average admins.
The solution I suggested (simply quoting all column names) would AFAIK
solve this problem once and forever.
An example for Postgresql in a non-database-admin evironment is the
three tier ERP application www.tryton.org. The Tryton admin typically is
not a database guy, but a generic, average server administrator. He
probably knowns about databases, SQL, etc. But he has *a lot* of work
and he is happy about everything which makes his live easier. And he
hates stuff which does not work, while it is commonly expected to work easy.
The Tryton GUI offers backing up the database, which is simply pg_dump
behind. The Tryton admin expects to be able to restore this backup after
upgrade. Because it is such easy to get a database backup, he expects
restore being that easy, too.
The Tryton admin does not understand at first, why this doe not work. It
worked when upgrading 8.1 to 8.2 and when upgrading 8.2 to 8.3. But when
upgrading to 8.4 it does not work.
If the admin is a Mysql-fan, he will be curing on postgres, as soon as
he found out how easy the solution would have been: "Would I have
stayied at mysql, they are able to quote all column names if neccessary.
Sh** postgres!"
And he will be wasting another hour (or more) working around the
problem. While the solution could be *so easy*: simply quote all column
names in pg_dump. (And backport to 8.0, 8.2, 8.3 :-)
--
Schönen Gruß - Regards
Hartmut Goebel
Dipl.-Informatiker (univ.), CISSP, CSSLP
Goebel Consult
Spezialist für IT-Sicherheit in komplexen Umgebungen
http://www.goebel-consult.de
Monatliche Kolumne: http://www.cissp-gefluester.de/
Goebel Consult mit Mitglied bei http://www.7-it.de
Attachments:
* Hartmut Goebel (h.goebel@goebel-consult.de) wrote:
Am 03.06.2010 16:15, schrieb Tom Lane:
That was considered and rejected long ago. Readability of the dump
script is something that we put a nonzero value on.I assume you mean readability for humans?!
Yes, readability for humans is important.
So if readability is not important, what speaks against always quoting
the column names?
Quoting all column names makes the dump script much more difficult for
human consumption, which is important.
Thanks,
Stephen
Am 04.06.2010 13:56, schrieb Stephen Frost:
Quoting all column names makes the dump script much more difficult for
human consumption, which is important.
I don't agree with you here. But this may be a matter of personal taste.
Esp. I think, functionality is much ore important than a small decrees
of readability. At least pg_dump should get an option
"--quote-column-names", so this can be switcced on if necessary.
--
Schönen Gruß - Regards
Hartmut Goebel
Dipl.-Informatiker (univ.), CISSP, CSSLP
Goebel Consult
Spezialist für IT-Sicherheit in komplexen Umgebungen
http://www.goebel-consult.de
Monatliche Kolumne: http://www.cissp-gefluester.de/
Goebel Consult mit Mitglied bei http://www.7-it.de
Attachments:
* Hartmut Goebel (h.goebel@goebel-consult.de) wrote:
Am 04.06.2010 13:56, schrieb Stephen Frost:
Quoting all column names makes the dump script much more difficult for
human consumption, which is important.I don't agree with you here. But this may be a matter of personal taste.
Esp. I think, functionality is much ore important than a small decrees
of readability. At least pg_dump should get an option
"--quote-column-names", so this can be switcced on if necessary.
Something like '--quote-identifiers' might be alright, so long as it's
defaulted to 'off'. Of course, I don't know that it'd actually solve
your problem at all- after all, keywords can and will change between
major versions and even if your pg_dump quotes all identifiers, anything
else using the database (eg: applications) would need to as well.
Keyword changes aren't the only thing an application or other DB user
needs to be concerned about when changing major versions of PG either.
If you're using pg_dump for backups, then when you need to restore, it
should be into the same version of PG that you took the pg_dump from.
If you're using pg_dump to upgrade, use the pg_dump from the version
you're upgrading *to*, and do so in a test environment first to make
sure that the restore works correctly, that the applications and other
DB users are happy with the new version, etc, etc, before even thinking
about upgrading a production system.
Note also that having multiple major versions of PG installed (eg: 8.3
and 8.4) at the same time is made easier on some platforms (Debian-based
ones, specifically).
Thanks,
Stephen
Am 04.06.2010 14:57, schrieb Stephen Frost:
* Hartmut Goebel (h.goebel@goebel-consult.de) wrote:
Am 04.06.2010 13:56, schrieb Stephen Frost:
Quoting all column names makes the dump script much more difficult for
human consumption, which is important.I don't agree with you here. But this may be a matter of personal taste.
Esp. I think, functionality is much ore important than a small decrees
of readability. At least pg_dump should get an option
"--quote-column-names", so this can be switcced on if necessary.Something like '--quote-identifiers' might be alright, so long as it's
defaulted to 'off'. Of course, I don't know that it'd actually solve
your problem at all- after all, keywords can and will change between
major versions and even if your pg_dump quotes all identifiers, anything
else using the database (eg: applications) would need to as well.
The application already quotes all column names :-) It's using a generic
framework which does not (and must not) rely on column names being
non-keywords.
If you're using pg_dump to upgrade, use the pg_dump from the version
you're upgrading *to*, and do so in a test environment first to make
sure that the restore works correctly, that the applications and other
DB users are happy with the new version, etc, etc, before even thinking
about upgrading a production system.
This is correct -- in theory. In practice there are many average system
administrators which need an easy upgrade path. You may call this
unprofessional, but this is reality.
To put it on the point: Is postgres meant for average administrators or
for elite database admins? In the first case, developers should think
about how to make work easier for the average ones.
--
Schönen Gruß - Regards
Hartmut Goebel
Dipl.-Informatiker (univ.), CISSP, CSSLP
Goebel Consult
Spezialist für IT-Sicherheit in komplexen Umgebungen
http://www.goebel-consult.de
Monatliche Kolumne: http://www.cissp-gefluester.de/
Goebel Consult mit Mitglied bei http://www.7-it.de
Attachments:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
The application already quotes all column names :-) It's using a
generic framework which does not (and must not) rely on column
names being non-keywords.
Same here. I suspect that this is much more commonn than many
PostgreSQL developers realize; and I think it makes a reasonable
case for at least an *option* to quote all identifiers emitted by
pg_dump.
-Kevin
Kevin Grittner wrote:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
The application already quotes all column names :-) It's using a
generic framework which does not (and must not) rely on column
names being non-keywords.Same here. I suspect that this is much more commonn than many
PostgreSQL developers realize; and I think it makes a reasonable
case for at least an *option* to quote all identifiers emitted by
pg_dump.
Even if we quote them in the dump, I assume applications would need to
quote them too, which I doubt many do.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +
On Fri, Jun 04, 2010 at 02:59:48PM -0400, Bruce Momjian wrote:
Kevin Grittner wrote:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
The application already quotes all column names :-) It's using a
generic framework which does not (and must not) rely on column
names being non-keywords.Same here. I suspect that this is much more commonn than many
PostgreSQL developers realize; and I think it makes a reasonable
case for at least an *option* to quote all identifiers emitted by
pg_dump.Even if we quote them in the dump, I assume applications would need
to quote them too, which I doubt many do.
It seems like something that's doable by pg_dump as a "default off"
option. TODO for 9.1?
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
David Fetter <david@fetter.org> wrote:
It seems like something that's doable by pg_dump as a "default
off" option. TODO for 9.1?
Sounds good to me.
-Kevin
David Fetter wrote:
On Fri, Jun 04, 2010 at 02:59:48PM -0400, Bruce Momjian wrote:
Kevin Grittner wrote:
Hartmut Goebel <h.goebel@goebel-consult.de> wrote:
The application already quotes all column names :-) It's using a
generic framework which does not (and must not) rely on column
names being non-keywords.Same here. I suspect that this is much more commonn than many
PostgreSQL developers realize; and I think it makes a reasonable
case for at least an *option* to quote all identifiers emitted by
pg_dump.Even if we quote them in the dump, I assume applications would need
to quote them too, which I doubt many do.It seems like something that's doable by pg_dump as a "default off"
option. TODO for 9.1?
This is the bug report that prompted this thread:
http://archives.postgresql.org/pgsql-bugs/2010-06/msg00018.php
I bigger question is why wouldn't we backpatch WINDOW as quoted in
pg_dump when we release back-branches? That would make the bug go away,
rather than require users to use a special flag (and find out only after
they were doing the reload).
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ None of us is going to be here forever. +