pg_upgrade if 'postgres' database is dropped
pg_upgrade doesn't work if the 'postgres' database has been dropped in
the old cluster:
~/pgsql.master$ bin/pg_upgrade -b ~/pgsql.91stable/bin -B bin/ -d
~/pgsql.91stable/data -D data-upgraded/
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok
Creating catalog dump ok
Checking for prepared transactions ok
New cluster database "postgres" does not exist in the old cluster
Failure, exiting
That's a bit unfortunate. We have some other tools that don't work
without the 'postgres' database, like 'reindexdb -all', but it would
still be nice if pg_upgrade did.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
On Tue, Oct 4, 2011 at 12:11 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
pg_upgrade doesn't work if the 'postgres' database has been dropped in the
old cluster:~/pgsql.master$ bin/pg_upgrade -b ~/pgsql.91stable/bin -B bin/ -d
~/pgsql.91stable/data -D data-upgraded/
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories ok
Checking cluster versions ok
Checking database user is a superuser ok
Checking for prepared transactions ok
Checking for reg* system OID user data types ok
Checking for contrib/isn with bigint-passing mismatch ok
Creating catalog dump ok
Checking for prepared transactions okNew cluster database "postgres" does not exist in the old cluster
Failure, exitingThat's a bit unfortunate. We have some other tools that don't work without
the 'postgres' database, like 'reindexdb -all', but it would still be nice
if pg_upgrade did.
+1. I think our usual policy is to try postgres first and then try
template1, so it would seem logical for pg_upgrade to do the same.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
On Tue, Oct 4, 2011 at 12:11 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:pg_upgrade doesn't work if the 'postgres' database has been dropped in the
old cluster:~/pgsql.master$ bin/pg_upgrade -b ~/pgsql.91stable/bin -B bin/ -d
~/pgsql.91stable/data -D data-upgraded/
Performing Consistency Checks
-----------------------------
Checking current, bin, and data directories ? ? ? ? ? ? ? ? ok
Checking cluster versions ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ok
Checking database user is a superuser ? ? ? ? ? ? ? ? ? ? ? ok
Checking for prepared transactions ? ? ? ? ? ? ? ? ? ? ? ? ?ok
Checking for reg* system OID user data types ? ? ? ? ? ? ? ?ok
Checking for contrib/isn with bigint-passing mismatch ? ? ? ok
Creating catalog dump ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ok
Checking for prepared transactions ? ? ? ? ? ? ? ? ? ? ? ? ?okNew cluster database "postgres" does not exist in the old cluster
Failure, exitingThat's a bit unfortunate. We have some other tools that don't work without
the 'postgres' database, like 'reindexdb -all', but it would still be nice
if pg_upgrade did.+1. I think our usual policy is to try postgres first and then try
template1, so it would seem logical for pg_upgrade to do the same.
Well, it is a little tricky. The problem is that I am not just
connecting to a database --- I am creating support functions in the
database. Now, this is complex because template1 is the template for
new databases, except for pg_dump which uses template0.
So, it is going to be confusing to support both databases because there
is the cleanup details I have to document if I use template1.
Also, pg_dumpall unconditionally connects to the postgres database to
restore roles:
fprintf(OPF, "\\connect postgres\n\n");
We could connect to template1 for this, but I am not comfortable
changing this. And when pg_dumpall throws an error for a missing
postgres database, pg_upgrade is going to fail.
We started using the postgres database as a database for connections ---
do we want to change that? We certainly can't have the pg_dumpall
output _conditionally_ connecting to template1.
I am feeling this isn't worth pursuing.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
* Bruce Momjian (bruce@momjian.us) wrote:
So, it is going to be confusing to support both databases because there
is the cleanup details I have to document if I use template1.
Presumably there's some other database in the system besides template1
if postgres doesn't exist.. Would it be possible to just make it
configurable? Then the user could pick a non-template database. Having
it fail if the option isn't used and the default postgres isn't there is
fine, imv.
Thanks,
Stephen
Stephen Frost wrote:
-- Start of PGP signed section.
* Bruce Momjian (bruce@momjian.us) wrote:
So, it is going to be confusing to support both databases because there
is the cleanup details I have to document if I use template1.Presumably there's some other database in the system besides template1
if postgres doesn't exist.. Would it be possible to just make it
configurable? Then the user could pick a non-template database. Having
it fail if the option isn't used and the default postgres isn't there is
fine, imv.
I have not seen enough demand to make this a user-visible configuration.
We can just tell them to create a postgres database. Frankly, they
would have had to _remove_ the postgres database after initdb for it not
to be there, and they are instructed to change nothing about the new
database.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
* Bruce Momjian (bruce@momjian.us) wrote:
I have not seen enough demand to make this a user-visible configuration.
We can just tell them to create a postgres database. Frankly, they
would have had to _remove_ the postgres database after initdb for it not
to be there, and they are instructed to change nothing about the new
database.
Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by default was
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those can
be configured to connect to other databases instead, including for
globals.
Thanks,
Stephen
Stephen Frost wrote:
-- Start of PGP signed section.
* Bruce Momjian (bruce@momjian.us) wrote:
I have not seen enough demand to make this a user-visible configuration.
We can just tell them to create a postgres database. Frankly, they
would have had to _remove_ the postgres database after initdb for it not
to be there, and they are instructed to change nothing about the new
database.Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by default was
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.
Well, you would have to remove it _after_ you did the pg_upgrade. Right
now if you do a normal dump/restore upgrade, you also have to re-remove
the postgres database. We don't have any mechanism to drop a database
as part of pg_dumpall's restore if it didn't exist in the old cluster.
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those can
be configured to connect to other databases instead, including for
globals.
Well, please show me the code, because the C code I showed you had the
'\connect postgres' string hardcoded in there.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
* Bruce Momjian (bruce@momjian.us) wrote:
Well, you would have to remove it _after_ you did the pg_upgrade. Right
now if you do a normal dump/restore upgrade, you also have to re-remove
the postgres database. We don't have any mechanism to drop a database
as part of pg_dumpall's restore if it didn't exist in the old cluster.
Perhaps not, but it *could* be removed after the restore and all would
be well, yes?
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those can
be configured to connect to other databases instead, including for
globals.Well, please show me the code, because the C code I showed you had the
'\connect postgres' string hardcoded in there.
I guess there's a difference between "can be used and will work
correctly, but might create some extra garbage" and "can't be used at
all". pg_dumpall has a -l option for connecting to whatever *existing*
database you have to pull the global data, and then it'll restore into a
clean initdb'd cluster, after which you could remove postgres.
Admittedly, if you initdb the cluster, drop postgres, and then try a
restore, it would fail. Personally, I'm not a big fan of that (why
don't we use what was passed in to -l for that..?), but, practically,
it's not that big a deal. I don't know many folks who are going to
restore a pg_dumpall dump into an existing configuration where they've
monkied with things (that could cause all kinds of other issues anyway,
role conflicts, etc).
If I understood correctly (perhaps I didn't..), is that pg_upgrade
doesn't have the pg_dumpall equivilant of the '-l' or '--database'
option, and that's what is at issue here.
Thanks,
Stephen
Stephen Frost wrote:
-- Start of PGP signed section.
* Bruce Momjian (bruce@momjian.us) wrote:
Well, you would have to remove it _after_ you did the pg_upgrade. Right
now if you do a normal dump/restore upgrade, you also have to re-remove
the postgres database. We don't have any mechanism to drop a database
as part of pg_dumpall's restore if it didn't exist in the old cluster.Perhaps not, but it *could* be removed after the restore and all would
be well, yes?
I am not sure how much pg_dump worries about removing system objects
during a restore --- I don't think it does that at all actually. I
thought we did that for plpgsql, but I don't see that in the C code now,
and testing doesn't show it being removed by pg_dump.
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those can
be configured to connect to other databases instead, including for
globals.Well, please show me the code, because the C code I showed you had the
'\connect postgres' string hardcoded in there.I guess there's a difference between "can be used and will work
correctly, but might create some extra garbage" and "can't be used at
all". pg_dumpall has a -l option for connecting to whatever *existing*
database you have to pull the global data, and then it'll restore into a
clean initdb'd cluster, after which you could remove postgres.
Keep in mind -l might connect to a specified database to do the dump,
but it will still connect to the 'postgres' database to recreate them.
Admittedly, if you initdb the cluster, drop postgres, and then try a
restore, it would fail. Personally, I'm not a big fan of that (why
Right, same with pg_upgrade.
don't we use what was passed in to -l for that..?), but, practically,
No idea.
it's not that big a deal. I don't know many folks who are going to
restore a pg_dumpall dump into an existing configuration where they've
monkied with things (that could cause all kinds of other issues anyway,
role conflicts, etc).If I understood correctly (perhaps I didn't..), is that pg_upgrade
doesn't have the pg_dumpall equivilant of the '-l' or '--database'
option, and that's what is at issue here.
Well, I can modify pg_upgrade to connect to template1 easily (no need
for a switch) --- the problem is that pg_dumpall requires the 'postgres'
database to restore its dump file.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes:
Stephen Frost wrote:
Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by default was
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.
Well, you would have to remove it _after_ you did the pg_upgrade.
As far as the *target* cluster is concerned, I have no sympathy for
someone who messes with its contents before running pg_upgrade.
That's an RTFM matter: you're supposed to upgrade into a virgin
just-initdb'd cluster.
However, it would be nice if pg_upgrade supported transferring from a
*source* cluster that didn't have the postgres DB.
What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?
regards, tom lane
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Stephen Frost wrote:
Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by default was
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.Well, you would have to remove it _after_ you did the pg_upgrade.
As far as the *target* cluster is concerned, I have no sympathy for
someone who messes with its contents before running pg_upgrade.
That's an RTFM matter: you're supposed to upgrade into a virgin
just-initdb'd cluster.However, it would be nice if pg_upgrade supported transferring from a
*source* cluster that didn't have the postgres DB.
I have this C comment in pg_upgrade about this:
* If someone removes the 'postgres' database from the old cluster and
* the new cluster has a 'postgres' database, the number of databases
* will not match. We actually could upgrade such a setup, but it would
* violate the 1-to-1 mapping of database counts, so we throw an error
* instead. We would detect this as a database count mismatch during
* upgrade, but we want to detect it during the check phase and report
* the database name.
Is this worth fixing? Another problem is that pg_dumpall doesn't remove
the postgres database in the new cluster if it was not in the old one,
so they are going to end up with a postgres database in the new cluster
anyway. I could argue that pg_upgrade is better because reports it
cannot recreate the new cluster exactly, while pg_dumpall just keeps a
postgres database that was not in the old cluster.
What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?
That is not a problem --- I can easily use template1.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Thu, Oct 27, 2011 at 11:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?That is not a problem --- I can easily use template1.
Huh?
You just said upthread that you didn't want to use template1 because
you didn't want to modify the template database. I think the point is
that if you're doing something to the database that someone might
object to, you oughtn't be doing it to the postgres database either.
You should create a database just for pg_upgrade's use and install its
crap in there.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas wrote:
On Thu, Oct 27, 2011 at 11:35 PM, Bruce Momjian <bruce@momjian.us> wrote:
What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?That is not a problem --- I can easily use template1.
Huh?
You just said upthread that you didn't want to use template1 because
you didn't want to modify the template database. I think the point is
I don't want to use postgres and then fall back to template1 if
necessary --- I would just use template1 always.
that if you're doing something to the database that someone might
object to, you oughtn't be doing it to the postgres database either.
You should create a database just for pg_upgrade's use and install its
crap in there.
It installs crap in all databases to set oids on system tables, for
example, so we are only creating it early in postgres (or template1) to
set auth_id. Our sticking point now is that pg_dumpall has the
'postgres' database hardcoded for role creation.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Bruce Momjian <bruce@momjian.us> writes:
Robert Haas wrote:
that if you're doing something to the database that someone might
object to, you oughtn't be doing it to the postgres database either.
You should create a database just for pg_upgrade's use and install its
crap in there.
It installs crap in all databases to set oids on system tables,
It seems like you're both confusing the source and target clusters.
None of that stuff gets installed in the source, does it?
regards, tom lane
On Oct 28, 2011 5:22 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
Stephen Frost wrote:
Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by default
was
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.Well, you would have to remove it _after_ you did the pg_upgrade.
As far as the *target* cluster is concerned, I have no sympathy for
someone who messes with its contents before running pg_upgrade.
That's an RTFM matter: you're supposed to upgrade into a virgin
just-initdb'd cluster.However, it would be nice if pg_upgrade supported transferring from a
*source* cluster that didn't have the postgres DB.What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?
How about naming this newly created database "postgres"? That would make the
code simple enough - always use the postgres database, just drop it at the
end if it didn't exist in the source cluster.
/Magnus
On Oct 28, 2011 5:19 AM, "Bruce Momjian" <bruce@momjian.us> wrote:
Stephen Frost wrote:
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those
can
be configured to connect to other databases instead, including for
globals.Well, please show me the code, because the C code I showed you had the
'\connect postgres' string hardcoded in there.I guess there's a difference between "can be used and will work
correctly, but might create some extra garbage" and "can't be used at
all". pg_dumpall has a -l option for connecting to whatever *existing*
database you have to pull the global data, and then it'll restore into a
clean initdb'd cluster, after which you could remove postgres.Keep in mind -l might connect to a specified database to do the dump,
but it will still connect to the 'postgres' database to recreate them.Admittedly, if you initdb the cluster, drop postgres, and then try a
restore, it would fail. Personally, I'm not a big fan of that (whyRight, same with pg_upgrade.
don't we use what was passed in to -l for that..?), but, practically,
No idea.
Chicken/egg? If we did that, the pg_dumpall dump could no longer be loaded
into an empty cluster since the db it wanted to talk to didn't exist yet.
And restoring into an empty cluster has to be the main use for pg_dumpall
after all....
/Magnus
Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Robert Haas wrote:
that if you're doing something to the database that someone might
object to, you oughtn't be doing it to the postgres database either.
You should create a database just for pg_upgrade's use and install its
crap in there.It installs crap in all databases to set oids on system tables,
It seems like you're both confusing the source and target clusters.
None of that stuff gets installed in the source, does it?
Right, only in the target.
Let me summarize:
The postgres database is required in the source because pg_upgrade likes
to have a 1-1 database mapping of old and new clusters. This can be
changed, but it makes pg_upgrade slightly more complex.
The postgres database is required on the target because pg_upgrade
creates the support functions first in that database. That can be
changed, but pg_dumpall restores roles in the postgres database by
default, not template1. Again, this can be changed.
Because of this 'postgres' new cluster requirement, you can't just
delete the postgres database from the new cluster and run pg_upgrade.
Tom wants pg_upgrade to work if the old cluster doesn't have a postgres
database. I see two solutions --- either remove the 1-1 mapping of
old/new databases, or remove the pg_upgrade and pg_dumpall dependence on
the postgres database and tell users to remove postgres from the new
cluster before the upgrade.
They already get a clear error message about the problem, which I think
is why we haven't seen more problem reports. My guess is they are just
creating the postgres database on the old cluster before the upgrade
after they get the error.
I have applied the attached patch to at least clarify that they need the
postgres database in the old cluster, rather than them trying to remove
the postgres database from the new cluster.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Attachments:
/rtmp/pg_upgradetext/x-diffDownload+9-9
Magnus Hagander wrote:
On Oct 28, 2011 5:22 AM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
Bruce Momjian <bruce@momjian.us> writes:
Stephen Frost wrote:
Yes, they would have removed it because they didn't want it. As I
recall, part of the agreement to create an extra database by defaultwas
that it could be removed if users didn't want it. Turning around and
then saying "but things won't work if it's not there" isn't exactly
supporting users who decide to remove it.Well, you would have to remove it _after_ you did the pg_upgrade.
As far as the *target* cluster is concerned, I have no sympathy for
someone who messes with its contents before running pg_upgrade.
That's an RTFM matter: you're supposed to upgrade into a virgin
just-initdb'd cluster.However, it would be nice if pg_upgrade supported transferring from a
*source* cluster that didn't have the postgres DB.What about creating a new, single-purpose database in the source
cluster and then removing it again after we're done?How about naming this newly created database "postgres"? That would make the
code simple enough - always use the postgres database, just drop it at the
end if it didn't exist in the source cluster.
Yes, that would work, but see my summarization email on this. Using
template1 is not a problem for pg_upgrade, it is the modifications to
pg_dumpall that are an issue.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
Magnus Hagander wrote:
On Oct 28, 2011 5:19 AM, "Bruce Momjian" <bruce@momjian.us> wrote:
Stephen Frost wrote:
Regarding pg_dumpall and pg_restore, I'm pretty sure both of those
can
be configured to connect to other databases instead, including for
globals.Well, please show me the code, because the C code I showed you had the
'\connect postgres' string hardcoded in there.I guess there's a difference between "can be used and will work
correctly, but might create some extra garbage" and "can't be used at
all". pg_dumpall has a -l option for connecting to whatever *existing*
database you have to pull the global data, and then it'll restore into a
clean initdb'd cluster, after which you could remove postgres.Keep in mind -l might connect to a specified database to do the dump,
but it will still connect to the 'postgres' database to recreate them.Admittedly, if you initdb the cluster, drop postgres, and then try a
restore, it would fail. Personally, I'm not a big fan of that (whyRight, same with pg_upgrade.
don't we use what was passed in to -l for that..?), but, practically,
No idea.
Chicken/egg? If we did that, the pg_dumpall dump could no longer be loaded
into an empty cluster since the db it wanted to talk to didn't exist yet.
And restoring into an empty cluster has to be the main use for pg_dumpall
after all....
True. My assumption was that they had created some special database
before they did the pg_dumpall restore, but it would be odd because the
database would have been hard-coded into the dump, which isn't good.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
On Fri, Oct 28, 2011 at 8:12 AM, Bruce Momjian <bruce@momjian.us> wrote:
Yes, that would work, but see my summarization email on this. Using
template1 is not a problem for pg_upgrade, it is the modifications to
pg_dumpall that are an issue.
I just did a bit of testing on this. It appears that pg_dumpall, if
given a cluster containing no postgres database, will happily try to
connect to template1 instead. If template1 isn't available either,
you can use "-l SOMEDBNAME" to specify the name of another database to
connect to instead. So there is infinite flexibility there.
But regardless of which database it uses to *generate* the dump, the
dump itself will *always* contain this, right at the very beginning:
\connect postgres
That line is in fact hard-coded as a literal string in pg_dumpall.c.
It seems like the easiest fix here might be to just remove that line
from the dump, because AFAICS it's completely pointless. During the
time for which that setting is in effect, we're just restoring
globals, so it shouldn't matter which database we're connected to;
only that we have a valid connection. So trying to switch the
connection from whatever the user is connected to currently to
postgres doesn't accomplish anything useful, but it does make it
possible for dump restoration to unnecessarily fail.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company