pg_upgrade - add config directory setting

Started by Steve Crawford, over 14 years ago, 23 messages, hackers
#1Steve Crawford
scrawford@pinpointresearch.com

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

The reason for this suggestion is that packages for Ubuntu (and I
suppose Debian and possibly others) place the config files in a
different directory than the data files.

The Ubuntu packaging, for example, puts all the configuration files in
/etc/postgresql/VERSION/main/.

If I set the data-directories to /var/lib/postgresql/VERSION/main then
pg_upgrade complains about missing config files.

If I set the data directories to /etc/postgresql/VERSION/main/ then
pg_upgrade complains that the "base" subdirectory is missing.

Temporarily symlinking postgresql.conf and pg_hba.conf from the config
directory to the data directory allowed the upgrade to run successfully
but is a bit more kludgey and non-obvious.

Cheers,
Steve
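
The temporary-symlink workaround described above could be scripted roughly as follows. This is only a sketch and not something posted in the thread: the function names link_conf_files and unlink_conf_files are hypothetical, and both directories are passed as arguments rather than assumed.

```shell
# Sketch of the temporary-symlink workaround (function names are hypothetical).
# Intended order: link_conf_files CONFDIR DATADIR, run pg_upgrade,
# then unlink_conf_files DATADIR.

link_conf_files() {
    confdir=$1
    datadir=$2
    for f in postgresql.conf pg_hba.conf; do
        # Refuse to clobber a real file that already lives in the data directory.
        if [ -e "$datadir/$f" ] && [ ! -L "$datadir/$f" ]; then
            echo "refusing to replace existing $datadir/$f" >&2
            return 1
        fi
        ln -sf "$confdir/$f" "$datadir/$f"
    done
}

unlink_conf_files() {
    datadir=$1
    for f in postgresql.conf pg_hba.conf; do
        # Only remove entries that are symlinks, never regular files.
        if [ -L "$datadir/$f" ]; then
            rm -f "$datadir/$f"
        fi
    done
}
```

Keeping the cleanup in its own function makes it easy to run after a failed upgrade as well, so no stray links are left in the data directory.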

#2Aaron W. Swenson
titanofold@gentoo.org
In reply to: Steve Crawford (#1)
Re: pg_upgrade - add config directory setting

On Tue, Sep 27, 2011 at 04:13:41PM -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

The reason for this suggestion is that packages for Ubuntu (and I
suppose Debian and possibly others) place the config files in a
different directory than the data files.

The Ubuntu packaging, for example, puts all the configuration files in
/etc/postgresql/VERSION/main/.

If I set the data-directories to /var/lib/postgresql/VERSION/main then
pg_upgrade complains about missing config files.

If I set the data directories to /etc/postgresql/VERSION/main/ then
pg_upgrade complains that the "base" subdirectory is missing.

Temporarily symlinking postgresql.conf and pg_hba.conf from the config
directory to the data directory allowed the upgrade to run successfully
but is a bit more kludgey and non-obvious.

Cheers,
Steve

I was just about to submit this suggestion. We do the same on Gentoo, as a
default anyway. (Users can pick their own locations for the configuration files
and data directories.) It would simplify the upgrade process by eliminating two
to four steps. (Symlink/copy configuration files in /etc/postgresql-${SLOT}
to /var/lib/postgresql-${SLOT}, same to $version++, pg_upgrade, remove symlinks.)

--
Mr. Aaron W. Swenson
Pseudonym: TitanOfOld
Gentoo Developer

#3Peter Eisentraut
peter_e@gmx.net
In reply to: Steve Crawford (#1)
Re: pg_upgrade - add config directory setting

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Eisentraut (#3)
Re: pg_upgrade - add config directory setting

Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 2011:

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

So we need a postmaster switch:

postmaster --parse-config-and-report=data_directory

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#5Steve Crawford
scrawford@pinpointresearch.com
In reply to: Peter Eisentraut (#3)
Re: pg_upgrade - add config directory setting

On 09/28/2011 12:49 AM, Peter Eisentraut wrote:

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

That would make sense to me (I actually tried setting the datadirs based
on that assumption). It would require adding that feature to pg_upgrade
and tweaking the docs for --XXX-datadir but would not require any new
parameters.

Cheers,
Steve

#6Peter Eisentraut
peter_e@gmx.net
In reply to: Alvaro Herrera (#4)
Re: pg_upgrade - add config directory setting

On ons, 2011-09-28 at 11:53 -0300, Alvaro Herrera wrote:

Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 2011:

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

So we need a postmaster switch:

postmaster --parse-config-and-report=data_directory

Perhaps. That might have some use for pg_ctl as well.

#7Bruce Momjian
bruce@momjian.us
In reply to: Peter Eisentraut (#6)
Re: pg_upgrade - add config directory setting

Peter Eisentraut wrote:

On ons, 2011-09-28 at 11:53 -0300, Alvaro Herrera wrote:

Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 2011:

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

So we need a postmaster switch:

postmaster --parse-config-and-report=data_directory

Perhaps. That might have some use for pg_ctl as well.

FYI, unless this is backpatched, which I doubt, it is only going to be
available for the _new_ cluster.

You are right that while pg_upgrade doesn't care about the location of
postgresql.conf and pg_hba.conf, we have to point to those to start the
server, and pg_upgrade does need to access some data files, so it also
needs to know about the data location.

I am unclear how to proceed with this, particularly with the backpatch
requirement.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#8Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#7)
Re: pg_upgrade - add config directory setting

Bruce Momjian wrote:

Peter Eisentraut wrote:

On ons, 2011-09-28 at 11:53 -0300, Alvaro Herrera wrote:

Excerpts from Peter Eisentraut's message of mié sep 28 04:49:43 -0300 2011:

On tis, 2011-09-27 at 16:13 -0700, Steve Crawford wrote:

It would perhaps be useful to add optional --old-confdir and
--new-confdir parameters to pg_upgrade. If these parameters are absent
then pg_upgrade would work as it does now and assume that the config
files are in the datadir.

It should work the same way the postmaster itself works: If the given
directory is not a data directory, look for the postgresql.conf file and
look there for the location of the data directory.

So we need a postmaster switch:

postmaster --parse-config-and-report=data_directory

Perhaps. That might have some use for pg_ctl as well.

FYI, unless this is backpatched, which I doubt, it is only going to be
available for the _new_ cluster.

You are right that while pg_upgrade doesn't care about the location of
postgresql.conf and pg_hba.conf, we have to point to those to start the
server, and pg_upgrade does need to access some data files, so it also
needs to know about the data location.

I am unclear how to proceed with this, particularly with the backpatch
requirement.

Thinking some more, I don't need to know the data directory while the
server is down --- I already am starting it. pg_upgrade starts both old
and new servers during its check phase, and it could look up the
data_directory GUC variable and use that value when accessing files.
That would work for old and new servers. However, I assume that is
something we would not backpatch to 9.1.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Bruce Momjian (#8)
Re: pg_upgrade - add config directory setting

Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:

Thinking some more, I don't need to know the data directory while the
server is down --- I already am starting it. pg_upgrade starts both old
and new servers during its check phase, and it could look up the
data_directory GUC variable and use that value when accessing files.
That would work for old and new servers. However, I assume that is
something we would not backpatch to 9.1.

Why not? We've supported separate data/config dirs for a long time now,
so it seems to me that pg_upgrade not coping with them is a bug. If
pg_upgrade starts postmaster, it seems simple to grab the data_directory
setting, is it not?

--
Álvaro Herrera <alvherre@commandprompt.com>
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#10Aaron W. Swenson
titanofold@gentoo.org
In reply to: Alvaro Herrera (#9)
Re: pg_upgrade - add config directory setting

On Thu, Sep 29, 2011 at 10:44:29AM -0300, Alvaro Herrera wrote:

Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:

Thinking some more, I don't need to know the data directory while the
server is down --- I already am starting it. pg_upgrade starts both old
and new servers during its check phase, and it could look up the
data_directory GUC variable and use that value when accessing files.
That would work for old and new servers. However, I assume that is
something we would not backpatch to 9.1.

Why not? We've supported separate data/config dirs for a long time now,
so it seems to me that pg_upgrade not coping with them is a bug. If
pg_upgrade starts postmaster, it seems simple to grab the data_directory
setting, is it not?

I was going to say. I'd view this as bringing the behavior of pg_upgrade
to a consistent state with postgres. I vote for it being backpatched to
9.0 even. For whatever my vote is worth.

--
Mr. Aaron W. Swenson
Pseudonym: TitanOfOld
Gentoo Developer

#11Bruce Momjian
bruce@momjian.us
In reply to: Aaron W. Swenson (#10)
Re: pg_upgrade - add config directory setting

Mr. Aaron W. Swenson wrote:

On Thu, Sep 29, 2011 at 10:44:29AM -0300, Alvaro Herrera wrote:

Excerpts from Bruce Momjian's message of jue sep 29 09:56:09 -0300 2011:

Thinking some more, I don't need to know the data directory while the
server is down --- I already am starting it. pg_upgrade starts both old
and new servers during its check phase, and it could look up the
data_directory GUC variable and use that value when accessing files.
That would work for old and new servers. However, I assume that is
something we would not backpatch to 9.1.

Why not? We've supported separate data/config dirs for a long time now,
so it seems to me that pg_upgrade not coping with them is a bug. If
pg_upgrade starts postmaster, it seems simple to grab the data_directory
setting, is it not?

I was going to say. I'd view this as bringing the behavior of pg_upgrade
to a consistent state with postgres. I vote for it being backpatched to
9.0 even. For whatever my vote is worth.

Well, I would call it more of a limitation of pg_upgrade, rather than a
bug --- perhaps documenting the limitation in the back branches is
sufficient. I think the only big argument for backpatching the feature
is that 9.1 is the first release in which packagers are going to use
pg_upgrade, and fixing it now makes sense because it saves packagers
from implementing a workaround on every platform that will go away with
9.2.

So, there are four options:

1 document the limitation and require users to use symlinks
2 add a --old/new-configdir parameter to pg_upgrade
3 have pg_upgrade find the real data dir by starting the server
4 add a flag to some tool to return the real data dir, and backpatch
that

#4 has the problem of backpatching. I looked at #3 and the first thing
pg_upgrade does is to read PG_VERSION, and the second thing is to call
pg_controldata, both of which need the real data dir, so it is going to
require some major code ordering adjustments to do #3.

One interesting trick would be to start the old and new servers to pull
the data dir at the start only if we don't see PG_VERSION in the
specified data directory --- that would limit the overhead of starting
the servers, and limit the risk for backpatching.
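
The check behind this trick can be sketched in a few lines of shell. The helper name classify_dir and its three output words are assumptions made for illustration; pg_upgrade itself is written in C and this is not its code.

```shell
# Decide whether a directory given as the old or new datadir needs the extra
# "start the server and ask" step. Hypothetical helper, not pg_upgrade code.
classify_dir() {
    dir=$1
    if [ -f "$dir/PG_VERSION" ]; then
        echo "data"          # real data directory: use it directly
    elif [ -f "$dir/postgresql.conf" ]; then
        echo "config-only"   # configuration lives here, data is elsewhere
    else
        echo "unknown"       # neither marker file is present
    fi
}
```

Only the "config-only" case would pay the cost of starting a server early, which is what keeps the change small for installations that already work.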

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#12Steve Crawford
scrawford@pinpointresearch.com
In reply to: Bruce Momjian (#11)
Re: pg_upgrade - add config directory setting

On 09/29/2011 08:20 AM, Bruce Momjian wrote:

...
1 document the limitation and require users to use symlinks
2 add a --old/new-configdir parameter to pg_upgrade
3 have pg_upgrade find the real data dir by starting the server
4 add a flag to some tool to return the real data dir, and backpatch
that

5. (really 3a). Have pg_upgrade itself check the specified --XXX-datadir
for postgresql.conf and use the data_directory setting therein using the
same rules as followed by the server.

This would mean that there are no new options to pg_upgrade and that
pg_upgrade operation would not change when postgresql.conf is in the
data-directory. This would also make it consistent with PostgreSQL's
notion of file-locations:

"If you wish to keep the configuration files elsewhere than the data
directory, the postgres -D command-line option or PGDATA environment
variable must point to the directory containing the configuration files,
and the data_directory parameter must be set in postgresql.conf..."

So for backporting, it could just be considered a "bug fix" that aligns
pg_upgrade's interpretation of datadir to that of the server.

Cheers,
Steve

#13Bruce Momjian
bruce@momjian.us
In reply to: Steve Crawford (#12)
Re: pg_upgrade - add config directory setting

Steve Crawford wrote:

On 09/29/2011 08:20 AM, Bruce Momjian wrote:

...
1 document the limitation and require users to use symlinks
2 add a --old/new-configdir parameter to pg_upgrade
3 have pg_upgrade find the real data dir by starting the server
4 add a flag to some tool to return the real data dir, and backpatch
that

5. (really 3a). Have pg_upgrade itself check the specified --XXX-datadir
for postgresql.conf and use the data_directory setting therein using the
same rules as followed by the server.

This would mean that there are no new options to pg_upgrade and that
pg_upgrade operation would not change when postgresql.conf is in the
data-directory. This would also make it consistent with PostgreSQL's
notion of file-locations:

"If you wish to keep the configuration files elsewhere than the data
directory, the postgres -D command-line option or PGDATA environment
variable must point to the directory containing the configuration files,
and the data_directory parameter must be set in postgresql.conf..."

So for backporting, it could just be considered a "bug fix" that aligns
pg_upgrade's interpretation of datadir to that of the server.

pg_upgrade is not about to start reading through postgresql.conf looking
for a definition for data_directory --- there are too many cases where
this could go wrong. It would need a full postgresql.conf parser.
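
To make the objection concrete, here is a small sketch of how a naive lookup can go wrong. The sample file contents, paths, and function names are invented for illustration. The sketch relies only on documented postgresql.conf behaviour: lines can be commented out with #, other files can be pulled in with include, the equal sign is optional, and the last setting of a parameter wins.

```shell
# Write a small sample config tree into directory $1 (contents are made up).
make_sample_confdir() {
    cat > "$1/postgresql.conf" <<'EOF'
#data_directory = 'ConfigDir'     # commented-out default
include 'local.conf'
EOF
    cat > "$1/local.conf" <<'EOF'
data_directory = '/srv/pg/old'
data_directory '/srv/pg/main'     # "=" is optional; the last setting wins
EOF
}

# Deliberately naive lookup: first matching line, main file only.
naive_data_directory() {
    grep 'data_directory' "$1/postgresql.conf" | head -n 1
}
```

With this sample the naive lookup returns the commented-out line, while the server would follow the include and end up with /srv/pg/main. The parameter can also be given on the server command line, which no file scan would see.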

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#13)
Re: pg_upgrade - add config directory setting

Bruce Momjian <bruce@momjian.us> writes:

pg_upgrade is not about to start reading through postgresql.conf looking
for a definition for data_directory --- there are too many cases where
this could go wrong. It would need a full postgresql.conf parser.

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values. We've had requests for that
type of functionality before, IIRC. The --describe-config option does
something related, but not what's needed here.

regards, tom lane

#15Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#14)
Re: pg_upgrade - add config directory setting

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

pg_upgrade is not about to start reading through postgresql.conf looking
for a definition for data_directory --- there are too many cases where
this could go wrong. It would need a full postgresql.conf parser.

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values. We've had requests for that
type of functionality before, IIRC. The --describe-config option does
something related, but not what's needed here.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres. We could minimize that
by using this feature only if postgresql.conf exists in the specified
data directory but PG_VERSION does not.

Adding this feature is similar to this TODO item:

Allow configuration files to be independently validated

This still seems like a lot to backpatch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#15)
Re: pg_upgrade - add config directory setting

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres.

I would just say "no" to people who expect this to work against older
versions of Postgres. I think it's sufficient if we get this into HEAD
so that it will work in the future.

regards, tom lane

#17Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#16)
Re: pg_upgrade - add config directory setting

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres.

I would just say "no" to people who expect this to work against older
versions of Postgres. I think it's sufficient if we get this into HEAD
so that it will work in the future.

Well, it is going to work in the future only when the _old_ version is
9.2+. Specifically, pg_upgrade using the flag could be patched to just
9.2, but the flag has to be supported on old and new backends for that
to work.
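
The version constraint can be written down as a per-cluster rule: a config-only directory can be resolved only by server binaries that have the new flag, which the thread expects to be 9.2 and later. The sketch below assumes "major.minor" version strings and a kind of "data" or "config-only"; both function names are hypothetical.

```shell
# True if a "major.minor[.patch]" version string is 9.2 or later.
at_least_9_2() {
    major=${1%%.*}
    rest=${1#*.}
    minor=${rest%%.*}
    [ "$major" -gt 9 ] || { [ "$major" -eq 9 ] && [ "$minor" -ge 2 ]; }
}

# cluster_dir_usable VERSION KIND
# A plain data directory always works; a config-only directory needs the flag.
cluster_dir_usable() {
    version=$1
    kind=$2
    case $kind in
        data) return 0 ;;
        config-only) at_least_9_2 "$version" ;;
        *) return 1 ;;
    esac
}
```

For example, cluster_dir_usable 9.1 config-only fails, which is the case where the symlink workaround is still needed.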

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#18Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#17)
Re: pg_upgrade - add config directory setting

Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres.

I would just say "no" to people who expect this to work against older
versions of Postgres. I think it's sufficient if we get this into HEAD
so that it will work in the future.

Well, it is going to work in the future only when the _old_ version is
9.2+. Specifically, pg_upgrade using the flag could be patched to just
9.2, but the flag has to be supported on old and new backends for that
to work.

OK, I started working on #3, which was to start the servers to find the
data_directory setting, and developed the attached patch which mostly
does this. However, I have found serious problems with pg_ctl -w/wait
mode and config-only directories (which pg_upgrade uses), and will start
a new thread to address this issue and then continue with this once that
is resolved.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/pg_upgrade (text/x-diff, +75 -7)
#19Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#18)
Re: pg_upgrade - add config directory setting

Bruce Momjian wrote:

Bruce Momjian wrote:

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

Yeah. I think the only sensible way to do this would be to provide an
operating mode for the postgres executable that would just parse the
config file and spit out requested values.

That would certainly solve the problem, though it would have to be
backpatched all the way back to 8.4, and it would require pg_upgrade
users to be on newer minor versions of Postgres.

I would just say "no" to people who expect this to work against older
versions of Postgres. I think it's sufficient if we get this into HEAD
so that it will work in the future.

Well, it is going to work in the future only when the _old_ version is
9.2+. Specifically, pg_upgrade using the flag could be patched to just
9.2, but the flag has to be supported on old and new backends for that
to work.

OK, I started working on #3, which was to start the servers to find the
data_directory setting, and developed the attached patch which mostly
does this. However, I have found serious problems with pg_ctl -w/wait
mode and config-only directories (which pg_upgrade uses), and will start
a new thread to address this issue and then continue with this once that
is resolved.

OK, I have modified the postmaster in PG 9.2 to allow output of the data
directory, and modified pg_ctl to use that, so starting in PG 9.2 pg_ctl
will work cleanly for config-only directories.

I will now work on pg_upgrade to also use the new flag to find the data
directory from a config-only install. However, this is only available
in PG 9.2, and it will only be in PG 9.3 that you can hope to use this
feature (if old is PG 9.2 or later). I am afraid the symlink hack will
have to be used for several more years, and if you are supporting
upgrades from pre-9.2, perhaps forever.

I did find that it is possible to use pg_ctl -w start on a config-only
install using this trick:

su -l postgres \
    -c "env PGPORT=\"5432\" /usr/lib/postgresql-9.1/bin/pg_ctl start -w \
        -t 60 -s -D /var/lib/postgresql/9.1/data/ \
        -o '-D /etc/postgresql-9.1/ \
            --data-directory=/var/lib/postgresql/9.1/data/ \
            --silent-mode=true'"

Unfortunately pg_upgrade doesn't support the -o option which would make
this possible for pg_upgrade.

One idea would be to add -o/-O options to pg_upgrade 9.2 to allow this
to work even with old installs, but frankly, this is so confusing I am
not sure we want to encourage people to do things like this. Of course,
the symlink hack is even worse, so maybe there is some merit to this.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#20Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#19)
Re: pg_upgrade - add config directory setting

Bruce Momjian wrote:

I will now work on pg_upgrade to also use the new flag to find the data
directory from a config-only install. However, this is only available
in PG 9.2, and it will only be in PG 9.3 that you can hope to use this
feature (if old is PG 9.2 or later). I am afraid the symlink hack will
have to be used for several more years, and if you are supporting
upgrades from pre-9.2, perhaps forever.

The attached patch uses "postmaster -C data_directory" to allow
config-only upgrades. It will allow a normal 9.1 cluster to be upgraded
to a 9.2 config-only cluster.
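
From the shell, the same lookup might be sketched as below. The -C option is the one named in this message and requires a 9.2 or later server binary. The function name resolve_data_directory and the POSTGRES_BIN override are assumptions added here so that no installation path is hard-coded.

```shell
# Ask the server binary which data directory a config directory points at.
# POSTGRES_BIN is a hypothetical override; by default "postgres" is taken
# from PATH. Needs a binary that supports -C (9.2 or later per the thread).
resolve_data_directory() {
    confdir=$1
    "${POSTGRES_BIN:-postgres}" -C data_directory -D "$confdir"
}
```

A caller would capture the output with command substitution and use that path wherever real data files such as PG_VERSION must be read.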

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

Attachments:

/rtmp/pg_upgrade (text/x-diff, +75 -10)
#21Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#20)
#22Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#19)
#23Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#22)