pg_upgrade --clone error checking

Started by Jeff Janesalmost 7 years ago7 messageshackers
Jump to latest
#1Jeff Janes
jeff.janes@gmail.com

With the new pg_upgrade --clone, if we are going to end up throwing the
error "file cloning not supported on this platform" (which seems to depend
only on ifdefs) I think we should throw it first thing, before any other
checks are done and certainly before pg_dump gets run.

This might result in some small amount of code duplication, but I think it
would be worth the cost.

For cases where we might throw "could not clone file between old and new
data directories", I wonder if we shouldn't do some kind of dummy copy to
catch that error earlier, as well. Maybe that one is not worth it.

Cheers,

Jeff

#2Peter Eisentraut
peter_e@gmx.net
In reply to: Jeff Janes (#1)
Re: pg_upgrade --clone error checking

On 2019-05-01 22:10, Jeff Janes wrote:

With the new pg_upgrade --clone, if we are going to end up throwing the
error "file cloning not supported on this platform" (which seems to
depend only on ifdefs) I think we should throw it first thing, before
any other checks are done and certainly before pg_dump gets run.

Could you explain in more detail what command you are running, what
messages you are getting, and what you would like to see instead?

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Jeff Janes
jeff.janes@gmail.com
In reply to: Peter Eisentraut (#2)
Re: pg_upgrade --clone error checking

On Thu, May 2, 2019 at 11:57 AM Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 2019-05-01 22:10, Jeff Janes wrote:

With the new pg_upgrade --clone, if we are going to end up throwing the
error "file cloning not supported on this platform" (which seems to
depend only on ifdefs) I think we should throw it first thing, before
any other checks are done and certainly before pg_dump gets run.

Could you explain in more detail what command you are running, what
messages you are getting, and what you would like to see instead?

I'm running:

pg_upgrade --clone -b /home/jjanes/pgsql/REL9_6_12/bin/ -B
/home/jjanes/pgsql/origin_jit/bin/ -d /home/jjanes/pgsql/data_96/ -D
/home/jjanes/pgsql/data_clone/

And I get:

Performing Consistency Checks
-----------------------------
Checking cluster versions ok
Checking database user is the install user ok
Checking database connection settings ok
Checking for prepared transactions ok
Checking for reg* data types in user tables ok
Checking for contrib/isn with bigint-passing mismatch ok
Checking for tables WITH OIDS ok
Checking for invalid "unknown" user columns ok
Creating dump of global objects ok
Creating dump of database schemas
ok
Checking for presence of required libraries ok

file cloning not supported on this platform
Failure, exiting

I think the error message wording is OK, I think it should be thrown
earlier, before the "Creating dump of database schemas" (which can take a
long time), and preferably before either database is even started. So
ideally it would be something like:

Performing Consistency Checks
-----------------------------
Checking cluster versions
Checking file cloning support
File cloning not supported on this platform
Failure, exiting

When something is doomed to fail, we should report the failure as early as
feasibly detectable.

Cheers,

Jeff

#4Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jeff Janes (#3)
Re: pg_upgrade --clone error checking

On 2019-May-02, Jeff Janes wrote:

I think the error message wording is OK, I think it should be thrown
earlier, before the "Creating dump of database schemas" (which can take a
long time), and preferably before either database is even started. So
ideally it would be something like:

Performing Consistency Checks
-----------------------------
Checking cluster versions
Checking file cloning support
File cloning not supported on this platform
Failure, exiting

When something is doomed to fail, we should report the failure as early as
feasibly detectable.

I agree -- this check should be done before checking the database
contents. Maybe even before "Checking cluster versions".

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#5Jeff Janes
jeff.janes@gmail.com
In reply to: Alvaro Herrera (#4)
Re: pg_upgrade --clone error checking

On Thu, May 2, 2019 at 12:28 PM Alvaro Herrera <alvherre@2ndquadrant.com>
wrote:

On 2019-May-02, Jeff Janes wrote:

When something is doomed to fail, we should report the failure as early

as

feasibly detectable.

I agree -- this check should be done before checking the database
contents. Maybe even before "Checking cluster versions".

It looks like it was designed for early checking, it just wasn't placed
early enough. So changing it is pretty easy, as check_file_clone does not
need to be invented, and there is no additional code duplication over what
was already there.

This patch moves the checking to near the beginning.

It carries the --link mode checking along with it. That should be done as
well, and doing it as a separate patch would make both patches uglier.

Cheers,

Jeff

Attachments:

pg_upgrade_filesystem.patchapplication/octet-stream; name=pg_upgrade_filesystem.patchDownload+22-12
#6Peter Eisentraut
peter_e@gmx.net
In reply to: Jeff Janes (#5)
Re: pg_upgrade --clone error checking

On 2019-05-02 20:03, Jeff Janes wrote:

It looks like it was designed for early checking, it just wasn't placed
early enough.  So changing it is pretty easy, as check_file_clone does
not need to be invented, and there is no additional code duplication
over what was already there.

This patch moves the checking to near the beginning.

I think the reason it was ordered that way is that it wants to do all
the checks of the old cluster before doing any checks touching the new
cluster.

--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#7Jeff Janes
jeff.janes@gmail.com
In reply to: Peter Eisentraut (#6)
Re: pg_upgrade --clone error checking

On Fri, May 3, 2019 at 3:53 AM Peter Eisentraut <
peter.eisentraut@2ndquadrant.com> wrote:

On 2019-05-02 20:03, Jeff Janes wrote:

It looks like it was designed for early checking, it just wasn't placed
early enough. So changing it is pretty easy, as check_file_clone does
not need to be invented, and there is no additional code duplication
over what was already there.

This patch moves the checking to near the beginning.

I think the reason it was ordered that way is that it wants to do all
the checks of the old cluster before doing any checks touching the new
cluster.

But is there a reason to want to do that? I understand we don't want to
keep starting and stopping the clusters needlessly, so we should do
everything we can in one before moving to the other. But for checks that
don't need a running cluster, why would it matter? The existence and
contents of PG_VERSION of the new cluster directory is already checked at
the very beginning (and even tries to start it up and shuts it down again
if a pid file also exists), so there is precedence for touching the new
cluster directory at the filesystem level early (albeit in a readonly
manner) and if a pid file exists then doing even more than that. I didn't
move check_file_clone to before the liveness check is done, out of a
abundance of caution. But creating a transient file with a name of no
significance ("PG_VERSION.clonetest") in a cluster that is not even running
seems like a very low risk thing to do. The pay off is that we get an
inevitable error message much sooner.

Cheers,

Jeff