BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

Started by Nonameover 10 years ago10 messagesbugs
Jump to latest
#1Noname
manu@coopapotheken.be

The following bug has been logged on the website:

Bug reference: 13774
Logged by: Manu Joye
Email address: manu@coopapotheken.be
PostgreSQL version: 9.4.5
Operating system: Windows server 2008r2
Description:

running pg_upgrade yields no warnings when there is not enough disk space to
do the upgrade. I upgraded a server with about 10 databases following the
instructions as provided on the website here. Everything seemed to have gone
well except that some tables in various databases just vanished (pgadmin
complained about missing files) . Upon inspection I noticed the HD had only
a couple of KB of free space left. pg_upgrade seemed to have just restored
what it could and then exited without warning!

We caught it early on and could restore from backups but a check on required
free disk space before upgrade should be implemented or, if that doesn't
work, at least a warning from pg_upgrade that something went wrong.

Kind regards,

Manu

PS I LOVE postgresql, let that be clear. Thanks a whole lot to the whole
team for making this DBMS as awesome as it is.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Bruce Momjian
bruce@momjian.us
In reply to: Noname (#1)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Thu, Nov 12, 2015 at 07:57:35PM +0000, manu@coopapotheken.be wrote:

The following bug has been logged on the website:

Bug reference: 13774
Logged by: Manu Joye
Email address: manu@coopapotheken.be
PostgreSQL version: 9.4.5
Operating system: Windows server 2008r2
Description:

running pg_upgrade yields no warnings when there is not enough disk space to
do the upgrade. I upgraded a server with about 10 databases following the
instructions as provided on the website here. Everything seemed to have gone
well except that some tables in various databases just vanished (pgadmin
complained about missing files) . Upon inspection I noticed the HD had only
a couple of KB of free space left. pg_upgrade seemed to have just restored
what it could and then exited without warning!

We caught it early on and could restore from backups but a check on required
free disk space before upgrade should be implemented or, if that doesn't
work, at least a warning from pg_upgrade that something went wrong.

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file), but for
Windows, we should have been testing for zero, while the code only
tested for a -1 return failure. You can see the Windows failure zero
return value defined here:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363851%28v=vs.85%29.aspx

Return value

If the function succeeds, the return value is nonzero.
If the function fails, the return value is zero. To get extended error
information, call GetLastError.

We do something similar for hard links, but we create a wrapper function
to return the proper value (-1), and we call it twice, so it seems wise
to keep that unchanged.

The attached patch fixes this and will be applied to all active branches
in the next minor release. Sorry for the bug.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

Attachments:

copyfile.difftext/x-diff; charset=us-asciiDownload+6-8
#3Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#2)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Fri, Nov 13, 2015 at 08:24:02PM -0500, Bruce Momjian wrote:

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file), but for
Windows, we should have been testing for zero, while the code only
tested for a -1 return failure. You can see the Windows failure zero
return value defined here:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363851%28v=vs.85%29.aspx

Return value

If the function succeeds, the return value is nonzero.
If the function fails, the return value is zero. To get extended error
information, call GetLastError.

We do something similar for hard links, but we create a wrapper function
to return the proper value (-1), and we call it twice, so it seems wise
to keep that unchanged.

The attached patch fixes this and will be applied to all active branches
in the next minor release. Sorry for the bug.

Patch applied and backpatched through 9.1.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Manu Joye
manu.joye@coopapotheken.be
In reply to: Bruce Momjian (#3)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

Thanks for the quick fix, Bruce. Much appreciated.

Manu

-----Original Message-----
From: Bruce Momjian [mailto:bruce@momjian.us]
Sent: zaterdag 14 november 2015 17:48
To: manu@coopapotheken.be
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without
enough disk space

On Fri, Nov 13, 2015 at 08:24:02PM -0500, Bruce Momjian wrote:

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file), but for
Windows, we should have been testing for zero, while the code only
tested for a -1 return failure. You can see the Windows failure zero
return value defined here:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363851%28v=
vs.85%29.aspx

Return value

If the function succeeds, the return value is nonzero.
If the function fails, the return value is zero. To get extended

error

information, call GetLastError.

We do something similar for hard links, but we create a wrapper
function to return the proper value (-1), and we call it twice, so it
seems wise to keep that unchanged.

The attached patch fixes this and will be applied to all active
branches in the next minor release. Sorry for the bug.

Patch applied and backpatched through 9.1.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#5Bruce Momjian
bruce@momjian.us
In reply to: Manu Joye (#4)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Sat, Nov 14, 2015 at 08:02:09PM +0100, Manu Joye wrote:

Thanks for the quick fix, Bruce. Much appreciated.

I am a little worried that bug might have masked other pg_upgrade
Windows copy failures. I am glad you were able to see the problem so
clearly.

---------------------------------------------------------------------------

Manu

-----Original Message-----
From: Bruce Momjian [mailto:bruce@momjian.us]
Sent: zaterdag 14 november 2015 17:48
To: manu@coopapotheken.be
Cc: pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without
enough disk space

On Fri, Nov 13, 2015 at 08:24:02PM -0500, Bruce Momjian wrote:

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file), but for
Windows, we should have been testing for zero, while the code only
tested for a -1 return failure. You can see the Windows failure zero
return value defined here:

https://msdn.microsoft.com/en-us/library/windows/desktop/aa363851%28v=
vs.85%29.aspx

Return value

If the function succeeds, the return value is nonzero.
If the function fails, the return value is zero. To get extended

error

information, call GetLastError.

We do something similar for hard links, but we create a wrapper
function to return the proper value (-1), and we call it twice, so it
seems wise to keep that unchanged.

The attached patch fixes this and will be applied to all active
branches in the next minor release. Sorry for the bug.

Patch applied and backpatched through 9.1.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#6Noah Misch
noah@leadboat.com
In reply to: Bruce Momjian (#2)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Fri, Nov 13, 2015 at 08:24:02PM -0500, Bruce Momjian wrote:

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file)

*** a/src/bin/pg_upgrade/file.c
--- b/src/bin/pg_upgrade/file.c
*************** copyAndUpdateFile(pageCnvCtx *pageConver
*** 34,40 ****
{
if (pageConverter == NULL)
{
! 		if (pg_copy_file(src, dst, force) == -1)
return getErrorText(errno);
else
return NULL;
--- 34,44 ----
{
if (pageConverter == NULL)
{
! #ifndef WIN32
! 		if (copy_file(src, dst, force) == -1)
! #else
! 		if (CopyFile(src, dst, force) == 0)
! #endif
return getErrorText(errno);

Thanks. This passage of code has at least two other problems on Windows. The
third CopyFile() argument means the opposite of the third copy_file()
argument. Next, getErrorText() is wrong on Windows:

const char *
getErrorText(int errNum)
{
#ifdef WIN32
_dosmaperr(GetLastError());
#endif
return pg_strdup(strerror(errNum));
}

Calling _dosmaperr() to set errno is futile, because we nonetheless proceed to
use the passed-in errNum.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#7Bruce Momjian
bruce@momjian.us
In reply to: Noah Misch (#6)
Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Tue, Nov 17, 2015 at 11:43:34PM -0500, Noah Misch wrote:

On Fri, Nov 13, 2015 at 08:24:02PM -0500, Bruce Momjian wrote:

Thank you for the report. This is embarrassing, but the code was
testing for the wrong return value on Windows. We used a macro to
define the same symbol on Windows and Unix (pg_copy_file)

*** a/src/bin/pg_upgrade/file.c
--- b/src/bin/pg_upgrade/file.c
*************** copyAndUpdateFile(pageCnvCtx *pageConver
*** 34,40 ****
{
if (pageConverter == NULL)
{
! 		if (pg_copy_file(src, dst, force) == -1)
return getErrorText(errno);
else
return NULL;
--- 34,44 ----
{
if (pageConverter == NULL)
{
! #ifndef WIN32
! 		if (copy_file(src, dst, force) == -1)
! #else
! 		if (CopyFile(src, dst, force) == 0)
! #endif
return getErrorText(errno);

Thanks. This passage of code has at least two other problems on Windows. The
third CopyFile() argument means the opposite of the third copy_file()
argument. Next, getErrorText() is wrong on Windows:

const char *
getErrorText(int errNum)
{
#ifdef WIN32
_dosmaperr(GetLastError());
#endif
return pg_strdup(strerror(errNum));
}

Calling _dosmaperr() to set errno is futile, because we nonetheless proceed to
use the passed-in errNum.

Very good points. The attached patch fixes the copy parameter and
addresses the errno issue by just using errno direcdtly after the
_dosmaperr call.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

Attachments:

copy.difftext/x-diff; charset=us-asciiDownload+54-57
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#7)
Re: Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

Bruce Momjian <bruce@momjian.us> writes:

Very good points. The attached patch fixes the copy parameter and
addresses the errno issue by just using errno direcdtly after the
_dosmaperr call.

You did not change the comment

* Returns the text of the error message for the given error number

even though no error number is a "given" parameter anymore. Maybe
"... for the most-recently-reported error"?

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#9Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#8)
Re: Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Wed, Nov 18, 2015 at 09:30:45AM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Very good points. The attached patch fixes the copy parameter and
addresses the errno issue by just using errno direcdtly after the
_dosmaperr call.

You did not change the comment

* Returns the text of the error message for the given error number

even though no error number is a "given" parameter anymore. Maybe
"... for the most-recently-reported error"?

Good point. I have changed it to "Returns the text of the most recent
error".

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#10Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#9)
Re: Re: BUG #13774: upgrade from 9.1 to 9.4 'succeeds' without enough disk space

On Wed, Nov 18, 2015 at 09:42:35AM -0500, Bruce Momjian wrote:

On Wed, Nov 18, 2015 at 09:30:45AM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Very good points. The attached patch fixes the copy parameter and
addresses the errno issue by just using errno direcdtly after the
_dosmaperr call.

You did not change the comment

* Returns the text of the error message for the given error number

even though no error number is a "given" parameter anymore. Maybe
"... for the most-recently-reported error"?

Good point. I have changed it to "Returns the text of the most recent
error".

Patch applied and backpatched through 9.1. I used a smaller patch, same
behavior, on pre-9.4 because the removal of the getErrorText() argument
were too invasive.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription                             +

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs