BUG #7590: Data corruption using pg_dump only with -Z parameter

Started by Jan Vodička over 13 years ago, 10 messages, bugs
#1 Jan Vodička
hrtlik@gmail.com

The following bug has been logged on the website:

Bug reference: 7590
Logged by: Jan Vodička
Email address: hrtlik@gmail.com
PostgreSQL version: 9.2.1
Operating system: Windows 8
Description:

"pg_dump -Z1 my_db > backup" always makes a corrupted package.
When I try it on the postgres database that was created during installation, "pg_dump
postgres > backup" is OK.
I can reproduce it every time.

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jan Vodička (#1)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

hrtlik@gmail.com writes:

The following bug has been logged on the website:
Bug reference: 7590
Logged by: Jan Vodička
Email address: hrtlik@gmail.com
PostgreSQL version: 9.2.1
Operating system: Windows 8
Description:

"pg_dump -Z1 my_db > backup" always make corrupted package.

On Windows, that doesn't seem terribly surprising: Windows will probably
do newline munging on the process's stdout, which will corrupt
compressed data since it's not plain text. There's not a lot we can do
to prevent that. Try it like this instead:

pg_dump -Z1 -f backup.gz my_db

to keep the data away from Windows' interference.
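
A minimal sketch of the failure mode described above, written in Python rather than taken from pg_dump. The helper names (simulate_text_mode, survives_roundtrip) are hypothetical, and the byte replacement only approximates what a text-mode stdout does on Windows.

```python
import gzip
import zlib


def simulate_text_mode(data: bytes) -> bytes:
    """Mimic text-mode output on Windows: every LF byte becomes CR LF."""
    return data.replace(b"\n", b"\r\n")


def survives_roundtrip(payload: bytes) -> bool:
    """Compress payload, apply the simulated translation, then try to decompress."""
    # mtime=0 keeps the gzip header identical from run to run.
    compressed = gzip.compress(payload, compresslevel=1, mtime=0)
    munged = simulate_text_mode(compressed)
    try:
        return gzip.decompress(munged) == payload
    except (OSError, EOFError, zlib.error):
        # Shifted or altered bytes show up as a bad header, bad stream,
        # or a failed CRC/length check.
        return False
```

A 10-byte payload is a convenient test case: the gzip trailer stores the uncompressed length as a little-endian 32-bit value, so a length of 10 puts a 0x0a byte into the stream and survives_roundtrip(b"x" * 10) should come back False.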

regards, tom lane

#3 Ryan Kelly
rpkelly22@gmail.com
In reply to: Jan Vodička (#1)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

On Tue, Oct 09, 2012 at 02:20:40PM +0000, hrtlik@gmail.com wrote:

The following bug has been logged on the website:

Bug reference: 7590
Logged by: Jan Vodička
Email address: hrtlik@gmail.com
PostgreSQL version: 9.2.1
Operating system: Windows 8
Description:

"pg_dump -Z1 my_db > backup" always make corrupted package.

What does this mean? How did you verify that you got a "corrupted
package"?

When I try it on postgres database which created from installation: "pg_dump
postgres > backup" it is ok.
I can reproduce it everytime.

-Ryan Kelly

#4 Jan Vodička
hrtlik@gmail.com
In reply to: Ryan Kelly (#3)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

It means I am not able to unpack it; it is invalid.

Try this one, generated by "pg_dump -Z1 > backup.gz" on Windows:
http://mstu.cz/~hrtlik/backup.gz (0.5kB)
The original, made with "pg_dump > backup.gz" without compression:
http://mstu.cz/~hrtlik/backup.sql

If you know any way to recover the original, tell me.

--
Jan Vodička

#5 Jan Vodička
hrtlik@gmail.com
In reply to: Tom Lane (#2)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

So is there any way to get the plain SQL back from this "corrupted" backup?
It would be nice to mention this behavior in the manual.

--
Jan Vodička

#6 Craig Ringer
craig@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

On 10/10/2012 02:38 AM, Tom Lane wrote:

hrtlik@gmail.com writes:

The following bug has been logged on the website:
Bug reference: 7590
Logged by: Jan Vodička
Email address: hrtlik@gmail.com
PostgreSQL version: 9.2.1
Operating system: Windows 8
Description:

"pg_dump -Z1 my_db > backup" always make corrupted package.

On Windows, that doesn't seem terribly surprising: Windows will probably
do newline munging on the process's stdout, which will corrupt
compressed data since it's not plain text.

pg_dump might want to refuse to write to stdout when in a non-plain-text
mode on Windows if that's the case.

--
Craig Ringer

#7 Craig Ringer
craig@2ndquadrant.com
In reply to: Jan Vodička (#4)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

On 10/10/2012 03:07 AM, Jan Vodička wrote:

= not able to unpack, invalid

Try this one generated by "pg_dump -Z1 > backup.gz" in windows:
http://mstu.cz/~hrtlik/backup.gz (0.5kB)
original "pg_dump > backup.gz" without compression:
http://mstu.cz/~hrtlik/backup.sql

If you have any way how to get original, tell me.

If Tom is right and the issue is end-of-line transformation, in theory
you might be able to un-mungle newlines. The chances of \r\n occurring
naturally in a tiny backup like that are not huge, so any \r\n in the
data probably used to be a raw \n. Taking a copy of the DB and
performing that substitution might get you a usable backup file.

That's replacing all \x0d\x0a sequences with \x0a. Or I might be wrong
and it's \x0d.

This won't work on a larger backup where some \r\n sequences will occur
naturally in compressed binary data. In those you're likely to have a
much, much bigger job ahead of you.
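
The substitution described above could be sketched like this in Python. The function names are made up for illustration, and the whole file is held in memory. Any CR LF pair that was part of the original compressed stream is rewritten too, so the result has to be checked by actually decompressing it.

```python
import gzip
import zlib


def undo_newline_translation(data: bytes) -> bytes:
    """Replace every CR LF pair with a bare LF, reversing text-mode output."""
    return data.replace(b"\r\n", b"\n")


def try_repair(damaged: bytes):
    """Return the decompressed dump if the substitution yields valid gzip, else None."""
    candidate = undo_newline_translation(damaged)
    try:
        # gzip.decompress verifies the CRC and length stored in the trailer,
        # so a successful return means the stream is internally consistent.
        return gzip.decompress(candidate)
    except (OSError, EOFError, zlib.error):
        return None
```

Reading a multi-gigabyte dump this way needs a lot of memory; a chunked variant would have to handle a CR at the end of one chunk and an LF at the start of the next.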

--
Craig Ringer

#8 Jan Vodička
hrtlik@gmail.com
In reply to: Craig Ringer (#7)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

That would definitely be a much more comfortable solution.
The problem really was the newlines, \n vs. \r\n. Replacing \r\n -> \n solved
the problem.

Thanks a lot.

--
Jan Vodička

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Craig Ringer (#6)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

Craig Ringer <ringerc@ringerc.id.au> writes:

On 10/10/2012 02:38 AM, Tom Lane wrote:

On Windows, that doesn't seem terribly surprising: Windows will probably
do newline munging on the process's stdout, which will corrupt
compressed data since it's not plain text.

pg_dump might want to refuse to write to stdout when in a non-plain-text
mode on Windows if that's the case.

Actually, a look at the pg_dump code says that it does

setmode(fileno(stdout), O_BINARY);

so either my diagnosis is wrong or there's some reason why that setting
didn't take.
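
For comparison, the same idea expressed in Python; this is an analogue of the C call quoted above, not code from pg_dump. msvcrt.setmode and os.O_BINARY exist only on Windows, so the hypothetical helper set_stdout_binary does nothing on other platforms.

```python
import os
import sys


def set_stdout_binary() -> bool:
    """Switch stdout to binary mode on Windows; report whether anything was changed."""
    if sys.platform != "win32":
        # POSIX systems do not translate newlines on stdout.
        return False
    import msvcrt  # Windows-only standard library module

    # Python counterpart of setmode(fileno(stdout), O_BINARY) in C.
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
    return True
```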

regards, tom lane

#10 Jan Vodička
hrtlik@gmail.com
In reply to: Tom Lane (#9)
Re: BUG #7590: Data corruption using pg_dump only with -Z parameter

Thanks. I've already looked. The problem was that Windows replaced '\n' with
'\r\n'; replacing the bytes back ('\r\n' -> '\n') solved the problem. It
worked on a 16GB gzip package.
It would be nice to mention this behavior somewhere.

Jan Vodicka

--
Jan Vodička