pg_dump - lost synchronization with server: got message type "d", length 6036499

Started by Klint Gorealmost 18 years ago10 messagesgeneral
Jump to latest
#1Klint Gore
kgore4@une.edu.au

Can someone shed some light on what's happening here?

D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump: Dumping the contents of table "uploadeddatafiles" failed:
PQgetCopyData
() failed.
pg_dump: Error message from server: lost synchronization with server:
got messag
e type "d", length 6036499
pg_dump: The command was: COPY public.uploadeddatafiles (id, username,
projectid
, aspsession, filename, filetype, filesize, filedata, uploadedon,
"timestamp") T
O stdout;

pg 8.3.1 server on w2k3
pg_dump 8.3.1 and 8.3.3 on the same machine both do the same thing.

data directory and backup directory excluded from virus scanner.

The followinng variations also failed. The length seems to be either
6036499 or 8435588 (84... is most common).
pg_dump -Z 9 -Fp -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump -Fp -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2

uploadeddatafiles holds excel spreadsheets in the filedata column.

sheepcrc2=# select count(*) from uploadeddatafiles;
count
-------
405
(1 row)

sheepcrc2=# select sum(length(filedata)) from uploadeddatafiles;
sum
-----------
271067619
(1 row)

This completes and returns all rows
select id, md5(filedata) from uploadeddatafiles

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Klint Gore (#1)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Klint Gore <kgore4@une.edu.au> writes:

Can someone shed some light on what's happening here?
D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump: Dumping the contents of table "uploadeddatafiles" failed:
PQgetCopyData
() failed.
pg_dump: Error message from server: lost synchronization with server:
got messag
e type "d", length 6036499

Hmm ... I think what is actually happening here is that pg_dump doesn't
have enough memory available to buffer the message. The only places
where libpq could report that error text with those parameters are
where pqCheckInBufferSpace has failed to enlarge the input buffer
sufficiently. Per the code comment:

/*
* XXX add some better recovery code... plan is to skip over
* the message using its length, then report an error. For the
* moment, just treat this like loss of sync (which indeed it
* might be!)
*/

6 meg doesn't seem particularly enormous though.  Are you running
pg_dump under some especially restrictive user limits?  Maybe it's
dying here after having leaked a lot of memory for some other reason
--- try watching the pg_dump process size while it runs.

regards, tom lane

#3Klint Gore
kgore4@une.edu.au
In reply to: Tom Lane (#2)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Tom Lane wrote:

Klint Gore <kgore4@une.edu.au> writes:

Can someone shed some light on what's happening here?
D:\backups>pg_dump -Z 9 -Fc -C -U postgres -f sheepcrc\dbback.dmp sheepcrc2
pg_dump: Dumping the contents of table "uploadeddatafiles" failed:
PQgetCopyData
() failed.
pg_dump: Error message from server: lost synchronization with server:
got messag
e type "d", length 6036499

6 meg doesn't seem particularly enormous though.  Are you running
pg_dump under some especially restrictive user limits?  Maybe it's
dying here after having leaked a lot of memory for some other reason
--- try watching the pg_dump process size while it runs.

I'm running it under my own account which has adminstrator rights.

Peak memory usage was about 540m which brought the total usage for the
machine to about half the physical memory allocated (3g total).

Is there a binary debug build for win32 somewhere?

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Klint Gore (#3)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Maybe it's
dying here after having leaked a lot of memory for some other reason
--- try watching the pg_dump process size while it runs.

Peak memory usage was about 540m which brought the total usage for the
machine to about half the physical memory allocated (3g total).

Well, that might well explain the failure. pg_dump does suck a lot of
schema information into memory at startup, but 540m seems excessive.
Maybe you've found a memory leak in pg_dump (it wouldn't be the first
one). Does this database have a particularly large number of objects?

regards, tom lane

#5Klint Gore
kgore4@une.edu.au
In reply to: Tom Lane (#4)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Tom Lane wrote:

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Maybe it's
dying here after having leaked a lot of memory for some other reason
--- try watching the pg_dump process size while it runs.

Peak memory usage was about 540m which brought the total usage for the
machine to about half the physical memory allocated (3g total).

Well, that might well explain the failure. pg_dump does suck a lot of
schema information into memory at startup, but 540m seems excessive.
Maybe you've found a memory leak in pg_dump (it wouldn't be the first
one). Does this database have a particularly large number of objects?

regards, tom lane

I wouldn't call it large - 27 tables, 111 functions, 21 custom types
(used for set returning function results).

The biggest row count table has about 200k records (structure is
int,int,timestamp)

The biggest physical table is the one thats failiing. The table itself
is physically 81m and its toast table is 82m.

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Klint Gore (#5)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Maybe you've found a memory leak in pg_dump (it wouldn't be the first
one). Does this database have a particularly large number of objects?

I wouldn't call it large - 27 tables, 111 functions, 21 custom types
(used for set returning function results).

[ squint... ] Hard to see how that could be eating half a gig of
pg_dump memory space.

Would you be willing to send me a "pg_dump -s" (ie, just schema no
data) dump of this DB? Off-list of course.

regards, tom lane

#7Klint Gore
kgore4@une.edu.au
In reply to: Tom Lane (#6)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Tom Lane wrote:

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Maybe you've found a memory leak in pg_dump (it wouldn't be the first
one). Does this database have a particularly large number of objects?

I wouldn't call it large - 27 tables, 111 functions, 21 custom types
(used for set returning function results).

[ squint... ] Hard to see how that could be eating half a gig of
pg_dump memory space.

Would you be willing to send me a "pg_dump -s" (ie, just schema no
data) dump of this DB? Off-list of course.

Upon more investigation pg_dump 8.3.1 on linux (ubuntu feisty) across
the network completes properly.

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Klint Gore (#1)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Would you be willing to send me a "pg_dump -s" (ie, just schema no
data) dump of this DB? Off-list of course.

attached. created with pg_dump 8.3.3 win32 to the 8.3.1 win32 server.

Thanks. When I reload this here and try to pg_dump it, I see no evidence
of any problem --- pg_dump completes pretty quickly and seems to eat
no more than a megabyte or so. I'm not sure what to make of that.
Could there be a Windows-specific memory problem in pg_dump?

regards, tom lane

#9Dave Page
dpage@pgadmin.org
In reply to: Klint Gore (#3)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

On Thu, Jul 3, 2008 at 1:49 AM, Klint Gore <kgore4@une.edu.au> wrote:

I'm running it under my own account which has adminstrator rights.
Peak memory usage was about 540m which brought the total usage for the
machine to about half the physical memory allocated (3g total).
Is there a binary debug build for win32 somewhere?

If you're using 8.3.x, we include an archive of symbol files alongside
the installer on the download site.

--
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com

#10Klint Gore
kgore4@une.edu.au
In reply to: Tom Lane (#8)
Re: pg_dump - lost synchronization with server: got message type "d", length 6036499

Tom Lane wrote:

Klint Gore <kgore4@une.edu.au> writes:

Tom Lane wrote:

Would you be willing to send me a "pg_dump -s" (ie, just schema no
data) dump of this DB? Off-list of course.

attached. created with pg_dump 8.3.3 win32 to the 8.3.1 win32 server.

Thanks. When I reload this here and try to pg_dump it, I see no evidence
of any problem --- pg_dump completes pretty quickly and seems to eat
no more than a megabyte or so. I'm not sure what to make of that.
Could there be a Windows-specific memory problem in pg_dump?

Any more thoughts on this?

The file being produced when dumping the table in question is about 1
gig when it crashes. In the last check-in for fe-protocol3.c, there's
notes about 1 gig strings so I'm wondering if my data when received by
pg_dump (or output by the 8.3.1 server) goes over? 270m of bytea
converted to \xxx octal strings would make something > 1 gig. But the
270m is spread over 400 rows with nothing bigger than 3m.

If there's no interest in researching this any further, then I'll stop
the server and take a file system copy of the data directory and upgrade
it to 8.3.3 and pray the problem goes away.

klint.

--
Klint Gore
Database Manager
Sheep CRC
A.G.B.U.
University of New England
Armidale NSW 2350

Ph: 02 6773 3789
Fax: 02 6773 3266
EMail: kgore4@une.edu.au