pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)

Started by Glyn Astill, almost 16 years ago | 10 messages | general
#1 Glyn Astill
glynastill@yahoo.co.uk

Hi chaps,

I've just upgraded a server from 8.3 to 8.4, and when trying to use the parallel restore options I get the following error:

"pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)"

The dump I'm trying to restore is purely a data dump, and the schema is separate (due to the way our setup works).

These are the options I'm using for the dump and the restore:

pg_dump -Fc <dbname> -U postgres -h localhost -a --disable-triggers

pg_restore -U postgres --disable-triggers -j 4 -c -d <dbname>

Can anyone tell me what I'm doing wrong, or why my files are not supported by parallel restore?

Thanks
Glyn

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Glyn Astill (#1)

Glyn Astill <glynastill@yahoo.co.uk> writes:

I've just upgraded a server from 8.3 to 8.4, and when trying to use the parallel restore options I get the following error:

"pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)"

This is the second or third report we've gotten of that, but nobody's
been able to offer a reproducible test case. Can you?

regards, tom lane

#3 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Tom Lane (#2)
--- On Fri, 30/4/10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Glyn Astill <glynastill@yahoo.co.uk> writes:

I've just upgraded a server from 8.3 to 8.4, and when trying to use the parallel restore options I get the following error:

"pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)"

This is the second or third report we've gotten of that, but nobody's been able to offer a reproducible test case. Can you?

Hi Tom,

The schema is fairly large, but I will try.

One thing I forgot to mention is that in the restore script I drop the indexes off my tables between restoring the schema and the data. I've always done this to speed up the restore, but is there any chance this could be causing the issue?

I guess what would help is some insight into what the error message means.

It appears to originate in _PrintTocData in pg_backup_custom.c, but I don't really understand what's happening here at all. A wild guess is that it's trying to seek to a particular TOC entry in the file, or to process the file sequentially?

http://doxygen.postgresql.org/pg__backup__custom_8c.html#6024b8108422e69062072df29f48506f
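
On the "(fseek required)" part of the message: it refers to whether the input can be repositioned at all. What follows is only a rough Python sketch of that distinction, not pg_restore's code; the helper names `stream_is_seekable` and `seekability_demo` are made up for illustration. A regular file can be repositioned, whereas a pipe cannot.

```python
import os
import tempfile


def stream_is_seekable(stream):
    """Return True if the stream reports that it supports random access."""
    try:
        return stream.seekable()
    except AttributeError:
        # Objects without a seekable() method are treated as non-seekable.
        return False


def seekability_demo():
    """Compare a regular temporary file with the read end of a pipe."""
    with tempfile.TemporaryFile() as regular_file:
        file_ok = stream_is_seekable(regular_file)

    read_fd, write_fd = os.pipe()
    with os.fdopen(read_fd, "rb") as pipe_reader, os.fdopen(write_fd, "wb"):
        pipe_ok = stream_is_seekable(pipe_reader)

    return file_ok, pipe_ok
```

Calling `seekability_demo()` returns a pair of booleans: the first for the regular file, the second for the pipe.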

Glyn

#4 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Glyn Astill (#3)

Glyn Astill wrote:

One thing I forgot to mention is that in the restore script I drop the indexes off my tables between restoring the schema and the data. I've always done this to speed up the restore, but is there any chance this could be causing the issue?

Uh. Why are you doing that? pg_restore is supposed to restore the
schema, then data, finally indexes and other stuff. Are you using
separate schema/data dumps? If so, don't do that -- it's known to be
slower.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#5 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Alvaro Herrera (#4)
--- On Fri, 30/4/10, Alvaro Herrera <alvherre@commandprompt.com> wrote:

Uh. Why are you doing that? pg_restore is supposed to restore the schema, then data, finally indexes and other stuff. Are you using separate schema/data dumps? If so, don't do that -- it's known to be slower.

Yes, I'm restoring the schema first, then the data.

The reason being that the data can come from different slony 1.2 slaves, but the schema always comes from the origin server due to modifications slony makes to schemas on the slaves.

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Glyn Astill (#3)

Glyn Astill <glynastill@yahoo.co.uk> writes:

The schema is fairly large, but I will try.

My guess is that you can reproduce it with not a lot of data, if you can
isolate the trigger condition.

One thing I forgot to mention is that in the restore script I drop the indexes off my tables between restoring the schema and the data. I've always done this to speed up the restore, but is there any chance this could be causing the issue?

Possibly. I think there must be *something* unusual triggering the
problem, and maybe that is it or part of it.

I guess what would help is some insight into what the error message means.

It's hard to tell. The likely theories are (1) we're doing things in an
order that requires seeking backwards in the file, and for some reason
pg_restore thinks it can't do that; (2) there's a bug causing the code
to search for an item number that isn't actually in the file.

One of the previous reports actually turned out to be pilot error: the
initial dump had failed after emitting a partially complete file, and
so the error from pg_restore was essentially an instance of (2). But
with three or so reports I'm thinking there's something else going on.
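
Both theories can be pictured with a toy model. The Python sketch below is not the custom archive format and not pg_restore's logic; the block layout (4-byte ID, 4-byte length, payload) and the names `build_archive` and `ForwardOnlyReader` are assumptions invented for illustration. A reader that can only move forward fails in the same way whether the wanted block lies behind its current position (theory 1) or is missing from the file, for example because the dump was cut short (theory 2).

```python
import io


def build_archive(blocks):
    """Serialize (block_id, payload) pairs as: 4-byte id, 4-byte length, payload."""
    buf = io.BytesIO()
    for block_id, payload in blocks:
        buf.write(block_id.to_bytes(4, "big"))
        buf.write(len(payload).to_bytes(4, "big"))
        buf.write(payload)
    return buf.getvalue()


class ForwardOnlyReader:
    """Finds blocks by scanning forward only; it never seeks backwards."""

    def __init__(self, data):
        self._stream = io.BytesIO(data)

    def read_block(self, wanted_id):
        while True:
            header = self._stream.read(8)
            if len(header) < 8:
                # End of input: the block is behind us or not in the file.
                raise LookupError(f"block {wanted_id} not found ahead of current position")
            block_id = int.from_bytes(header[:4], "big")
            length = int.from_bytes(header[4:], "big")
            payload = self._stream.read(length)
            if len(payload) < length:
                # The file ends in the middle of a block (truncated archive).
                raise LookupError(f"archive truncated inside block {block_id}")
            if block_id == wanted_id:
                return payload
```

With blocks 10, 20 and 30 written in that order, asking one reader for block 20 and then for block 10 raises `LookupError`, as does asking for block 30 after the last few bytes of the archive have been cut off.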

regards, tom lane

#7 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Tom Lane (#6)

Well I've only just gotten round to taking another look at this, response inline below:

--- On Fri, 30/4/10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Glyn Astill <glynastill@yahoo.co.uk> writes:

The schema is fairly large, but I will try.

My guess is that you can reproduce it with not a lot of data, if you can isolate the trigger condition.

Hmm, tried reducing the amount of data and the issue goes away. Could this indicate some issue with the file, like an issue with its size (~5 GB)? Or could it be an issue with the data itself?

One thing I forgot to mention is that in the restore script I drop the indexes off my tables between restoring the schema and the data. I've always done this to speed up the restore, but is there any chance this could be causing the issue?

Possibly. I think there must be *something* unusual triggering the problem, and maybe that is it or part of it.

I've removed this faffing with indexes in between, but the problem still persists.

I guess what would help is some insight into what the error message means.

It's hard to tell. The likely theories are (1) we're doing things in an order that requires seeking backwards in the file, and for some reason pg_restore thinks it can't do that; (2) there's a bug causing the code to search for an item number that isn't actually in the file.

One of the previous reports actually turned out to be pilot error: the initial dump had failed after emitting a partially complete file, and so the error from pg_restore was essentially an instance of (2). But with three or so reports I'm thinking there's something else going on.

So I'm still at a loss as to why it's happening.

I've tried to dig a little deeper (and I may just be punching thin air here) by adding the value of id to the error message passed to die_horribly(). It gives me id 7550, which is the first table in the TOC entry list when I do a pg_restore -l; everything above it is a sequence.

Here's a snip of pg_restore -l:

7775; 0 0 SEQUENCE SET website ui_content_id_seq pgcontrol
7550; 0 22272 TABLE DATA _main_replication sl_archive_counter slony
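
A listing like this can be scanned mechanically for the first data entry. The Python sketch below assumes every line has the shape of the two lines above (dump ID, two numeric columns, a description of one or more words, then schema, name and owner). The function names are hypothetical, and object names containing spaces would defeat this simple split.

```python
def parse_toc_line(line):
    """Split one 'pg_restore -l' style line into its fields."""
    dump_id_text, rest = line.split(";", 1)
    fields = rest.split()
    # First two tokens are numeric columns; last three are schema, name, owner.
    schema, name, owner = fields[-3:]
    description = " ".join(fields[2:-3])
    return {
        "dump_id": int(dump_id_text),
        "description": description,
        "schema": schema,
        "name": name,
        "owner": owner,
    }


def first_table_data_id(lines):
    """Return the dump ID of the first TABLE DATA entry, or None if absent."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and ';' comment lines in the listing
        entry = parse_toc_line(line)
        if entry["description"] == "TABLE DATA":
            return entry["dump_id"]
    return None
```

Fed the two lines above, `first_table_data_id` returns 7550, matching the id reported in the error.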

And the output if I run it under gdb:

GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) set args -U postgres --disable-triggers -j 4 -c -d SEE Way5a-pgsql-SEE-data.gz
(gdb) break die_horribly
Breakpoint 1 at 0x4044b0: file pg_backup_archiver.c, line 1384.
(gdb) run
Starting program: /usr/local/pgsql/bin/pg_restore -U postgres --disable-triggers -j 4 -c -d SEE Way5a-pgsql-SEE-data.gz
[Thread debugging using libthread_db enabled]
[New Thread 0x7f72480eb700 (LWP 4335)]
pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)
hasSeek = 1 dataState = 1 id = 7550
[Switching to Thread 0x7f72480eb700 (LWP 4335)]

Breakpoint 1, die_horribly (AH=0x61c210, modulename=0x4171f6 "archiver", fmt=0x4167d8 "worker process failed: exit code %d\n") at pg_backup_archiver.c:1384
1384 {
(gdb) pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)
hasSeek = 1 dataState = 1 id = 7550
pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)
hasSeek = 1 dataState = 1 id = 7550
pg_restore: [custom archiver] dumping a specific TOC data block out of order is not supported without ID on this input stream (fseek required)
hasSeek = 1 dataState = 1 id = 7550

(gdb) bt
#0 die_horribly (AH=0x61c210, modulename=0x4171f6 "archiver", fmt=0x4167d8 "worker process failed: exit code %d\n") at pg_backup_archiver.c:1384
#1 0x0000000000408f14 in RestoreArchive (AHX=0x61c210, ropt=0x61c0d0) at pg_backup_archiver.c:3586
#2 0x0000000000403737 in main (argc=10, argv=0x7fffffffd5b8) at pg_restore.c:380
(gdb) step
pg_restore: [archiver] worker process failed: exit code 1

Program exited with code 01.

Any further ideas of where I should dig would be appreciated.

Thanks
Glyn

#8 Alban Hertroys
dalroi@solfertje.student.utwente.nl
In reply to: Glyn Astill (#7)

On 21 May 2010, at 11:58, Glyn Astill wrote:

Well I've only just gotten round to taking another look at this, response inline below:

--- On Fri, 30/4/10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Glyn Astill <glynastill@yahoo.co.uk> writes:

The schema is fairly large, but I will try.

My guess is that you can reproduce it with not a lot of data, if you can isolate the trigger condition.

Hmm, tried reducing the amount of data and the issue goes away. Could this indicate some issue with the file, like an issue with its size (~5 GB)? Or could it be an issue with the data itself?

The file size in combination with an "out of order" error smells of a 32-bit integer wrap-around problem.

And indeed, from the documentation (http://www.postgresql.org/docs/8.4/interactive/lo-intro.html):
"One remaining advantage of the large object facility is that it allows values up to 2 GB in size"

So I guess your large object is too large.
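
The wrap-around suspicion is easy to illustrate with arithmetic. The Python sketch below only shows what happens to a byte offset if it is squeezed into a signed 32-bit integer. Whether pg_restore 8.4 does any such thing is not established here, and the helper name `as_signed_32` is made up for illustration.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def as_signed_32(value):
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    value &= 0xFFFFFFFF  # keep only the low 32 bits
    if value >= 0x80000000:
        value -= 0x100000000  # high bit set: the value reads as negative
    return value


# An offset just below 2 GiB still fits.
largest_safe = as_signed_32(2 * GIB - 1)   # 2147483647
# 3 GiB no longer fits and comes out negative.
three_gib = as_signed_32(3 * GIB)          # -1073741824
# 5 GiB silently wraps to a plausible-looking but wrong 1 GiB.
five_gib = as_signed_32(5 * GIB)           # 1073741824
```

The last case is the insidious one: the wrapped offset is a valid position in the file, just not the intended one.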

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


#9 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Alban Hertroys (#8)
--- On Fri, 21/5/10, Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote:

On 21 May 2010, at 11:58, Glyn Astill wrote:

Well I've only just gotten round to taking another look at this, response inline below:

--- On Fri, 30/4/10, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Glyn Astill <glynastill@yahoo.co.uk> writes:

The schema is fairly large, but I will try.

My guess is that you can reproduce it with not a lot of data, if you can isolate the trigger condition.

Hmm, tried reducing the amount of data and the issue goes away. Could this indicate some issue with the file, like an issue with its size (~5 GB)? Or could it be an issue with the data itself?

The file size in combination with an "out of order" error smells of a 32-bit integer wrap-around problem.

And indeed, from the documentation (http://www.postgresql.org/docs/8.4/interactive/lo-intro.html):
"One remaining advantage of the large object facility is that it allows values up to 2 GB in size"

So I guess your large object is too large.

Hmm, we don't use any large objects though; all our data is pretty much just date, text and numeric fields, etc.

Glyn.

#10 Alban Hertroys
dalroi@solfertje.student.utwente.nl
In reply to: Glyn Astill (#9)

On 21 May 2010, at 12:44, Glyn Astill wrote:

So I guess your large object is too large.

Hmm, we don't use any large objects though; all our data is pretty much just date, text and numeric fields, etc.

Doh! Seems I mixed up a few threads here. It was probably the mention of a 5 GB file that threw me off; I hadn't realised you were referring to a dump file there.

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.
