pg_dump or hardware?
Hi all,
short summary:
platform: i386 SMP (dual PIII)
os: linux 2.6.8.1
vendor: debian (3.1, stable)
pgsql ver: 7.4.7 (deb)
disk: tech. SCSI vendor. IBM model. DDYS-T36950N rev. S96H
controller: adaptec aic-7892a
description:
we're experiencing a weird problem
trying to get a dump of our db for backup purposes,
the executed command is:
/usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2
the output:
pg_dump: ERROR: could not open relation with OID 201327173
pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed.
pg_dump: Error message from server: ERROR: could not open relation with OID 201327173
pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov,
prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato,
cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout;
so it seems that we've got some problems with the "nominativi" table
(a 20 million-row table), in fact the following command also fails:
pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > nominativi.bz2
with the same err msg as before. Before the erros occurs we're are able to
get a partial backup, see:
#> ls -l nominativi.bz2
-rw-r--r-- 1 sickpig users 2.5M apr 19 12:35 nominativi.bz2
#> wc -l nominativi
145904 nominativi
We're trying to understand whether this is due to data corruption or
hardware failure. We run long self-tests on our SCSI disk through
smartmontools on a regular basis. see attached file for "smartctl -a /dev/sda"
output. All suggestions are welcome.
Regards,
Andrea
Attachments:
smartctl.txttext/plain; name=smartctl.txtDownload
andrea suisani <andrea.suisani@opinioni.net> writes:
pg_dump: ERROR: could not open relation with OID 201327173
pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed.
pg_dump: Error message from server: ERROR: could not open relation with OID 201327173
Hmm ... what do you get from
select oid from pg_class where relname = 'nominativi';
select relname from pg_class where oid = 201327173;
If the first returns 201327173 while the second doesn't return anything,
then I'd wonder about corruption of pg_class's OID index. REINDEXing it
might help. I don't remember whether 7.4 makes you use a standalone
backend to reindex system catalogs --- see its REINDEX man page for
details.
regards, tom lane
Thanks for the quick reply
Tom Lane wrote:
andrea suisani <andrea.suisani@opinioni.net> writes:
[cut]
Hmm ... what do you get from
select oid from pg_class where relname = 'nominativi';
oid
--------
561644
(1 row)
afaics it seems weird.... does this mean that another postgresql
object screw up? this is the "\d nominativi" output:
Table "public.nominativi"
Column | Type | Modifiers
------------+------------------+-----------
nome_cogno | character(70) |
indirizzo | character(40) |
cap | character(5) |
citta | character(35) |
prov | character(2) |
prefisso | character(4) |
telefono1 | character(13) |
telefono2 | character(13) |
note | character(100) |
idpersona | integer |
estrazione | date |
num_estra | integer |
occupato | boolean |
cod_prov | integer |
cod_com | integer |
cod_reg | integer |
capoluo | integer |
rand | double precision |
Indexes:
"citta1_idx" btree (prov, citta) WHERE (((prov = 'BL'::bpchar) OR (prov =
'PD'::bpchar) OR (prov = 'RO'::bpchar) OR (prov = 'VE'::bpchar) OR (prov = 'TV'::bpchar)
OR (prov = 'VR'::bpchar)) AND (cod_com IS NULL))
"cod_com_ndx1" btree (cod_com)
"codprov_capo_rand1" btree (cod_prov, capoluo, rand)
"pre_tel_index1" btree (prefisso, telefono1)
and none of nominativi's indexes has an oid like 201327173
select oid
from pg_class
where relname in ('citta1_idx','cod_com_ndx1','codprov_capo_rand1','pre_tel_index1');
oid
----------
46788374
40916657
40916656
40916658
(4 rows)
select relname from pg_class where oid = 201327173;
relname
---------
(0 rows)
while if I "ask" for oid 561644 this is what I get:
select relname from pg_class where oid = 561644;
relname
------------
nominativi
(1 row)
If the first returns 201327173 while the second doesn't return anything,
then I'd wonder about corruption of pg_class's OID index. REINDEXing it
might help.
I don't remember whether 7.4 makes you use a standalone
backend to reindex system catalogs --- see its REINDEX man page for
details.
from what I can see from REINDEX man page, I can reindex shared system catalogs
in stand-alone mode
[cut]
Regards,
Andrea