BUG #10095: primary key corruption
The following bug has been logged on the website:
Bug reference: 10095
Logged by: Luke Coldiron
Email address: lukecoldiron@hotmail.com
PostgreSQL version: 9.3.3
Operating system: Ubuntu Linux 12.04 "Precise Pangolin" 32bit
Description:
I am seeing a problem where different primary keys in my database are being
corrupted.
ERROR: could not read block 0 in file "base/16407/41243": read only 0 of
8192 bytes
When I look on the filesystem, the "base/16407/41243" file is zero bytes.
When I look up the name of the corrupt object via select relname from
pg_class where relfilenode = 41243; it is always a primary key, though not
always on the same table.
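The lookup described above can be scripted. The following is a minimal Python sketch, not part of the original report: it pulls the database OID and relfilenode out of the error text and builds the pg_class query. The function names parse_read_error and lookup_query are hypothetical, and the added relkind column is an assumption meant to confirm that the object is an index ('i') rather than a table ('r').

```python
import re

# Matches the "could not read block" error text quoted in this report.
# An optional ".N" segment suffix on the file name is tolerated.
ERROR_RE = re.compile(
    r'could not read block (?P<block>\d+) in file '
    r'"base/(?P<db_oid>\d+)/(?P<filenode>\d+)(?:\.\d+)?"'
)


def parse_read_error(message):
    """Return (database OID, relfilenode, block number), or None if no match."""
    m = ERROR_RE.search(message)
    if m is None:
        return None
    return int(m.group("db_oid")), int(m.group("filenode")), int(m.group("block"))


def lookup_query(filenode):
    """Build the pg_class lookup for a relfilenode (hypothetical helper)."""
    return (
        "SELECT relname, relkind FROM pg_class "
        f"WHERE relfilenode = {int(filenode)};"
    )


msg = ('ERROR: could not read block 0 in file "base/16407/41243": '
       'read only 0 of 8192 bytes')
db_oid, filenode, block = parse_read_error(msg)
print(lookup_query(filenode))
```

The query has to be run while connected to the database whose OID appears in the path (16407 here), since pg_class is per-database; select datname from pg_database where oid = 16407; gives its name.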
The system was previously upgraded from pg 8.3.7, and these issues did not
occur on that version.
I haven't tried upgrading to 9.3.4 since it didn't look like any of the bug
fixes were targeted at the issue I am seeing.
Unfortunately, I have not yet been able to create a reproducible test case or
find a log where the issue first appeared. Any ideas would be much
appreciated.
--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
On Mon, Apr 21, 2014 at 5:08 PM, <lukecoldiron@hotmail.com> wrote:
> ERROR: could not read block 0 in file "base/16407/41243": read only 0 of
> 8192 bytes
Is this server a slave? Or has it been at some point (and now promoted to
master)?
> When I look on the filesystem, the "base/16407/41243" file is zero bytes.
> When I look up the name of the corrupt object via select relname from
> pg_class where relfilenode = 41243; it is always a primary key, though not
> always on the same table.
For now, you can fix the corrupted indexes by simply issuing REINDEX.
However, I strongly recommend that you dump all your databases,
remove the whole cluster, run initdb again, and then restore the dumps.
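As a rough illustration of the REINDEX step, the Python sketch below generates one REINDEX INDEX statement per affected index. It is only a sketch under assumptions: the index names are hypothetical placeholders (the thread does not name the indexes), and the helpers quote_ident and reindex_statements are made up for this example.

```python
def quote_ident(name):
    """Double-quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def reindex_statements(index_names):
    """Return one REINDEX INDEX statement per index name."""
    return [f"REINDEX INDEX {quote_ident(name)};" for name in index_names]


# Hypothetical index names, for illustration only.
for statement in reindex_statements(["device_config_pkey", "users_pkey"]):
    print(statement)
```

The names are treated as unqualified; an index outside the search path would need its schema quoted separately and prefixed.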
> The system was previously upgraded from pg 8.3.7 and these issues did not
> occur.
How did you perform the upgrade? Also, have there been any hardware issues
recently? I also recommend checking for disk and memory corruption.
Best regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
> On Mon, Apr 21, 2014 at 5:08 PM, <lukecoldiron@hotmail.com> wrote:
>> ERROR: could not read block 0 in file "base/16407/41243": read only 0 of
>> 8192 bytes
> Is this server a slave? Or has it been at some point (and now promoted to master)?
It is not a slave server nor has it been at any point in time.
>> When I look on the filesystem, the "base/16407/41243" file is zero bytes.
>> When I look up the name of the corrupt object via select relname from
>> pg_class where relfilenode = 41243; it is always a primary key, though not
>> always on the same table.
> For now, you can fix the corrupted indexes by simply issuing REINDEX. However, I strongly recommend that you dump all your databases, remove the whole cluster, run initdb again, and then restore the dumps.
>> The system was previously upgraded from pg 8.3.7 and these issues did not
>> occur.
> How did you perform the upgrade? Also, have there been any hardware issues recently? I also recommend checking for disk and memory corruption.
I need to give a little more background on this. The database is installed standalone on many hardware units that are all identical. It is used for configuration in a closed software appliance, much like a consumer router.
Acceptance testing of a fresh (no upgrade) pg 9.3.3 database instance has yielded a number of units with the primary key corruption issue after running for a short period (within a week of test operation). As the error message above shows, the file that should hold the primary key is truncated. The corresponding table also contains zero rows, but it is not corrupt and is expected to have zero rows.
With everything else being equal, I suspect a change in behavior between pg 8.3.7 and 9.3.3 as the cause. At the moment I don't have much to go on, as I have not been able to reproduce the issue on demand; however, I am still working on creating a reproducible test case.
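Since the symptom is a zero-byte file under base/, one way to check many units before the error surfaces is to scan the data directory for empty relation files. The Python sketch below is an assumption-laden illustration, not something taken from this thread: the function name find_empty_relation_files is hypothetical, and the data directory path must be supplied by the caller.

```python
import os
import re

# Main-fork relation files are named by relfilenode, optionally followed by
# a ".N" segment suffix. Other forks (_fsm, _vm, _init) do not match.
RELFILE_RE = re.compile(r"^\d+(\.\d+)?$")


def find_empty_relation_files(data_dir):
    """Return paths (relative to data_dir) of zero-byte files under base/."""
    base = os.path.join(data_dir, "base")
    empty = []
    for db_oid in sorted(os.listdir(base)):
        db_path = os.path.join(base, db_oid)
        if not (db_oid.isdigit() and os.path.isdir(db_path)):
            continue  # skip pgsql_tmp and anything else that is not a database
        for name in sorted(os.listdir(db_path)):
            path = os.path.join(db_path, name)
            if RELFILE_RE.match(name) and os.path.getsize(path) == 0:
                empty.append(f"base/{db_oid}/{name}")
    return empty
```

A zero-byte file is not an error by itself: the heap file of an empty table is legitimately empty, whereas a btree index normally contains at least a metapage. Each hit therefore still needs to be checked against pg_class (relfilenode and relkind) to see whether it belongs to an index.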