BUG #14621: ERROR: compressed data is corrupt

Started by Lara Schembriabout 9 years ago4 messagesbugs
Jump to latest
#1Lara Schembri
Lara.Schembri@nyxgg.com

The following bug has been logged on the website:

Bug reference: 14621
Logged by: Lara Schembri
Email address: lara.schembri@nyxgg.com
PostgreSQL version: 9.3.4
Operating system: Centos 7
Description:

Hi,

A few months ago, the master has failed due to a controller issue and had to
failover to the slave. The Controller introduced some data corruption which
was detected recently.

Currently I'm trying to archive an old table is not being used and during
the pg_dump the below error was given

ERROR: compressed data is corrupt

I have identified that a text coloumn of the ctid (74127541,3) was
corrupted. This was done by doing a full table scan and wait for that error.
However, I would like to know how extensive this corruption is. Is there a
way to be able to extract all the corrupted CTID's and ignore them from the
copy.

Please note i have tried the chk function
declare t text;
begin t := $1;
return false;
exception when others then return true;
end;

which did not work.

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Lara Schembri (#1)
Re: BUG #14621: ERROR: compressed data is corrupt

lara.schembri@nyxgg.com writes:

Currently I'm trying to archive an old table is not being used and during
the pg_dump the below error was given
ERROR: compressed data is corrupt

I have identified that a text coloumn of the ctid (74127541,3) was
corrupted. This was done by doing a full table scan and wait for that error.
However, I would like to know how extensive this corruption is. Is there a
way to be able to extract all the corrupted CTID's and ignore them from the
copy.

Please note i have tried the chk function
declare t text;
begin t := $1;
return false;
exception when others then return true;
end;

which did not work.

You would get better responses if you defined what you meant by "did not
work", but I'm going to guess that the issue is that this code failed to
expose corrupted data. That's probably because it would have just
assigned the bad datum to "t" without decompressing it. I'd try something
that would require examination of the actual content of the string,
perhaps "t := md5($1)".

regards, tom lane

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#3Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#2)
Re: BUG #14621: ERROR: compressed data is corrupt

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Please note i have tried the chk function
declare t text;
begin t := $1;
return false;
exception when others then return true;
end;

which did not work.

Tom> You would get better responses if you defined what you meant by
Tom> "did not work", but I'm going to guess that the issue is that this
Tom> code failed to expose corrupted data. That's probably because it
Tom> would have just assigned the bad datum to "t" without
Tom> decompressing it.

The original intended use of that function (which is one I used to give
out regularly to people on IRC when working with them on data corruption
issues, and no doubt some of them have subsequently posted it on blogs
or whatnot) is to pass in the whole-row var for $1 like so:

select ctid, id from brokentable t where chk(t);

the intent being to detect failures of external toast fetches or other
corruption symptoms.

Unfortunately this is no longer as useful as it was, since a fix some
time back now has chk(t) detoast the fields of t before entering the
function, so the exception doesn't get caught. These days I usually
have people use this one instead:

create function chk(tid) returns boolean language plpgsql
as $f$
declare
r text;
begin
r := (select t from brokentable t where ctid=$1);
return false;
exception when others then
return true;
end;
$f$;

select ctid, id from brokentable t where chk(ctid);

In both versions of the function, it's relying on the fact that
everything will get detoasted and decompressed as part of casting the
record value to text.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

#4Lara Schembri
Lara.Schembri@nyxgg.com
In reply to: Andrew Gierth (#3)
Re: BUG #14621: ERROR: compressed data is corrupt

Hi Andrew and Tom,

Thanks for your reply.

I tried the suggested function and it did show where the corrupted ctid is.

Thanks a lot for your help. Will try this function on all tables just to see how bad the corruption is.

Regards
Lara
-----Original Message-----
From: Andrew Gierth [mailto:andrew@tao11.riddles.org.uk]
Sent: den 12 april 2017 15:01
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Lara Schembri <Lara.Schembri@nyxgg.com>; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #14621: ERROR: compressed data is corrupt

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Please note i have tried the chk function
declare t text;
begin t := $1;
return false;
exception when others then return true;
end;

which did not work.

Tom> You would get better responses if you defined what you meant by Tom> "did not work", but I'm going to guess that the issue is that this Tom> code failed to expose corrupted data. That's probably because it Tom> would have just assigned the bad datum to "t" without Tom> decompressing it.

The original intended use of that function (which is one I used to give out regularly to people on IRC when working with them on data corruption issues, and no doubt some of them have subsequently posted it on blogs or whatnot) is to pass in the whole-row var for $1 like so:

select ctid, id from brokentable t where chk(t);

the intent being to detect failures of external toast fetches or other corruption symptoms.

Unfortunately this is no longer as useful as it was, since a fix some time back now has chk(t) detoast the fields of t before entering the function, so the exception doesn't get caught. These days I usually have people use this one instead:

create function chk(tid) returns boolean language plpgsql as $f$
declare
r text;
begin
r := (select t from brokentable t where ctid=$1);
return false;
exception when others then
return true;
end;
$f$;

select ctid, id from brokentable t where chk(ctid);

In both versions of the function, it's relying on the fact that everything will get detoasted and decompressed as part of casting the record value to text.

--
Andrew (irc:RhodiumToad)

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs