Proposal: Integrity check

Started by Zdenek Kotalaalmost 18 years ago11 messages
#1Zdenek Kotala
Zdenek.Kotala@Sun.COM

Regarding to Robert Mach's work during Google SOC on data integrity
check. I would like to improve storage module and implement some
Robert's code into the core.

I would like to make following modification:

1) Add ReadBuffer_noerror (recommend me better name) function which will
accept damaged page without Error. This page will be marked as corrupted
and when ReadBuffer will touch this page then it will be handled in
standard way.

This is important for check and repair functions to process all table
without interruption.

2) Extend PageHeaderIsValid function reporting.

a) split one condition to more and report each problem separately
b) check linp if they point to correct place (occupied space)
c) check overlaying

Because some tests are time expensive, IntegrityCheckLevel variable will
specify how many test will be performed.

3) Add PageHeaderIsValid check also for write operation

In production it should catch problem with memory or software bugs. In
development it should catch memory overwriting.

4) Add special command (CHECK/VERIFY) or function which validates table
or whole database.

Any comments?

Thanks Zdenek

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#1)
Re: Proposal: Integrity check

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

I would like to make following modification:

1) Add ReadBuffer_noerror (recommend me better name) function which will
accept damaged page without Error. This page will be marked as corrupted
and when ReadBuffer will touch this page then it will be handled in
standard way.

This seems like a pretty horrid idea. Bad pages shouldn't be allowed to
get into shared buffers in the first place. Why not have the checking
logic operate outside shared buffers?

3) Add PageHeaderIsValid check also for write operation

In production it should catch problem with memory or software bugs. In
development it should catch memory overwriting.

Is there any evidence whatsoever to demonstrate that this is worth the
cycles it will eat?

regards, tom lane

#3Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Tom Lane (#2)
Re: Proposal: Integrity check

Tom Lane wrote:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

I would like to make following modification:

1) Add ReadBuffer_noerror (recommend me better name) function which will
accept damaged page without Error. This page will be marked as corrupted
and when ReadBuffer will touch this page then it will be handled in
standard way.

This seems like a pretty horrid idea. Bad pages shouldn't be allowed to
get into shared buffers in the first place. Why not have the checking
logic operate outside shared buffers?

It currently works outside the shared buffers, but I afraid about
collision due to parallel read and write access on one block. I'm not
sure if parallel write(8k) and read(8k) is synchronized by kernel/fs or
not. If not it should generates false positive results. If yes than I'm
happy :-) with outside processing.

3) Add PageHeaderIsValid check also for write operation

In production it should catch problem with memory or software bugs. In
development it should catch memory overwriting.

Is there any evidence whatsoever to demonstrate that this is worth the
cycles it will eat?

Alex from clickware tries this modification to catch their problem with
random damaged database. But, I don't have any result yet.

Zdenek

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zdenek Kotala (#3)
Re: Proposal: Integrity check

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Tom Lane wrote:

This seems like a pretty horrid idea. Bad pages shouldn't be allowed to
get into shared buffers in the first place. Why not have the checking
logic operate outside shared buffers?

It currently works outside the shared buffers, but I afraid about
collision due to parallel read and write access on one block. I'm not
sure if parallel write(8k) and read(8k) is synchronized by kernel/fs or
not. If not it should generates false positive results. If yes than I'm
happy :-) with outside processing.

We're already assuming that; otherwise base backups for PITR don't work.

regards, tom lane

#5Zeugswetter Andreas ADI SD
Andreas.Zeugswetter@s-itsolutions.at
In reply to: Tom Lane (#4)
Re: Proposal: Integrity check

This seems like a pretty horrid idea. Bad pages shouldn't be

allowed to

get into shared buffers in the first place. Why not have the

checking

logic operate outside shared buffers?

It currently works outside the shared buffers, but I afraid about
collision due to parallel read and write access on one block. I'm

not

sure if parallel write(8k) and read(8k) is synchronized by kernel/fs

or

not. If not it should generates false positive results. If yes than

I'm

happy :-) with outside processing.

We're already assuming that; otherwise base backups for PITR
don't work.

I think we could, but iirc we did not. We do not need that assumption if
you don't
turn off fullpage writes. All pages that could potentially be changed
during the
backup exist in the WAL fullpages that have to be replayed.
Don't we even now allways write fullpages to WAL during backup exactly
because we
did not confirm that assumption ?

I think I recall times when some OS had trouble with this when you mixed
O_DIRECT (or was it also O_DATASYNC) and normal IO.

Andreas

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas ADI SD (#5)
Re: Proposal: Integrity check

"Zeugswetter Andreas ADI SD" <Andreas.Zeugswetter@s-itsolutions.at> writes:

We're already assuming that; otherwise base backups for PITR
don't work.

I think we could, but iirc we did not. We do not need that assumption if
you don't
turn off fullpage writes.

Oh, I had forgotten that RestoreBkpBlocks restores unconditionally.
Good point.

It's still the case that there's no need to allow pages into shared
buffers that don't pass the header-is-valid checks. If they don't
pass, then there's no way that bufmgr has a conflicting copy.

regards, tom lane

#7Gregory Stark
stark@enterprisedb.com
In reply to: Tom Lane (#4)
Re: Proposal: Integrity check

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

Zdenek Kotala <Zdenek.Kotala@Sun.COM> writes:

Tom Lane wrote:

This seems like a pretty horrid idea. Bad pages shouldn't be allowed to
get into shared buffers in the first place. Why not have the checking
logic operate outside shared buffers?

It currently works outside the shared buffers, but I afraid about
collision due to parallel read and write access on one block. I'm not
sure if parallel write(8k) and read(8k) is synchronized by kernel/fs or
not. If not it should generates false positive results. If yes than I'm
happy :-) with outside processing.

We're already assuming that; otherwise base backups for PITR don't work.

Don't full page writes protect us? I thought you had to start applying logs
forward from the point the backup started. Should we be forcing full page
writes during the base backup, perhaps treating the start of the base backup
as if it was a checkpoint as far as triggering full page writes?

I have little confidence that write(2) and read(2) calls are synchronized in
this manner when postgres's page size is larger than the filesystem block
size. Certainly I doubt someone issuing a write(2) of a few megabytes is going
to stop another process from being able to issue reads against the same file
until those megabytes are all in kernel cache, let alone if they overflow
cache and force i/o.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

#8Simon Riggs
simon@2ndquadrant.com
In reply to: Zdenek Kotala (#1)
Re: Proposal: Integrity check

On Fri, 2008-01-25 at 17:56 +0100, Zdenek Kotala wrote:

Regarding to Robert Mach's work during Google SOC on data integrity
check. I would like to improve storage module and implement some
Robert's code into the core.

I would like to make following modification:

1) Add ReadBuffer_noerror (recommend me better name) function which will
accept damaged page without Error. This page will be marked as corrupted
and when ReadBuffer will touch this page then it will be handled in
standard way.

This is important for check and repair functions to process all table
without interruption.

We shouldn't let duff data into shared buffers at all.

I think you could mix the two methods of reading buffers

- start a subtransaction
- read blocks into shared buffers
- if error, then re-read block into private memory and examine
- carry on thru table in a new subtransaction

OK with other points, except I don't want a new command. Let's do it as
a function that can accept block ranges to check, not just whole tables.
e.g. pg_check_blocks(17, 43) would check blocks 17 -> 43

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com

#9Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Simon Riggs (#8)
Re: Proposal: Integrity check

Simon Riggs wrote:

On Fri, 2008-01-25 at 17:56 +0100, Zdenek Kotala wrote:

Regarding to Robert Mach's work during Google SOC on data integrity
check. I would like to improve storage module and implement some
Robert's code into the core.

I would like to make following modification:

1) Add ReadBuffer_noerror (recommend me better name) function which will
accept damaged page without Error. This page will be marked as corrupted
and when ReadBuffer will touch this page then it will be handled in
standard way.

This is important for check and repair functions to process all table
without interruption.

We shouldn't let duff data into shared buffers at all.

As Tom mentioned before. I agree, it could cause a lot of problems.

I think you could mix the two methods of reading buffers

- start a subtransaction
- read blocks into shared buffers
- if error, then re-read block into private memory and examine
- carry on thru table in a new subtransaction

It seems like good idea.

OK with other points, except I don't want a new command. Let's do it as
a function that can accept block ranges to check, not just whole tables.
e.g. pg_check_blocks(17, 43) would check blocks 17 -> 43

It makes sense. I think following function should cover all cases:

pg_check_blocks() - all db
pg_check_blocks(relno) - all relation
pg_check_blocks(relno, start, stop) - selected interval
pg_check_blocks(relno, array of blocks) - selected blocks

Zdenek

#10Alvaro Herrera
alvherre@commandprompt.com
In reply to: Zdenek Kotala (#1)
Re: Proposal: Integrity check

Zdenek Kotala escribi�:

Regarding to Robert Mach's work during Google SOC on data integrity
check. I would like to improve storage module and implement some
Robert's code into the core.

Did this go anywhere?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#11Zdenek Kotala
Zdenek.Kotala@Sun.COM
In reply to: Alvaro Herrera (#10)
Re: Proposal: Integrity check

Alvaro Herrera napsal(a):

Zdenek Kotala escribi�:

Regarding to Robert Mach's work during Google SOC on data integrity
check. I would like to improve storage module and implement some
Robert's code into the core.

Did this go anywhere?

I did not catch May commit fest :(. I plan to send core improvement to next
commit fest.

Zdenek