16-bit page checksums for 9.2

Started by Simon Riggs over 14 years ago, 174 messages, hackers
#1Simon Riggs
simon@2ndQuadrant.com

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

page_checksums = on | off (default)

There are no required block changes; checksums are optional and some
blocks may have a checksum, others not. This means that the patch will
allow pg_upgrade.

That capability also limits us to 16-bit checksums. Fletcher's 16 is
used in this patch and seems rather quick, though that is easily
replaceable/tuneable if desired, perhaps even as a parameter enum.
This patch is a step on the way to 32-bit checksums in a future
redesign of the page layout, though that is not a required future
change, nor does this prevent that.
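
For readers unfamiliar with the algorithm named above, here is a minimal
Python sketch of the textbook Fletcher-16 checksum (modulus 255). It is an
illustration only, not the C code from the patch; the function name
fletcher16 is an assumption made for this example.

```python
def fletcher16(data: bytes) -> int:
    """Return the 16-bit Fletcher checksum of data (textbook modulus-255 form)."""
    sum1 = 0  # running sum of the bytes
    sum2 = 0  # running sum of the running sums, which makes the result order-sensitive
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    # Pack the two 8-bit sums into one 16-bit value.
    return (sum2 << 8) | sum1
```

With this definition, fletcher16(b"abcde") returns 0xC8F0. Because both sums
are taken modulo 255, a byte changing between 0x00 and 0xFF goes undetected,
which is one reason the choice of algorithm is described as replaceable.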

Checksum is set whenever the buffer is flushed to disk, and checked
when the page is read in from disk. It is not set at other times, and
for much of the time may not be accurate. This follows earlier
discussions from 2010-12-22, and is discussed in detail in patch
comments.

Note it works with buffer manager pages, which includes shared and
local data buffers, but not SLRU pages (yet? an easy addition but
needs other discussion around contention).

Note that all this does is detect bit errors on the page; it doesn't
identify where the error is or how bad it is, and definitely not what
caused it or when it happened.

The main body of the patch involves changes to bufpage.c/.h so this
differs completely from the VMware patch, for technical reasons. Also
included are facilities to LockBufferForHints() with usage in various
AMs, to avoid the case where hints are set during calculation of the
checksum.

In my view this is a fully working, committable patch but I'm not in a
hurry to do so given the holiday season.

Hopefully it's a gift not a turkey, and therefore a challenge for some
to prove that wrong. Enjoy either way,

Merry Christmas,

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

checksum16.v1.patch (text/x-patch; charset=US-ASCII; +376 -64)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#1)
Re: 16-bit page checksums for 9.2

Simon Riggs <simon@2ndQuadrant.com> writes:

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

I think locking around hint-bit-setting is likely to be unworkable from
a performance standpoint. I also wonder whether it might not result in
deadlocks.

Also, as far as I can see this patch usurps the page version field,
which I find unacceptably short-sighted. Do you really think this is
the last page layout change we'll ever make?

regards, tom lane

#3Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#2)
Re: 16-bit page checksums for 9.2

On Saturday, December 24, 2011 03:46:16 PM Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

I think locking around hint-bit-setting is likely to be unworkable from
a performance standpoint. I also wonder whether it might not result in
deadlocks.

Why don't you use the same tricks as the former patch and copy the buffer,
compute the checksum on that, and then write out that copy (you can even do
both at the same time). I have a hard time believing that the additional copy
is more expensive than the locking.
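
The copy-then-checksum idea suggested here can be sketched as follows. This
is a rough Python illustration under stated assumptions, not code from
either patch: the 8192-byte page size matches PostgreSQL's default, but
CHECKSUM_OFFSET, flush_page, and the use of an in-memory BytesIO object as
the "disk" are all hypothetical.

```python
import io
import struct

PAGE_SIZE = 8192
CHECKSUM_OFFSET = 8  # hypothetical position of the 16-bit checksum in the page


def fletcher16(data) -> int:
    """Textbook Fletcher-16 (modulus 255)."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def flush_page(shared_page: bytearray, disk: io.BytesIO, block_no: int) -> None:
    """Write one page using a private copy, so no lock is needed for hint bits."""
    # Snapshot the shared buffer; later hint-bit changes cannot alter this copy.
    private = bytearray(shared_page)
    # Compute the checksum over the copy with the checksum field zeroed.
    struct.pack_into("<H", private, CHECKSUM_OFFSET, 0)
    struct.pack_into("<H", private, CHECKSUM_OFFSET, fletcher16(private))
    # Write the copy, never the shared buffer itself.
    disk.seek(block_no * PAGE_SIZE)
    disk.write(private)
```

The trade-off discussed in the replies is visible here: the copy costs one
page-sized memcpy per write, whereas the locking approach pays on every
hint-bit update instead.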

Andres

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#2)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 2:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

I think locking around hint-bit-setting is likely to be unworkable from
a performance standpoint.

Anyone choosing page_checksums = on has already made a performance
reducing decision in favour of reliability. So they understand and
accept the impact. There is no locking when the parameter is off.

A safe alternative is to use LockBuffer, which has a much greater
performance impact.

I did think about optimistically checking after the write, but if we
crash at that point we will then see a block that has an invalid
checksum. It's faster but you may get a checksum failure if you crash
- but then one important aspect of this is to spot problems in case of
a crash, so that seems unacceptable.

I also wonder whether it might not result in
deadlocks.

If you can see how, please say. I can't see any ways for that myself.

Also, as far as I can see this patch usurps the page version field,
which I find unacceptably short-sighted.  Do you really think this is
the last page layout change we'll ever make?

No, I don't. I hope and expect the next page layout change to
reintroduce such a field.

But since we're agreed now that upgrading is important, changing page
format isn't likely to be happening until we get an online upgrade
process. So future changes are much less likely. If they do happen, we
have some flag bits spare that can be used to indicate later versions.
It's not the prettiest thing in the world, but it's a small ugliness
in return for an important feature. If there was a way without that, I
would have chosen it.

pg_filedump will need to be changed more than normal, but the version
isn't used anywhere else in the server code.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#5Simon Riggs
simon@2ndQuadrant.com
In reply to: Andres Freund (#3)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 3:54 PM, Andres Freund <andres@anarazel.de> wrote:

On Saturday, December 24, 2011 03:46:16 PM Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

I think locking around hint-bit-setting is likely to be unworkable from
a performance standpoint.  I also wonder whether it might not result in
deadlocks.

Why don't you use the same tricks as the former patch and copy the buffer,
compute the checksum on that, and then write out that copy (you can even do
both at the same time). I have a hard time believing that the additional copy
is more expensive than the locking.

We would copy every time we write, yet lock only every time we set hint bits.

If that option is favoured, I'll write another version after Christmas.

ISTM we can't write and copy at the same time because the checksum is
not a trailer field.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#6Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#1)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 3:51 PM, Aidan Van Dyk <aidan@highrise.ca> wrote:

Not an expert here, but after reading through the patch quickly, I
don't see anything that changes the torn-page problem though, right?

Hint bits aren't wal-logged, and FPW isn't forced on the hint-bit-only
dirty, right?

Checksums merely detect a problem, whereas FPWs correct a problem if
it happens, but only in crash situations.

So this does nothing to remove the need for FPWs, though checksum
detection could be used for double write buffers also.

Checksums work even when there is no crash, so if your disk goes bad
and corrupts data then you'll know about it as soon as it happens.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#7Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#5)
Re: 16-bit page checksums for 9.2

On Saturday, December 24, 2011 05:01:02 PM Simon Riggs wrote:

On Sat, Dec 24, 2011 at 3:54 PM, Andres Freund <andres@anarazel.de> wrote:

On Saturday, December 24, 2011 03:46:16 PM Tom Lane wrote:

Simon Riggs <simon@2ndQuadrant.com> writes:

After the various recent discussions on list, I present what I believe
to be a working patch implementing 16-bit checksums on all buffer
pages.

I think locking around hint-bit-setting is likely to be unworkable from
a performance standpoint. I also wonder whether it might not result in
deadlocks.

Why don't you use the same tricks as the former patch and copy the
buffer, compute the checksum on that, and then write out that copy (you
can even do both at the same time). I have a hard time believing that
the additional copy is more expensive than the locking.

We would copy every time we write, yet lock only every time we set hint
bits.

Isn't setting hint bits also a rather frequent operation? At least in a well-
cached workload where most writeout happens due to checkpoints.

If that option is favoured, I'll write another version after Christmas.

Seems less complicated (wrt deadlocking et al) to me. But I haven't read your
patch, so I will shut up now ;)

Andres

#8Greg Stark
stark@mit.edu
In reply to: Simon Riggs (#6)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 4:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Checksums merely detect a problem, whereas FPWs correct a problem if
it happens, but only in crash situations.

So this does nothing to remove the need for FPWs, though checksum
detection could be used for double write buffers also.

This is missing the point. If you have a torn page on a page that is
only dirty due to hint bits then the checksum will show a spurious
checksum failure. It will "detect" a problem that isn't there.

The problem is that there is no WAL indicating the hint bit change.
And if the torn page includes the new checksum but not the new hint
bit or vice versa it will be a checksum mismatch.

The strategy discussed in the past was moving all the hint bits to a
common area and skipping them in the checksum. No amount of double
writing or buffering or locking will avoid this problem.
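
The torn-page scenario and the "skip the hint bits" strategy described above
can be illustrated with a small simulation. This is a sketch under loud
assumptions: the 512-byte sector size, the checksum offset, and a hint-bit
area in the last 32 bytes of the page are all hypothetical and do not
reflect PostgreSQL's actual page layout.

```python
import struct

PAGE_SIZE = 8192
SECTOR_SIZE = 512
CHECKSUM_OFFSET = 8                           # hypothetical: inside the first sector
HINT_AREA = slice(PAGE_SIZE - 32, PAGE_SIZE)  # hypothetical common hint-bit area


def fletcher16(data) -> int:
    """Textbook Fletcher-16 (modulus 255)."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def checksum(page, skip_hints: bool) -> int:
    scratch = bytearray(page)
    struct.pack_into("<H", scratch, CHECKSUM_OFFSET, 0)  # never cover the field itself
    if skip_hints:
        scratch[HINT_AREA] = bytes(HINT_AREA.stop - HINT_AREA.start)  # mask hint bits
    return fletcher16(scratch)


def stamp(page, skip_hints: bool) -> bytes:
    out = bytearray(page)
    struct.pack_into("<H", out, CHECKSUM_OFFSET, checksum(out, skip_hints))
    return bytes(out)


def verify(page, skip_hints: bool) -> bool:
    (stored,) = struct.unpack_from("<H", page, CHECKSUM_OFFSET)
    return stored == checksum(page, skip_hints)


def torn_write_survives(skip_hints: bool) -> bool:
    """Simulate a crash after only the first sector of a hint-only write hit disk."""
    old = bytearray(PAGE_SIZE)
    old[1000] = 0x5A                    # some pre-existing data
    old = stamp(old, skip_hints)        # the image currently on disk
    new = bytearray(old)
    new[PAGE_SIZE - 1] |= 0x01          # set a hint bit; no WAL record is written
    new = stamp(new, skip_hints)
    torn = new[:SECTOR_SIZE] + old[SECTOR_SIZE:]  # new checksum, old hint bit
    return verify(torn, skip_hints)
```

In this model torn_write_survives(False) is False (a spurious failure on a
page whose data is intact), while torn_write_survives(True) is True because
the hint area never contributes to the checksum.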

--
greg

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Greg Stark (#8)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 8:06 PM, Greg Stark <stark@mit.edu> wrote:

On Sat, Dec 24, 2011 at 4:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Checksums merely detect a problem, whereas FPWs correct a problem if
it happens, but only in crash situations.

So this does nothing to remove the need for FPWs, though checksum
detection could be used for double write buffers also.

This is missing the point. If you have a torn page on a page that is
only dirty due to hint bits then the checksum will show a spurious
checksum failure. It will "detect" a problem that isn't there.

It will detect a problem that *is* there, but one you are classifying
as a non-problem because it is a correctable or acceptable bit
error. Given that acceptable bit errors on hints cover no more than 1%
of a block, the great likelihood is that the bit error is unacceptable
in any case, so false-positive page errors are in fact very rare.

Any bit error is an indicator of problems on the external device, so
many would regard any bit error as unacceptable.

The problem is that there is no WAL indicating the hint bit change.
And if the torn page includes the new checksum but not the new hint
bit or vice versa it will be a checksum mismatch.

The strategy discussed in the past was moving all the hint bits to a
common area and skipping them in the checksum. No amount of double
writing or buffering or locking will avoid this problem.

I completely agree we should do this, but we are unable to do it now,
so this patch is a stop-gap and provides a much requested feature
*now*.

In the future, we will be able to tell the difference between an
acceptable and an unacceptable bit error. Right now, all we have is
the ability to detect a bit error and as I point out above that is 99%
of the problem solved, at least.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#10Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#9)
Re: 16-bit page checksums for 9.2

Simon Riggs wrote:
On Sat, Dec 24, 2011 at 8:06 PM, Greg Stark wrote:

The problem is that there is no WAL indicating the hint bit
change. And if the torn page includes the new checksum but not the
new hint bit or vice versa it will be a checksum mismatch.

With *just* this patch, true. An OS crash or hardware failure could
sometimes create an invalid page.

The strategy discussed in the past was moving all the hint bits to
a common area and skipping them in the checksum. No amount of
double writing or buffering or locking will avoid this problem.

I don't believe that. Double-writing is a technique to avoid torn
pages, but it requires a checksum to work. This chicken-and-egg
problem requires the checksum to be implemented first.

I completely agree we should do this, but we are unable to do it
now, so this patch is a stop-gap and provides a much requested
feature *now*.

Yes, for people who trust their environment to prevent torn pages, or
who are willing to tolerate one bad page per OS crash in return for
quick reporting of data corruption from unreliable file systems, this
is a good feature even without double-writes.

In the future, we will be able to tell the difference between an
acceptable and an unacceptable bit error.

A double-write patch would provide that, and it sounds like VMware
has a working patch for that which is being polished for submission.
It would need to wait until we have some consensus on the checksum
patch before it can be finalized. I'll try to review the patch from
this thread today, to do what I can to move that along.

-Kevin

#11Martijn van Oosterhout
kleptog@svana.org
In reply to: Simon Riggs (#5)
Re: 16-bit page checksums for 9.2

On Sat, Dec 24, 2011 at 04:01:02PM +0000, Simon Riggs wrote:

On Sat, Dec 24, 2011 at 3:54 PM, Andres Freund <andres@anarazel.de> wrote:

Why don't you use the same tricks as the former patch and copy the buffer,
compute the checksum on that, and then write out that copy (you can even do
both at the same time). I have a hard time believing that the additional copy
is more expensive than the locking.

ISTM we can't write and copy at the same time because the checksum is
not a trailer field.

Of course you can. If the checksum is in the trailer field you get the
nice property that the whole block has a constant checksum. However, if
you store the checksum elsewhere you just need to change the checking
algorithm to copy the checksum out, zero those bytes and run the
checksum and compare with the extracted checksum.

Not pretty, but I don't think it makes a difference in performance.
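
The "copy the checksum out, zero those bytes, recompute, compare" check
described above might look like the following minimal Python sketch. The
checksum offset and the function names are assumptions made for the
example; the thread does not say where the patch stores the value.

```python
import struct

PAGE_SIZE = 8192
CHECKSUM_OFFSET = 8  # hypothetical: the checksum sits in the header, not a trailer


def fletcher16(data) -> int:
    """Textbook Fletcher-16 (modulus 255)."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


def set_checksum(page: bytearray) -> None:
    """Store the checksum of the page, computed with the field itself zeroed."""
    struct.pack_into("<H", page, CHECKSUM_OFFSET, 0)
    struct.pack_into("<H", page, CHECKSUM_OFFSET, fletcher16(page))


def verify_checksum(page: bytes) -> bool:
    """Extract the stored checksum, zero its bytes in a copy, and recompute."""
    scratch = bytearray(page)
    (stored,) = struct.unpack_from("<H", scratch, CHECKSUM_OFFSET)
    struct.pack_into("<H", scratch, CHECKSUM_OFFSET, 0)
    return fletcher16(scratch) == stored
```

Zeroing the field on both the write and the read side is what lets the
checksum live anywhere in the page, at the cost of the "constant checksum
over the whole block" property that a trailer field would give.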

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

He who writes carelessly confesses thereby at the very outset that he does
not attach much importance to his own thoughts.

-- Arthur Schopenhauer

#12Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#9)
Re: 16-bit page checksums for 9.2

On Sun, Dec 25, 2011 at 5:08 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Sat, Dec 24, 2011 at 8:06 PM, Greg Stark <stark@mit.edu> wrote:

On Sat, Dec 24, 2011 at 4:06 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

Checksums merely detect a problem, whereas FPWs correct a problem if
it happens, but only in crash situations.

So this does nothing to remove the need for FPWs, though checksum
detection could be used for double write buffers also.

This is missing the point. If you have a torn page on a page that is
only dirty due to hint bits then the checksum will show a spurious
checksum failure. It will "detect" a problem that isn't there.

It will detect a problem that *is* there, but one you are classifying
as a non-problem because it is a correctable or acceptable bit
error.

I don't agree with this. We don't WAL-log hint bit changes precisely
because it's OK if they make it to disk and it's OK if they don't.
Given that, I don't see how we can say that writing out only half of a
page that has had hint bit changes is a problem. It's not.

(And if it is, then we ought to WAL-log all such changes regardless of
whether CRCs are in use.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#13Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Kevin Grittner (#10)
Re: 16-bit page checksums for 9.2

On 25.12.2011 15:01, Kevin Grittner wrote:

I don't believe that. Double-writing is a technique to avoid torn
pages, but it requires a checksum to work. This chicken-and-egg
problem requires the checksum to be implemented first.

I don't think double-writes require checksums on the data pages
themselves, just on the copies in the double-write buffers. In the
double-write buffer, you'll need some extra information per-page anyway,
like a relfilenode and block number that indicates which page it is in
the buffer.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#13)
Re: 16-bit page checksums for 9.2

On Tue, Dec 27, 2011 at 8:05 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 25.12.2011 15:01, Kevin Grittner wrote:

I don't believe that.  Double-writing is a technique to avoid torn
pages, but it requires a checksum to work.  This chicken-and-egg
problem requires the checksum to be implemented first.

I don't think double-writes require checksums on the data pages themselves,
just on the copies in the double-write buffers. In the double-write buffer,
you'll need some extra information per-page anyway, like a relfilenode and
block number that indicates which page it is in the buffer.

How would you know when to look in the double write buffer?

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#15Simon Riggs
simon@2ndQuadrant.com
In reply to: Kevin Grittner (#10)
Re: 16-bit page checksums for 9.2

On Sun, Dec 25, 2011 at 1:01 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:

This chicken-and-egg
problem requires the checksum to be implemented first.

v2 of checksum patch, using a conditional copy if checksumming is
enabled, so locking is removed.

Thanks to Andres for thwacking me with the cluestick, though I have
used a simple copy rather than a copy & calc.

Tested using make installcheck with parameter on/off, then restart and
vacuumdb to validate all pages.

Reviews, objections, user interface tweaks all welcome.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

checksum16.v2.patch (text/x-patch; charset=US-ASCII; +383 -71)

#16Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#14)
Re: 16-bit page checksums for 9.2

On 28.12.2011 01:39, Simon Riggs wrote:

On Tue, Dec 27, 2011 at 8:05 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 25.12.2011 15:01, Kevin Grittner wrote:

I don't believe that. Double-writing is a technique to avoid torn
pages, but it requires a checksum to work. This chicken-and-egg
problem requires the checksum to be implemented first.

I don't think double-writes require checksums on the data pages themselves,
just on the copies in the double-write buffers. In the double-write buffer,
you'll need some extra information per-page anyway, like a relfilenode and
block number that indicates which page it is in the buffer.

How would you know when to look in the double write buffer?

You scan the double-write buffer, and every page in the double write
buffer that has a valid checksum, you copy to the main storage. There's
no need to check validity of pages in the main storage.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#17Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#16)
Re: 16-bit page checksums for 9.2

On Wed, Dec 28, 2011 at 7:42 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

How would you know when to look in the double write buffer?

You scan the double-write buffer, and every page in the double write buffer
that has a valid checksum, you copy to the main storage. There's no need to
check validity of pages in the main storage.

OK, then we are talking at cross purposes. Double write buffers, in
the way you explain them, allow us to remove full page writes. They
clearly don't do anything to check page validity on read. Torn pages
are not the only fault we wish to correct against... and the double
writes idea is orthogonal to the idea of checksums.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#18Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#17)
Re: 16-bit page checksums for 9.2

On 28.12.2011 11:22, Simon Riggs wrote:

On Wed, Dec 28, 2011 at 7:42 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

How would you know when to look in the double write buffer?

You scan the double-write buffer, and every page in the double write buffer
that has a valid checksum, you copy to the main storage. There's no need to
check validity of pages in the main storage.

OK, then we are talking at cross purposes. Double write buffers, in
the way you explain them, allow us to remove full page writes. They
clearly don't do anything to check page validity on read. Torn pages
are not the only fault we wish to correct against... and the double
writes idea is orthogonal to the idea of checksums.

The reason we're talking about double write buffers in this thread is
that double write buffers can be used to solve the problem with hint
bits and checksums.

You're right, though, that it's academic whether double write buffers
can be used without checksums on data pages, if the whole point of the
exercise is to make it possible to have checksums on data pages.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#19Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#18)
Re: 16-bit page checksums for 9.2

On Wed, Dec 28, 2011 at 5:45 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 28.12.2011 11:22, Simon Riggs wrote:

On Wed, Dec 28, 2011 at 7:42 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com>  wrote:

How would you know when to look in the double write buffer?

You scan the double-write buffer, and every page in the double write buffer
that has a valid checksum, you copy to the main storage. There's no need to
check validity of pages in the main storage.

OK, then we are talking at cross purposes. Double write buffers, in
the way you explain them, allow us to remove full page writes. They
clearly don't do anything to check page validity on read. Torn pages
are not the only fault we wish to correct against... and the double
writes idea is orthogonal to the idea of checksums.

The reason we're talking about double write buffers in this thread is that
double write buffers can be used to solve the problem with hint bits and
checksums.

Torn pages are not the only problem we need to detect.

You said "You scan the double write buffer...". When exactly would you do that?

Please explain how a double write buffer detects problems that do not
occur as the result of a crash.

We don't have much time, so please be clear and lucid.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#20Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Simon Riggs (#19)
Re: 16-bit page checksums for 9.2

Heikki Linnakangas wrote:
On 28.12.2011 01:39, Simon Riggs wrote:

On Tue, Dec 27, 2011 at 8:05 PM, Heikki Linnakangas
wrote:

On 25.12.2011 15:01, Kevin Grittner wrote:

I don't believe that. Double-writing is a technique to avoid
torn pages, but it requires a checksum to work. This chicken-
and-egg problem requires the checksum to be implemented first.

I don't think double-writes require checksums on the data pages
themselves, just on the copies in the double-write buffers. In
the double-write buffer, you'll need some extra information per-
page anyway, like a relfilenode and block number that indicates
which page it is in the buffer.

You are clearly right -- if there is no checksum in the page itself,
you can put one in the double-write metadata. I've never seen that
discussed before, but I'm embarrassed that it never occurred to me.

How would you know when to look in the double write buffer?

You scan the double-write buffer, and every page in the double
write buffer that has a valid checksum, you copy to the main
storage. There's no need to check validity of pages in the main
storage.

Right. I'll recap my understanding of double-write (from memory --
if there's a material error or omission, I hope someone will correct
me).

The write-ups I've seen on double-write techniques have all the
writes go first to the double-write buffer (a single, sequential file that
stays around). This is done as sequential writing to a file which is
overwritten pretty frequently, making the writes to a controller very
fast, and a BBU write-back cache unlikely to actually write to disk
very often. On good server-quality hardware, it should be blasting
RAM-to-RAM very efficiently. The file is fsync'd (like I said,
hopefully to BBU cache), then each page in the double-write buffer is
written to the normal page location, and that is fsync'd. Once that
is done, the database writes have no risk of being torn, and the
double-write buffer is marked as empty. This all happens at the
point when you would be writing the page to the database, after the
WAL-logging.

On crash recovery you read through the double-write buffer from the
start and write the pages which look good (including a good checksum)
to the database before replaying WAL. If you find a checksum error
in processing the double-write buffer, you assume that you never got
as far as the fsync of the double-write buffer, which means you never
started writing the buffer contents to the database, which means
there can't be any torn pages there. If you get to the end and
fsync, you can be sure any torn pages from a previous attempt to
write to the database itself have been overwritten with the good copy
in the double-write buffer. Either way, you move on to WAL
processing.
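
The write path and crash recovery recapped above can be modelled in a few
lines of Python. This is a toy sketch, not the VMware patch: the class and
method names are hypothetical, fsync calls are only marked by comments, tiny
64-byte pages keep the example readable, and the checksum is kept in the
double-write metadata as suggested earlier in the thread. Stopping at the
first bad entry is one reading of the recovery rule described above.

```python
PAGE_SIZE = 64  # tiny pages for readability; real pages are much larger


def fletcher16(data) -> int:
    """Textbook Fletcher-16 (modulus 255)."""
    sum1 = sum2 = 0
    for byte in data:
        sum1 = (sum1 + byte) % 255
        sum2 = (sum2 + sum1) % 255
    return (sum2 << 8) | sum1


class DoubleWriteStore:
    def __init__(self):
        self.dw_buffer = []  # entries of (block_no, checksum, page image)
        self.main = {}       # block_no -> page image (the "database")

    def stage(self, block_no: int, page: bytes) -> None:
        """Step 1: append the page and its checksum to the double-write buffer."""
        self.dw_buffer.append((block_no, fletcher16(page), bytes(page)))
        # A real system would fsync the double-write file here.

    def write_back(self) -> None:
        """Step 2: write each staged page to its normal location, then empty the buffer."""
        for block_no, _, page in self.dw_buffer:
            self.main[block_no] = page
        # A real system would fsync main storage before marking the buffer empty.
        self.dw_buffer.clear()

    def recover(self) -> int:
        """Crash recovery, run before WAL replay; returns the number of pages restored."""
        restored = 0
        for block_no, cksum, page in self.dw_buffer:
            if fletcher16(page) != cksum:
                # The buffer itself was never fully fsync'd, so the write-back
                # never started and main storage cannot contain torn pages.
                break
            self.main[block_no] = page  # overwrite any torn copy with the good one
            restored += 1
        self.dw_buffer.clear()
        return restored
```

A crash during write_back leaves a possibly torn page in main and a good
copy in dw_buffer, which recover() restores; a crash during stage leaves a
bad entry in dw_buffer and an untouched main.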

You wind up with a database free of torn pages before you apply WAL.
full_page_writes to the WAL are not needed as long as double-write is
used for any pages which would have been written to the WAL. If
checksums were written to the double-buffer metadata instead of
adding them to the page itself, this could be implemented alone. It
would probably allow a modest speed improvement over using
full_page_writes and would eliminate those full-page images from the
WAL files, making them smaller.

If we do add a checksum to the page header, that could be used for
testing for torn pages in the double-write buffer without needing a
redundant calculation for double-write. With no torn pages in the
actual database, checksum failures there would never be false
positives. To get this right for a checksum in the page header,
double-write would need to be used for all cases where
full_page_writes now are used (i.e., the first write of a page after
a checkpoint), and for all unlogged writes (e.g., hint-bit-only
writes). There would be no correctness problem for always using
double-write, but it would be unnecessary overhead for other page
writes, which I think we can avoid.

-Kevin

albert@nan-tic.com
In reply to: Simon Riggs (#105)
#107Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#105)
#108Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#107)
#109Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#108)
#110Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#107)
#111Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#110)
#112Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#105)
#113Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#110)
#114Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#113)
#115Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#113)
#116Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#114)
#117Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#115)
#118Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#117)
#119Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#118)
#120Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#119)
#121Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#120)
#122Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#121)
#123Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#121)
#124Bruce Momjian
bruce@momjian.us
In reply to: Robert Haas (#113)
#125Simon Riggs
simon@2ndQuadrant.com
In reply to: Bruce Momjian (#124)
#126Bruce Momjian
bruce@momjian.us
In reply to: Simon Riggs (#125)
#127Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#119)
#128Simon Riggs
simon@2ndQuadrant.com
In reply to: Noah Misch (#127)
#129Noah Misch
noah@leadboat.com
In reply to: Robert Haas (#113)
#130Simon Riggs
simon@2ndQuadrant.com
In reply to: Noah Misch (#129)
#131Robert Haas
robertmhaas@gmail.com
In reply to: Noah Misch (#127)
#132Peter Geoghegan
In reply to: Robert Haas (#131)
#133David Fetter
david@fetter.org
In reply to: Peter Geoghegan (#132)
#134Robert Haas
robertmhaas@gmail.com
In reply to: Peter Geoghegan (#132)
#135Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#130)
#136Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#135)
#137Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#136)
#138Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#137)
#139Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#138)
#140Simon Riggs
simon@2ndQuadrant.com
In reply to: Heikki Linnakangas (#139)
#141Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#140)
#142Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#141)
#143Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#142)
#144Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#143)
#145Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Simon Riggs (#144)
#146Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Heikki Linnakangas (#145)
#147Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#145)
#148Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Alvaro Herrera (#146)
#149Robert Haas
robertmhaas@gmail.com
In reply to: Alvaro Herrera (#146)
#150Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Robert Haas (#149)
#151Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Heikki Linnakangas (#148)
#152Robert Haas
robertmhaas@gmail.com
In reply to: Heikki Linnakangas (#150)
#153Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#152)
#154Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#153)
#155Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#153)
#156Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#155)
#157Jim Nasby
Jim.Nasby@BlueTreble.com
In reply to: Alvaro Herrera (#154)
#158Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#157)
#159Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#156)
#160Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#159)
#161Robert Haas
robertmhaas@gmail.com
In reply to: Josh Berkus (#160)
#162Josh Berkus
josh@agliodbs.com
In reply to: Robert Haas (#161)
#163Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#161)
#164Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#163)
#165Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Robert Haas (#164)
#166Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#165)
#167Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#166)
#168Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#167)
#169Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#168)
In reply to: Robert Haas (#134)
#171Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#168)
#172Jeff Janes
jeff.janes@gmail.com
In reply to: Simon Riggs (#112)
#173Jeff Davis
pgsql@j-davis.com
In reply to: Simon Riggs (#112)
#174Jeff Davis
pgsql@j-davis.com
In reply to: Simon Riggs (#112)