Substituting Checksum Algorithm (was: Enabling Checksums)

Started by Jeff Davisalmost 13 years ago29 messages
Jump to latest
#1Jeff Davis
pgsql@j-davis.com

Starting a new thread to more narrowly address the topic.

Attached is my reorganization of Ants's patch here:

/messages/by-id/CA
+CSw_vinyf-w45i=M1m__MpJZY=e8S4Nt_KNnpEbtWjTOaSUA@mail.gmail.com

My changes:

* wrest the core FNV algorithm from the specific concerns of a data page
- PageCalcChecksum16 mixes the block number, reduces to 16 bits,
and avoids the pd_checksum field
- the checksum algorithm is just a pure block checksum with a 32-bit
result
* moved the FNV checksum into a separate file, checksum.c
* added Ants's suggested compilation flags for better optimization
* slight update to the paragraph in the README that discusses concerns
specific to a data page

I do have a couple questions/concerns about Ants's patch though:

* The README mentions a slight bias; does that come from the mod
(2^16-1)? That's how I updated the README, so I wanted to make sure.
* How was the FNV_PRIME chosen?
* I can't match the individual algorithm step as described in the README
to the actual code. And the comments in the README don't make it clear
enough which one is right (or maybe they both are, and I'm just tired).

The README says:

hash = (hash ^ value) * ((hash ^ value) >> 3)

(which is obviously missing the FNV_PRIME factor) and the code says:

   +#define CHECKSUM_COMP(checksum, value) do {\
   +       uint32 __tmp = (checksum) ^ (value);\
   +       (checksum) = __tmp * FNV_PRIME ^ (__tmp >> 3);\
   +} while (0)

I'm somewhat on the fence about the "shift right". It was discussed in
this part of the thread:

/messages/by-id/99343716-5F5A-45C8-B2F6-74B9BA357396@phlo.org

I think we should be able to state with a little more clarity in the
README why there is a problem with plain FNV-1a, and why this
modification is both effective and safe.

I'd lean toward simplicity and closer adherence to the published version
of the algorithm rather than detecting a few more obscure error
patterns. It looks like the modification slows down the algorithm, too.

Regards,
Jeff Davis

Attachments:

fnv-jeff-20130422.patchtext/x-patch; charset=UTF-8; name=fnv-jeff-20130422.patchDownload+185-35
#2Florian Pflug
fgp@phlo.org
In reply to: Jeff Davis (#1)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Apr23, 2013, at 09:17 , Jeff Davis <pgsql@j-davis.com> wrote:

I'd lean toward simplicity and closer adherence to the published version
of the algorithm rather than detecting a few more obscure error
patterns. It looks like the modification slows down the algorithm, too.

The pattern that plain FNV1 misses are not that obscure, unfortunately.
With plain FNV1A, the n-th bit of an input word (i.e. 32-bit block) only
affects bits n through 31 of the checksum. In particular, the most
significant bit of every 32-bit block only affects the MSB of the checksum,
making the algorithm miss any even number of flipped MSBs. More generally,
any form of data corruption that affects only the top N bits are missed
at least once out of 2^N times, since changing only those bits cannot
yield more than 2^N different checksum values.

Such corruption pattern may not be especially likely, given that we're
mainly protecting against disk corruption, not memory corruption. But
quantifying how unlikely exactly seems hard, thus providing at least some
form of protection against such errors seems prudent.

In addition, even with the shift-induced slowdown, FNV1+SHIFT still
performs similarly to hardware-accelerated CRC, reaching about 6bytes/cycle
on modern Intel CPUS. This really is plenty fast - if I'm not mistaken, it
translates to well over 10 GB/s.

So overall -1 for removing the shift.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Ants Aasma
ants.aasma@cybertec.at
In reply to: Jeff Davis (#1)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Apr 23, 2013 10:17 AM, "Jeff Davis" <pgsql@j-davis.com> wrote:

Attached is my reorganization of Ants's patch here:

/messages/by-id/CA
+CSw_vinyf-w45i=M1m__MpJZY=e8S4Nt_KNnpEbtWjTOaSUA@mail.gmail.com

Thanks for your help. Some notes below.

My changes:

* wrest the core FNV algorithm from the specific concerns of a data page
- PageCalcChecksum16 mixes the block number, reduces to 16 bits,
and avoids the pd_checksum field
- the checksum algorithm is just a pure block checksum with a 32-bit
result
* moved the FNV checksum into a separate file, checksum.c

I think the function should not be called checksum_fnv as it implies that
we use the well known straightforward implementation. Maybe checksum_block
or some other generic name.

* added Ants's suggested compilation flags for better optimization

-msse4.1 is not safe to use in default builds. On the other hand it doesn't
hurt to just specify it in CFLAGS for the whole compile (possibly as
-march=native). We should just omit it and mention somewhere that SSE4.1
enabled builds will have better checksum performance.

* slight update to the paragraph in the README that discusses concerns
specific to a data page

I do have a couple questions/concerns about Ants's patch though:

* The README mentions a slight bias; does that come from the mod
(2^16-1)? That's how I updated the README, so I wanted to make sure.

Yes.

* How was the FNV_PRIME chosen?

I still haven't found the actual source for this value. It's specified as
the value to use for 32bit FNV-1a.

* I can't match the individual algorithm step as described in the README
to the actual code. And the comments in the README don't make it clear
enough which one is right (or maybe they both are, and I'm just tired).

I will try to reword.

The README says:

hash = (hash ^ value) * ((hash ^ value) >> 3)

(which is obviously missing the FNV_PRIME factor) and the code says:

+#define CHECKSUM_COMP(checksum, value) do {\
+       uint32 __tmp = (checksum) ^ (value);\
+       (checksum) = __tmp * FNV_PRIME ^ (__tmp >> 3);\
+} while (0)

I'm somewhat on the fence about the "shift right". It was discussed in
this part of the thread:

/messages/by-id/99343716-5F5A-45C8-B2F6-74B9BA357396@phlo.org

I think we should be able to state with a little more clarity in the
README why there is a problem with plain FNV-1a, and why this
modification is both effective and safe.

Florian already mentioned why it's effective. I have an intuition why it's
safe, will try to come up with a well reasoned argument.

Regards,
Antd Aasma

#4Andres Freund
andres@anarazel.de
In reply to: Jeff Davis (#1)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 2013-04-23 00:17:28 -0700, Jeff Davis wrote:

+ # important optimization flags for checksum.c
+ ifeq ($(GCC),yes)
+ checksum.o: CFLAGS += -msse4.1 -funroll-loops -ftree-vectorize
+ endif

I am pretty sure we can't do those unconditionally:
- -funroll-loops and -ftree-vectorize weren't always part of gcc afair,
so we would need a configure check for those
- SSE4.1 looks like a total no-go, its not available everywhere. We
*can* add runtime detection of that with gcc fairly easily and
one-time if we wan't to go there (later?) using 'ifunc's, but that
needs a fair amount of infrastructure work.
- We can rely on SSE1/2 on amd64, but I think thats automatically
enabled there.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Ants Aasma
ants.aasma@cybertec.at
In reply to: Andres Freund (#4)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Tue, Apr 23, 2013 at 11:47 AM, Andres Freund <andres@2ndquadrant.com> wrote:

On 2013-04-23 00:17:28 -0700, Jeff Davis wrote:

+ # important optimization flags for checksum.c
+ ifeq ($(GCC),yes)
+ checksum.o: CFLAGS += -msse4.1 -funroll-loops -ftree-vectorize
+ endif

I am pretty sure we can't do those unconditionally:
- -funroll-loops and -ftree-vectorize weren't always part of gcc afair,
so we would need a configure check for those

-funroll-loops is available from at least GCC 2.95. -ftree-vectorize
is GCC 4.0+. From what I read from the documentation on ICC -axSSE4.1
should generate a plain and accelerated version and do a runtime
check., I don't know if ICC vectorizes the specific loop in the patch,
but I would expect it to given that Intels vectorization has generally
been better than GCCs and the loop is about as simple as it gets. I
don't know the relevant options for other compilers.

- SSE4.1 looks like a total no-go, its not available everywhere. We
*can* add runtime detection of that with gcc fairly easily and
one-time if we wan't to go there (later?) using 'ifunc's, but that
needs a fair amount of infrastructure work.
- We can rely on SSE1/2 on amd64, but I think thats automatically
enabled there.

This is why I initially went for the lower strength 16bit checksum
calculation - requiring only SSE2 would have made supporting the
vectorized version on amd64 trivial. By now my feeling is that it's
not prudent to compromise in quality to save some infrastructure
complexity. If we set a hypothetical VECTORIZATION_FLAGS variable at
configure time, the performance is still there for those who need it
and can afford CPU specific builds.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Jeff Davis
pgsql@j-davis.com
In reply to: Ants Aasma (#3)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Tue, 2013-04-23 at 11:44 +0300, Ants Aasma wrote:

I will try to reword.

Did you have a chance to clarify this, as well as some of the other
documentation issues Simon mentioned here?

/messages/by-id/CA+U5nMKVEu8UDXQe
+Nk=d7Nqm4ypFSzaEf0esaK4j31LyQCOBQ@mail.gmail.com

I'm not sure if others are waiting on me for a new patch or not. I can
give the documentation issues a try, but I was hesitant to do so because
you've done the research.

The problems that I can correct are fairly trivial.

Regards,
Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Ants Aasma
ants.aasma@cybertec.at
In reply to: Jeff Davis (#6)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Apr 25, 2013 10:38 PM, "Jeff Davis" <pgsql@j-davis.com> wrote:

On Tue, 2013-04-23 at 11:44 +0300, Ants Aasma wrote:

I will try to reword.

Did you have a chance to clarify this, as well as some of the other
documentation issues Simon mentioned here?

/messages/by-id/CA+U5nMKVEu8UDXQe
+Nk=d7Nqm4ypFSzaEf0esaK4j31LyQCOBQ@mail.gmail.com

Not yet. I am busy at the moment due to events in private life. I might be
able to find some time over the weekend to go over the documentation but no
guarantees.

I'm not sure if others are waiting on me for a new patch or not. I can
give the documentation issues a try, but I was hesitant to do so because
you've done the research.

The problems that I can correct are fairly trivial.

The unresolved code issue that I know of is moving the compiler flags
behind a configure check. I would greatly appreciate it if you could take a
look at that. My config-fu is weak and it would take me some time to figure
out how to do that.

Regards,
Ants Aasma

#8Florian Pflug
fgp@phlo.org
In reply to: Ants Aasma (#7)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Apr26, 2013, at 10:28 , Ants Aasma <ants.aasma@eesti.ee> wrote:

On Apr 25, 2013 10:38 PM, "Jeff Davis" <pgsql@j-davis.com> wrote:

On Tue, 2013-04-23 at 11:44 +0300, Ants Aasma wrote:

I will try to reword.

Did you have a chance to clarify this, as well as some of the other
documentation issues Simon mentioned here?

/messages/by-id/CA+U5nMKVEu8UDXQe
+Nk=d7Nqm4ypFSzaEf0esaK4j31LyQCOBQ@mail.gmail.com

Not yet. I am busy at the moment due to events in private life. I might be able to find some time over the weekend to go over the documentation but no guarantees.

I can try to write up the reasoning behind the choice of FNV1+SHIFT3 as a checksum function, but I'm quite busy too so I'm not 100% certain I'll get to it. If that's OK with you Ants, that is.

I'm not sure if others are waiting on me for a new patch or not. I can
give the documentation issues a try, but I was hesitant to do so because
you've done the research.

The problems that I can correct are fairly trivial.

The unresolved code issue that I know of is moving the compiler flags behind a configure check. I would greatly appreciate it if you could take a look at that. My config-fu is weak and it would take me some time to figure out how to do that.

Do we necessarily have to do that before beta? If not, let's concentrate on getting the basic path in, and let's add the gcc-specific compiler options later.

best regards,
Florian Pflug

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Andres Freund
andres@anarazel.de
In reply to: Florian Pflug (#8)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 2013-04-26 13:11:00 +0200, Florian Pflug wrote:

The unresolved code issue that I know of is moving the compiler flags behind a configure check. I would greatly appreciate it if you could take a look at that. My config-fu is weak and it would take me some time to figure out how to do that.

Do we necessarily have to do that before beta? If not, let's concentrate on getting the basic path in, and let's add the gcc-specific compiler options later.

If we want them we should do it before beta, otherwise we won't notice
problems that the flags cause (misoptimizations, problems on compiler
versions, ...). So either now or in 9.4.

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#9)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-04-26 13:11:00 +0200, Florian Pflug wrote:

The unresolved code issue that I know of is moving the compiler flags behind a configure check. I would greatly appreciate it if you could take a look at that. My config-fu is weak and it would take me some time to figure out how to do that.

Do we necessarily have to do that before beta? If not, let's concentrate on getting the basic path in, and let's add the gcc-specific compiler options later.

If we want them we should do it before beta, otherwise we won't notice
problems that the flags cause (misoptimizations, problems on compiler
versions, ...). So either now or in 9.4.

Yeah, beta1 is the point in the cycle where we have the most leverage
for discovering portability problems. We should not be leaving anything
involving configure tests as "to fix later".

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#10)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 26 April 2013 14:40, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andres Freund <andres@2ndquadrant.com> writes:

On 2013-04-26 13:11:00 +0200, Florian Pflug wrote:

The unresolved code issue that I know of is moving the compiler flags behind a configure check. I would greatly appreciate it if you could take a look at that. My config-fu is weak and it would take me some time to figure out how to do that.

Do we necessarily have to do that before beta? If not, let's concentrate on getting the basic path in, and let's add the gcc-specific compiler options later.

If we want them we should do it before beta, otherwise we won't notice
problems that the flags cause (misoptimizations, problems on compiler
versions, ...). So either now or in 9.4.

Yeah, beta1 is the point in the cycle where we have the most leverage
for discovering portability problems. We should not be leaving anything
involving configure tests as "to fix later".

I'm expecting to spend some time on this over the weekend, once I've
re-read the thread and patches to see if there is something to commit.

That's my last time window, so this looks like the last chance to make
changes before beta.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Jeff Davis
pgsql@j-davis.com
In reply to: Simon Riggs (#11)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Fri, Apr 26, 2013 at 7:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

I'm expecting to spend some time on this over the weekend, once I've
re-read the thread and patches to see if there is something to commit.

That's my last time window, so this looks like the last chance to make
changes before beta.

I updated the patch and split it into two parts (attached).

The first patch is the checksum algorithm itself. I have done
some documentation updates and moved it into the C file (rather
than the README), but further explanation of the "shift right 3"
modification will need to be by Ants or Florian.

The second patch adds the configure-time check for the right
compilation flags, and uses them when compiling checksum.c. I
called the new variable CFLAGS_EXTRA, for lack of a better idea,
so feel free to come up with a new name. It doesn't check for, or
use, -msse4.1, but that can be specified by the user by
configuring with CFLAGS_EXTRA="-msse4.1".

I don't know of any more required changes, aside from
documentation improvements.

Regards,
Jeff Davis

Attachments:

fnv-jeff-20130426.patchapplication/octet-stream; name=fnv-jeff-20130426.patchDownload+178-35
fnv-jeff-20130426-cflags-extra.patchapplication/octet-stream; name=fnv-jeff-20130426-cflags-extra.patchDownload+168-0
#13Greg Smith
greg@2ndquadrant.com
In reply to: Jeff Davis (#12)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 4/26/13 3:57 PM, Jeff Davis wrote:

The second patch adds the configure-time check for the right
compilation flags, and uses them when compiling checksum.c. I
called the new variable CFLAGS_EXTRA, for lack of a better idea,
so feel free to come up with a new name. It doesn't check for, or
use, -msse4.1, but that can be specified by the user by
configuring with CFLAGS_EXTRA="-msse4.1".

Thank you, that is the last piece I was looking at but couldn't nail
down on my own. With that I should be able to duplicate both the
slicing by 8 CRC speedup Ants sent over (which also expected some
optimization changes) and trying something FNV based this weekend.

I think I need to do two baselines: master without checksums, and
master with extra optimizations but still without checksums. It may be
the case that using better compile time optimizations gives a general
speedup that's worth considering regardless. The optimizations seem to
have a very significant impact on the checksum feature, but I'd like to
quantify how they change the code a little bit before even getting into
that.

--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Jeff Davis
pgsql@j-davis.com
In reply to: Greg Smith (#13)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Fri, 2013-04-26 at 16:40 -0400, Greg Smith wrote:

I think I need to do two baselines: master without checksums, and
master with extra optimizations but still without checksums. It may be
the case that using better compile time optimizations gives a general
speedup that's worth considering regardless. The optimizations seem to
have a very significant impact on the checksum feature, but I'd like to
quantify how they change the code a little bit before even getting into
that.

The patch only affects optimization flags used when compiling
checksum.c, so it should have no effect on other areas of the code.

If you want to compile the whole source with those flags, then just do:

CFLAGS="-msse4.1 -funroll-loops -ftree-vectorize" ./configure

Changing the optimization flags for existing code will have a larger
impact and should be considered separately from checksums.

Regards,
Jeff Davis

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Andres Freund
andres@anarazel.de
In reply to: Jeff Davis (#12)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 2013-04-26 12:57:09 -0700, Jeff Davis wrote:

I updated the patch and split it into two parts (attached).

The second patch adds the configure-time check for the right
compilation flags, and uses them when compiling checksum.c. I
called the new variable CFLAGS_EXTRA, for lack of a better idea,
so feel free to come up with a new name. It doesn't check for, or
use, -msse4.1, but that can be specified by the user by
configuring with CFLAGS_EXTRA="-msse4.1".

CFLAGS_VECTORIZATION? EXTRA sounds to generic to me.

--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -242,6 +242,30 @@ undefine([Ac_cachevar])dnl
+# PGAC_PROG_CC_CFLAGS_EXTRA_OPT
+# -----------------------
+# Given a string, check if the compiler supports the string as a
+# command-line option. If it does, add the string to CFLAGS_EXTRA.
+AC_DEFUN([PGAC_PROG_CC_CFLAGS_EXTRA_OPT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_prog_cc_cflags_extra_$1])])dnl
+AC_CACHE_CHECK([whether $CC supports $1], [Ac_cachevar],
+[pgac_save_CFLAGS_EXTRA=$CFLAGS_EXTRA
+CFLAGS_EXTRA="$pgac_save_CFLAGS_EXTRA $1"
+ac_save_c_werror_flag=$ac_c_werror_flag
+ac_c_werror_flag=yes
+_AC_COMPILE_IFELSE([AC_LANG_PROGRAM()],
+                   [Ac_cachevar=yes],
+                   [Ac_cachevar=no])
+ac_c_werror_flag=$ac_save_c_werror_flag
+CFLAGS_EXTRA="$pgac_save_CFLAGS_EXTRA"])
+if test x"$Ac_cachevar" = x"yes"; then
+  CFLAGS_EXTRA="$CFLAGS_EXTRA $1"
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_PROG_CC_CFLAGS_EXTRA_OPT

I think it would be better to have a PGAC_PROG_CC_VAR_OPT or so which
assigns the flag to some passed variable name. Then we can reuse it for
different vars and I have the feeling those will come. And having a
CFLAGS_VECTOR_OPT would just be stupid ;)

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Jeff Davis
pgsql@j-davis.com
In reply to: Andres Freund (#15)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Sat, 2013-04-27 at 00:20 +0200, Andres Freund wrote:

CFLAGS_VECTORIZATION? EXTRA sounds to generic to me.

I went with CFLAGS_VECTOR to be a little shorter while still keeping
some meaning.

I think it would be better to have a PGAC_PROG_CC_VAR_OPT or so which
assigns the flag to some passed variable name. Then we can reuse it for
different vars and I have the feeling those will come. And having a
CFLAGS_VECTOR_OPT would just be stupid ;)

Good suggestion; done.

Thank you for the review. New renamed patch attached for the build
options only (the other patch for the FNV checksum algorithm is
unchanged).

Regards,
Jeff Davis

Attachments:

fnv-jeff-20130426-cflags-vector.patchtext/x-patch; charset=ISO-8859-1; name=fnv-jeff-20130426-cflags-vector.patchDownload+169-0
#17Ants Aasma
ants.aasma@cybertec.at
In reply to: Jeff Davis (#12)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On Fri, Apr 26, 2013 at 10:57 PM, Jeff Davis <pgsql@j-davis.com> wrote:

On Fri, Apr 26, 2013 at 7:09 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

I'm expecting to spend some time on this over the weekend, once I've
re-read the thread and patches to see if there is something to commit.

That's my last time window, so this looks like the last chance to make
changes before beta.

I updated the patch and split it into two parts (attached).

The first patch is the checksum algorithm itself. I have done
some documentation updates and moved it into the C file (rather
than the README), but further explanation of the "shift right 3"
modification will need to be by Ants or Florian.

The second patch adds the configure-time check for the right
compilation flags, and uses them when compiling checksum.c. I
called the new variable CFLAGS_EXTRA, for lack of a better idea,
so feel free to come up with a new name. It doesn't check for, or
use, -msse4.1, but that can be specified by the user by
configuring with CFLAGS_EXTRA="-msse4.1".

I don't know of any more required changes, aside from
documentation improvements.

I have updated the base patch. This is supposed to go under the
cflags-vector patch that Jeff posted yesterday.

I had the opportunity to run a few more tests of the hash. Based on
the tests I switched the shift-right operation from 3 to 17bits (the
original value was chosen by gut feel). Avalanche tests showed that
this value removed bias the quickest. You can see the difference in
the attached image, colors are still black 0% bias, blue 5%, green
33%, yellow 75%, red 100%. The final box in the diagram is covered by
the final mixing iteration. The take away from these diagrams is 17
mixes better than 3. 17 still has some residual bias for the final
iteration on the page. The effective information content values in
checksum for 16 high order bits on final 32 32bit words on the page
are: 16.0 15.1 14.1 13.3 12.6 12.1 11.8 11.7 11.5 11.5 11.1 11.0 10.9
10.6 10.6 10.4. Error detection capability for the highest bit is
therefore 1:1351. Based on this I also switched to using two
iterations of zeroes at the end, this way the lowest information
content is 15.976bits or 1:64473.

Documentation changes:
* reworded the algorithm description so the order of steps is more apparent.
* added a link to the FNV reference page.
* fixed note about FNV being 4 bytes at a time. Official variant is 1
byte at a time.
* added a segment of why the algorithm was chosen and its error
detection capabilities.
* added a segment about how the code affects vectorization.

Issue to decide before commiting:
* Whether to use 32 or 64 parallel checksums. The tradeoff for 64 is a
small performance hit (10-20%) on todays CPUs for a small performance
gain on Haswell processors coming to market this year and up to a
theoretical 2x performance gain on future CPUs. Changing this is just
a matter of changing N_SUMS and updating documentation to match.

Regards,
Ants Aasma
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

Attachments:

avalanche-fnv-slr3.pngimage/png; name=avalanche-fnv-slr3.pngDownload
avalanche-fnv-slr17.pngimage/png; name=avalanche-fnv-slr17.pngDownload
fnv-ants-20130428.patchapplication/octet-stream; name=fnv-ants-20130428.patchDownload+194-35
#18Simon Riggs
simon@2ndQuadrant.com
In reply to: Ants Aasma (#17)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 28 April 2013 13:22, Ants Aasma <ants@cybertec.at> wrote:

I have updated the base patch. This is supposed to go under the
cflags-vector patch that Jeff posted yesterday.

I've committed your patch, with two changes
* one comment extended
* adding the .h file from Jeff's last main patch

Please can Ants and Jeff review the commit and confirm that is what we
wanted. Thanks.

I'm about to light up the build farm with a trial commit of the
compiler instructions stuff.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#19Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#18)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 30 April 2013 06:57, Simon Riggs <simon@2ndquadrant.com> wrote:

I'm about to light up the build farm with a trial commit of the
compiler instructions stuff.

Amazingly that seemed to work.

ISTM that we also need this patch to put memory barriers in place
otherwise the code might be rearranged.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

checksum_memory_barrier.v1.patchapplication/octet-stream; name=checksum_memory_barrier.v1.patchDownload+4-1
#20Andres Freund
andres@anarazel.de
In reply to: Simon Riggs (#19)
Re: Substituting Checksum Algorithm (was: Enabling Checksums)

On 2013-04-30 11:55:29 +0100, Simon Riggs wrote:

ISTM that we also need this patch to put memory barriers in place
otherwise the code might be rearranged.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -960,11 +960,14 @@ PageCalcChecksum16(Page page, BlockNumber blkno)
* Save pd_checksum and set it to zero, so that the checksum calculation
* isn't affected by the checksum stored on the page. We do this to
* allow optimization of the checksum calculation on the whole block
-	 * in one go.
+	 * in one go. Memory barriers are required to avoid rearrangement here.
*/
save_checksum = phdr->pd_checksum;
+	pg_memory_barrier();
phdr->pd_checksum = 0;
+	pg_memory_barrier();
checksum = checksum_block(page, BLCKSZ);
+	pg_memory_barrier();
phdr->pd_checksum = save_checksum;

/* mix in the block number to detect transposed pages */

Why? I am not sure which rearrangement you're fearing? In all cases
where there is danger of concurrent write access to the page we should
already be working on a copy?
Also, if we need a memory barrier I can only see a point in the 2nd
one. The first and third shouldn't ever be able to change anything since
we are only writing to local memory?

Greetings,

Andres Freund

--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Ants Aasma
ants.aasma@cybertec.at
In reply to: Simon Riggs (#19)
#22Simon Riggs
simon@2ndQuadrant.com
In reply to: Ants Aasma (#21)
#23Tom Lane
tgl@sss.pgh.pa.us
In reply to: Simon Riggs (#19)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Tom Lane (#23)
#25Greg Smith
greg@2ndquadrant.com
In reply to: Simon Riggs (#18)
#26Martijn van Oosterhout
kleptog@svana.org
In reply to: Greg Smith (#25)
#27Greg Smith
greg@2ndquadrant.com
In reply to: Martijn van Oosterhout (#26)
#28Noah Misch
noah@leadboat.com
In reply to: Greg Smith (#27)
#29Andres Freund
andres@anarazel.de
In reply to: Greg Smith (#27)