Uh, this is *not* a 64-bit CRC ...
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
it's certainly not what I thought we had agreed to implement.
We can't change this algorithm without forcing an initdb, which would be
a rather unpleasant thing to do at this late stage of the release cycle.
But I'm not happy with it. Comments?
regards, tom lane
On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote:
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
it's certainly not what I thought we had agreed to implement.We can't change this algorithm without forcing an initdb, which would be
a rather unpleasant thing to do at this late stage of the release cycle.
But I'm not happy with it. Comments?
This might be a good time to update:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz
From the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:
The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.
The generator polynomial is x64 + x4 + x3 + x1 + 1.
I would suggest that if you don't change the algorithm, at least change
the name in the sources. Were you to #ifdef in a real crc-64, and make
a compile-time option to select the old one, you could allow users who
wish to avoid the initdb a way to continue with the existing pair of
CRC-32s.
Nathan Myers
ncm@zembu.com
I will just add a TODO item and we can hit it for 7.2.
On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote:
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
it's certainly not what I thought we had agreed to implement.We can't change this algorithm without forcing an initdb, which would be
a rather unpleasant thing to do at this late stage of the release cycle.
But I'm not happy with it. Comments?This might be a good time to update:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz
From the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.The generator polynomial is x64 + x4 + x3 + x1 + 1.
I would suggest that if you don't change the algorithm, at least change
the name in the sources. Were you to #ifdef in a real crc-64, and make
a compile-time option to select the old one, you could allow users who
wish to avoid the initdb a way to continue with the existing pair of
CRC-32s.Nathan Myers
ncm@zembu.com
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
Added to TODO:
* Correct CRC WAL code to be normal CRC32 algorithm
On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote:
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
it's certainly not what I thought we had agreed to implement.We can't change this algorithm without forcing an initdb, which would be
a rather unpleasant thing to do at this late stage of the release cycle.
But I'm not happy with it. Comments?This might be a good time to update:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz
From the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.The generator polynomial is x64 + x4 + x3 + x1 + 1.
I would suggest that if you don't change the algorithm, at least change
the name in the sources. Were you to #ifdef in a real crc-64, and make
a compile-time option to select the old one, you could allow users who
wish to avoid the initdb a way to continue with the existing pair of
CRC-32s.Nathan Myers
ncm@zembu.com
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Wed, Feb 28, 2001 at 09:17:19PM -0500, Bruce Momjian wrote:
On Wed, Feb 28, 2001 at 04:53:09PM -0500, Tom Lane wrote:
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
it's certainly not what I thought we had agreed to implement.We can't change this algorithm without forcing an initdb, which would be
a rather unpleasant thing to do at this late stage of the release cycle.
But I'm not happy with it. Comments?This might be a good time to update:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz
From the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.The generator polynomial is x64 + x4 + x3 + x1 + 1.
I would suggest that if you don't change the algorithm, at least change
the name in the sources. Were you to #ifdef in a real crc-64, and make
a compile-time option to select the old one, you could allow users who
wish to avoid the initdb a way to continue with the existing pair of
CRC-32s.Added to TODO:
* Correct CRC WAL code to be normal CRC32 algorithm
Um, how about
* Correct CRC WAL code to be a real CRC64 algorithm
instead?
Nathan Myers
ncm@zembu.com
Added to TODO:
* Correct CRC WAL code to be normal CRC32 algorithm
Um, how about
* Correct CRC WAL code to be a real CRC64 algorithm
instead?
Done.
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
I just took a close look at the COMP_CRC64 macro in xlog.c.
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single
32-bit CRC;
it's certainly not what I thought we had agreed to implement.
Hmm, strange. I thought that we had agreed upon a 32 bit CRC
on the grounds, that it would be strong enough to guard a single
log record.
Andreas
Import Notes
Resolved by subject fallback
Zeugswetter Andreas SB <ZeugswetterA@wien.spardat.at> writes:
it's certainly not what I thought we had agreed to implement.
Hmm, strange. I thought that we had agreed upon a 32 bit CRC
on the grounds, that it would be strong enough to guard a single
log record.
I thought that, and still think it, but I was outvoted. However I
see no value in the present actual implementation ...
regards, tom lane
ncm@zembu.com (Nathan Myers) writes:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gz
From the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:
The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.
The generator polynomial is x64 + x4 + x3 + x1 + 1.
Nathan (or anyone else with a copy of "Numerical recipes in C", which
I'm embarrassed to admit I don't own), is there any indication in there
that anyone spent any effort on choosing that particular generator
polynomial? As far as I can see, it violates one of the standard
guidelines for choosing a polynomial, namely that it be a multiple of
(x + 1) ... which in modulo-2 land is equivalent to having an even
number of terms, which this ain't got. See Ross Williams'
A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
ftp://ftp.rocksoft.com/papers/crc_v3.txt among other places, which is
by far the most thorough and readable thing I've ever seen on CRCs.
I spent some time digging around the net for standard CRC64 polynomials,
and the only thing I could find that looked like it might have been
picked by someone who understood what they were doing is in the DLT
(digital linear tape) standard, ECMA-182 (available from
http://www.ecma.ch/ecma1/STAND/ECMA-182.HTM):
x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46 + x^45 +
x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 + x^31 + x^29 + x^27 +
x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 +
x^7 + x^4 + x + 1
I'm probably going to go with this one unless someone can come up with
an authoritative source for another one.
regards, tom lane
This isn't a 64-bit CRC. It's two independent 32-bit CRCs, one done
on just the odd-numbered bytes and one on just the even-numbered bytes
of the datastream. That's hardly any stronger than a single 32-bit CRC;
I believe that the longer data the more chance to get same CRC/hash
for different data sets (if data length > CRC/hash length). Or am I wrong?
Having no crc64 implementation (see below) I decided to use 2 CRC32 instead
of one - it looked better, without any additional cost (record header is
8 byte aligned anyway, on, mmm, most platform).
it's certainly not what I thought we had agreed to implement.
I've asked if anyone can send crc64 impl to me and got only one from
Nathan Myers. Unfortunately, SWISS-PROT impl assumes that long long
is 8 bytes - is it portable?
Vadim
Import Notes
Resolved by subject fallback
"Vadim Mikheev" <vmikheev@sectorbase.com> writes:
I've asked if anyone can send crc64 impl to me and got only one from
Nathan Myers. Unfortunately, SWISS-PROT impl assumes that long long
is 8 bytes - is it portable?
No, it's not. I have written an implementation that uses uint64 if
available (per configure results) otherwise a pair of uint32 registers.
(See pg_crc.h in as-yet-uncommitted patches I posted to pgpatches.)
Interestingly, gcc -O on HP-PA generates essentially the same code
sequence for either the 64- or dual-32-bit versions of the macro...
regards, tom lane
On Mon, Mar 05, 2001 at 02:00:59PM -0500, Tom Lane wrote:
ncm@zembu.com (Nathan Myers) writes:
The CRC-64 code used in the SWISS-PROT genetic database is (now) at:
ftp://ftp.ebi.ac.uk/pub/software/swissprot/Swissknife/old/SPcrc.tar.gzFrom the README:
The code in this package has been derived from the BTLib package
obtained from Christian Iseli <chris@ludwig-alpha.unil.ch>.
From his mail:The reference is: W. H. Press, S. A. Teukolsky, W. T. Vetterling, and
B. P. Flannery, "Numerical recipes in C", 2nd ed., Cambridge University
Press. Pages 896ff.The generator polynomial is x64 + x4 + x3 + x1 + 1.
Nathan (or anyone else with a copy of "Numerical recipes in C", which
I'm embarrassed to admit I don't own), is there any indication in there
that anyone spent any effort on choosing that particular generator
polynomial? As far as I can see, it violates one of the standard
guidelines for choosing a polynomial, namely that it be a multiple of
(x + 1) ... which in modulo-2 land is equivalent to having an even
number of terms, which this ain't got. See Ross Williams'
A PAINLESS GUIDE TO CRC ERROR DETECTION ALGORITHMS, available from
ftp://ftp.rocksoft.com/papers/crc_v3.txt among other places, which is
by far the most thorough and readable thing I've ever seen on CRCs.I spent some time digging around the net for standard CRC64 polynomials,
and the only thing I could find that looked like it might have been
picked by someone who understood what they were doing is in the DLT
(digital linear tape) standard, ECMA-182 (available from
http://www.ecma.ch/ecma1/STAND/ECMA-182.HTM):x^64 + x^62 + x^57 + x^55 + x^54 + x^53 + x^52 + x^47 + x^46 + x^45 +
x^40 + x^39 + x^38 + x^37 + x^35 + x^33 + x^32 + x^31 + x^29 + x^27 +
x^24 + x^23 + x^22 + x^21 + x^19 + x^17 + x^13 + x^12 + x^10 + x^9 +
x^7 + x^4 + x + 1
I'm sorry to have taken so long to reply.
The polynomial chosen for SWISS-PROT turns out to be presented, in
Numerical Recipes, just as an example of a primitive polynomial of
that degree; no assertion is made about its desirability for error
checking. It is (in turn) drawn from E. J. Watson, "Mathematics of
Computation", vol. 16, pp368-9.
Having (x + 1) as a factor guarantees to catch all errors in which
an odd number of bits have been changed. Presumably you are then
infinitesimally less likely to catch all errors in which an even
number of bits have been changed.
I would have posted the ECMA-182 polynomial if I had found it. (That
was good searching!) One hopes that the ECMA polynomial was chosen more
carefully than entirely at random. High-degree codes are often chosen
by Monte Carlo methods, by applying statistical tests to randomly-chosen
values, because the search space is so large.
I have verified that Tom transcribed the polynomial correctly from
the PDF image. The ECMA document doesn't say whether their polynomial
is applied "bit-reversed", but the check would be equally strong either
way.
Nathan Myers
ncm@zembu.com