Cosmic ray hits integerset
Hi,
Here's a curious one-off failure in test_integerset:
+ERROR: iterate returned wrong value; got 519985430528, expected 485625692160
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-01%2018:19:47
On 2021-Jun-22, Thomas Munro wrote:
Hi,
Here's a curious one-off failure in test_integerset:
+ERROR: iterate returned wrong value; got 519985430528, expected 485625692160
Cosmic rays indeed. The base-2 representation of the expected value is
111000100010001100011000000000000000000
and that of the actual value is
111100100010001100011000000000000000000
There's a single bit of difference.
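The single-bit claim can be checked by XORing the two values from the error message. A minimal sketch in Python; the variable names are chosen here for illustration and are not taken from the test code:

```python
# Values copied from the test_integerset error message.
expected = 485625692160
actual = 519985430528

# XOR leaves a 1 in every bit position where the two values differ.
diff = expected ^ actual

# Collect the (0-based, least-significant-first) positions of differing bits.
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(format(expected, "b"))  # base-2 form of the expected value
print(format(actual, "b"))    # base-2 form of the value actually returned
print(flipped_bits)           # → [35]
```

Exactly one position is reported, bit 35 counting from zero at the least significant end, so the two values differ by 2^35.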
--
Álvaro Herrera Valdivia, Chile
"There is no man who does not aspire to plenitude, that is,
the sum of experiences of which a man is capable"
On 22 June 2021, at 19:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Cosmic rays indeed. The base-2 representation of the expected value is
111000100010001100011000000000000000000
and that of the actual value is
111100100010001100011000000000000000000
There's a single bit of difference.
I've tried to explain this not as a single-event upset but as an integer overflow somewhere in the 30-bit mode of simple8b, but found nothing so far. The actual error is in bit 35, and the next mode is the 60-bit mode.
Looks like a cosmic ray to me too.
Best regards, Andrey Borodin.
Hi, asking out of pure technical curiosity about "the rhinoceros": what kind of animal is it? A physical box or a VM? How could one get dmidecode(1) / dmesg(1) / mcelog(1) output from it (e.g. does it run ECC or not)?
-J.
On 7/7/21 2:53 AM, Jakub Wartak wrote:
Hi, asking out of pure technical curiosity about "the rhinoceros": what kind of animal is it? A physical box or a VM? How could one get dmidecode(1) / dmesg(1) / mcelog(1) output from it (e.g. does it run ECC or not)?
Rhinoceros is just a VM on a simple desktop machine. Nothing fancy.
Joe
--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
FWIW, yes, it could be a cosmic ray.
It could also just be marginally bad RAM. Bad RAM is notoriously hard
to test for reliably. It can be very sensitive to the exact bit
pattern stored in it, the timing of reads and writes, and other
factors. The whole point of the Rowhammer attacks is to push some of
those timing factors hard, but the same failures can happen randomly.
On Wed, 7 Jul 2021 at 08:14, Joe Conway <mail@joeconway.com> wrote:
Rhinoceros is just a VM on a simple desktop machine. Nothing fancy.
--
greg