Thread: Cosmic ray hits integerset

Cosmic ray hits integerset

From
Thomas Munro
Date:
Hi,

Here's a curious one-off failure in test_integerset:

+ERROR:  iterate returned wrong value; got 519985430528, expected 485625692160

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2021-04-01%2018:19:47



Re: Cosmic ray hits integerset

From
Alvaro Herrera
Date:
On 2021-Jun-22, Thomas Munro wrote:

> Hi,
> 
> Here's a curious one-off failure in test_integerset:
> 
> +ERROR:  iterate returned wrong value; got 519985430528, expected 485625692160

Cosmic rays indeed.  The base-2 representation of the expected value is
111000100010001100011000000000000000000
and that of the actual value is
111100100010001100011000000000000000000

There's a single bit of difference.
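
As a quick check of the claim above, XOR-ing the two values from the error
message isolates the bits in which they differ. This is only an illustrative
sketch in Python; the variable names are invented here and are not taken from
test_integerset.

```python
# Values copied from the buildfarm error message.
expected = 485625692160
got = 519985430528

# XOR leaves a 1 only in the positions where the two values disagree.
diff = expected ^ got

# Collect the positions of the set bits (0 = least significant bit).
flipped_bits = [i for i in range(diff.bit_length()) if (diff >> i) & 1]

print(f"expected = {expected:b}")
print(f"got      = {got:b}")
print(f"xor      = {diff:b}")
print("flipped bit positions:", flipped_bits)  # prints [35]
```

Exactly one position is reported, bit 35 counting from zero at the least
significant end, so the two values differ by 2^35.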

-- 
Álvaro Herrera       Valdivia, Chile
"There is no man who does not aspire to fullness, that is,
the sum of experiences of which a man is capable"



Re: Cosmic ray hits integerset

From
Andrey Borodin
Date:

> On 22 June 2021, at 19:21, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jun-22, Thomas Munro wrote:
>
>> Hi,
>>
>> Here's a curious one-off failure in test_integerset:
>>
>> +ERROR:  iterate returned wrong value; got 519985430528, expected 485625692160
>
> Cosmic rays indeed.  The base-2 representation of the expected value is
> 111000100010001100011000000000000000000
> and that of the actual value is
> 111100100010001100011000000000000000000
>
> There's a single bit of difference.

I've tried to explain this as something other than a single-event upset, namely
an integer overflow somewhere in the 30-bit mode of simple8b, but I have found
nothing so far. The actual error is in bit 35, and the next mode up is the
60-bit mode.

Looks like a cosmic ray to me too.

Best regards, Andrey Borodin.


RE: Cosmic ray hits integerset

From
Jakub Wartak
Date:
Hi, asking out of pure technical curiosity about "the rhinoceros": what kind of animal is it? A physical box or a VM?
How could one get dmidecode(1) / dmesg(1) / mcelog(1) output from it (e.g. does it run ECC or not)?

-J.

> -----Original Message-----
> From: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Sent: Tuesday, June 22, 2021 4:21 PM
> To: Thomas Munro <thomas.munro@gmail.com>
> Cc: pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: Cosmic ray hits integerset
>
> On 2021-Jun-22, Thomas Munro wrote:
>
> > Hi,
> >
> > Here's a curious one-off failure in test_integerset:
> >
> > +ERROR:  iterate returned wrong value; got 519985430528, expected 485625692160
>
> Cosmic rays indeed.  The base-2 representation of the expected value is
> 111000100010001100011000000000000000000
> and that of the actual value is
> 111100100010001100011000000000000000000
>
> There's a single bit of difference.




Re: Cosmic ray hits integerset

From
Joe Conway
Date:
On 7/7/21 2:53 AM, Jakub Wartak wrote:
> Hi, asking out of pure technical curiosity about "the rhinoceros": what kind of animal is it? A physical box or a VM?
> How could one get dmidecode(1) / dmesg(1) / mcelog(1) output from it (e.g. does it run ECC or not)?


Rhinoceros is just a VM on a simple desktop machine. Nothing fancy.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



Re: Cosmic ray hits integerset

From
Greg Stark
Date:
FWIW, yes, it could be a cosmic ray.

It could also just be marginally bad RAM. Bad RAM is notoriously hard
to test for reliably. It can be very sensitive to the exact bit
pattern stored in it, the timing of reads and writes, and other
factors. The whole point of the Rowhammer attacks is to push some of
those timing factors hard, but the same failures can happen randomly.

On Wed, 7 Jul 2021 at 08:14, Joe Conway <mail@joeconway.com> wrote:
>
> On 7/7/21 2:53 AM, Jakub Wartak wrote:
> > Hi, asking out of pure technical curiosity about "the rhinoceros": what kind of animal is it? A physical box or a VM?
> > How could one get dmidecode(1) / dmesg(1) / mcelog(1) output from it (e.g. does it run ECC or not)?
>
>
> Rhinoceros is just a VM on a simple desktop machine. Nothing fancy.
>
> Joe
>
> --
> Crunchy Data - http://crunchydata.com
> PostgreSQL Support for Secure Enterprises
> Consulting, Training, & Open Source Development
>
>


-- 
greg