Re: random() (was Re: New GUC to sample log queries) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: random() (was Re: New GUC to sample log queries)
Date
Msg-id 1551.1546018192@sss.pgh.pa.us
Whole thread Raw
In response to Re: random() (was Re: New GUC to sample log queries)  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> but I don't feel a need for replacing the algorithm.

> Hmmm. Does it mean that you would veto any change, even if the speed 
> concern is addressed (i.e. faster/not slower with better quality)?

Well, not veto exactly, but I'd be suspicious of it.

First, erand48 has been around a *long* time and its properties are pretty
well understood; these other algorithms you found on the net have no real
pedigree IMO.  Moreover, since it is standard, there's a lower cognitive
burden on people to understand what it is and what it can be trusted for.

Second, we don't actually have a problem we need to fix by changing the
algorithm.  We do need to worry about keeping drandom's state separate
from the internal random() usages, but that's independent of what the
algorithm is.  Nobody has complained that random() is insufficiently
random, only that the seed might be predictable.

I do agree, after closer inspection, that our current coding of _dorand48
is a hangover from machines lacking 64-bit arithmetic.  glibc does it like
this:

int
__drand48_iterate (unsigned short int xsubi[3], struct drand48_data *buffer)
{
  uint64_t X;
  uint64_t result;

  /* Initialize buffer, if not yet done.  */
  if (__glibc_unlikely (!buffer->__init))
    {
      buffer->__a = 0x5deece66dull;
      buffer->__c = 0xb;
      buffer->__init = 1;
    }

  /* Do the real work.  We choose a data type which contains at least
     48 bits.  Because we compute the modulus it does not care how
     many bits really are computed.  */

  X = (uint64_t) xsubi[2] << 32 | (uint32_t) xsubi[1] << 16 | xsubi[0];

  result = X * buffer->__a + buffer->__c;

  xsubi[0] = result & 0xffff;
  xsubi[1] = (result >> 16) & 0xffff;
  xsubi[2] = (result >> 32) & 0xffff;

  return 0;
}

and other than the pointless use of variable __a and __c, I think
we ought to adopt similar coding.

            regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: removal of dangling temp tables
Next
From: Alvaro Herrera
Date:
Subject: Re: removal of dangling temp tables