Re: random() (was Re: New GUC to sample log queries) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: random() (was Re: New GUC to sample log queries)
Date
Msg-id CAH2-Wzk1Wa4TcrdOQv=eknY_NWyr2bTKf3rFtWPu06-V6LhE6A@mail.gmail.com
In response to random() (was Re: New GUC to sample log queries)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: random() (was Re: New GUC to sample log queries)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, Dec 26, 2018 at 10:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wonder whether we should establish a project policy to avoid use
> of random() for internal purposes, ie try to get to a point where
> drandom() is the only caller in the backend.  A quick grep says
> that there's a dozen or so callers, so this patch certainly isn't
> the only offender ... but should we make an effort to convert them
> all to use, say, pg_erand48()?  I think all the existing callers
> could happily share a process-wide random state, so we could make
> a wrapper that's no harder to use than random().

I've used setseed() to make nbtree's "getting tired" behavior
deterministic for specific test cases I've developed -- the random()
choice of whether to split a page full of duplicates or to keep moving
right in search of free space becomes predictable. I've used this to
determine whether my nbtree patch's pg_upgrade'd indexes have
precisely the same behavior as v3 indexes on the master branch
(precisely the same in terms of the structure of the final index
following a bulk load).

I'm not sure whether or not this kind of debugging scenario is worth giving
much weight to going forward, but it's something to consider. It seems
generally useful to be able to force deterministic-ish behavior in a
single session. I don't expect that the test case is even a bit
portable, but the technique was quite effective.
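
As a rough illustration of the technique (not the actual nbtree code), here
is a minimal Python sketch of a process-wide random state that can be
reseeded to make a probabilistic "getting tired" choice repeatable. Every
name in it (set_shared_seed, tired_of_moving_right, pages_visited,
run_bulk_load) is hypothetical, and the 1% stop probability and the page
limit are illustrative parameters rather than values taken from this thread.

```python
import random

# Hypothetical process-wide generator, standing in for a shared PRNG state
# that all internal callers in one process would draw from.
_shared_rng = random.Random()


def set_shared_seed(seed):
    """Reseed the shared generator so that later choices are reproducible."""
    _shared_rng.seed(seed)


def tired_of_moving_right(stop_probability=0.01):
    """Return True when the simulated insert should stop and split the page.

    stop_probability is an illustrative assumption, not a documented value.
    """
    return _shared_rng.random() < stop_probability


def pages_visited(max_pages=1000):
    """Count how many full pages one simulated insert passes before stopping."""
    visited = 0
    while visited < max_pages and not tired_of_moving_right():
        visited += 1
    return visited


def run_bulk_load(seed, inserts=50):
    """Seed once, then record the stopping point of each simulated insert."""
    set_shared_seed(seed)
    return [pages_visited() for _ in range(inserts)]


if __name__ == "__main__":
    first = run_bulk_load(seed=42)
    second = run_bulk_load(seed=42)
    # Same seed, same sequence of choices, hence the same final "structure".
    print(first == second)
```

Two runs with the same seed make the same sequence of choices, which is what
allows the resulting structures to be compared exactly. Any other caller
that draws from the shared state between the seed and the load would shift
the sequence, so the determinism only holds within a controlled session.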

-- 
Peter Geoghegan

