Re: random() (was Re: New GUC to sample log queries) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: random() (was Re: New GUC to sample log queries)
Msg-id 5659.1545852666@sss.pgh.pa.us
In response to Re: random() (was Re: New GUC to sample log queries)  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: random() (was Re: New GUC to sample log queries)  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
Peter Geoghegan <pg@bowt.ie> writes:
> On Wed, Dec 26, 2018 at 10:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I wonder whether we should establish a project policy to avoid use
>> of random() for internal purposes, ie try to get to a point where
>> drandom() is the only caller in the backend.  A quick grep says
>> that there's a dozen or so callers, so this patch certainly isn't
>> the only offender ... but should we make an effort to convert them
>> all to use, say, pg_erand48()?  I think all the existing callers
>> could happily share a process-wide random state, so we could make
>> a wrapper that's no harder to use than random().

> I've used setseed() to make nbtree's "getting tired" behavior
> deterministic for specific test cases I've developed -- the random()
> choice of whether to split a page full of duplicates, or continue
> right in search of free space becomes predictable. I've used this to
> determine whether my nbtree patch's pg_upgrade'd indexes have
> precisely the same behavior as v3 indexes on the master branch
> (precisely the same in terms of the structure of the final index
> following a bulk load).

TBH, I'd call it a bug --- maybe even a low-grade security hazard
--- that it's possible to affect that from user level.

In fact, contemplating that for a bit: it is possible, as things
stand in HEAD, for a user to control which of his statements will
get logged if the DBA has enabled log_statement_sample_rate.
It doesn't take a lot of creativity to think of ways to abuse that.
So maybe Coverity had the right idea to start with.

There might well be debugging value in affecting internal PRNG usages,
but let's please not think it's a good idea that that's trivially
reachable from SQL.

            regards, tom lane
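
[Editor's sketch, not part of the original message.] The concern above is that internal callers and the SQL-visible random()/setseed() share one PRNG state, so a user-level reseed makes internal choices (such as statement-log sampling) predictable. A minimal illustration of the proposed separation follows, assuming POSIX erand48() as a stand-in for pg_erand48(). All function and variable names here are hypothetical and are not PostgreSQL's actual API.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700		/* for erand48() under strict -std modes */
#endif
#include <assert.h>
#include <stdlib.h>

/* Hypothetical: state behind the SQL-visible random()/setseed(). */
static unsigned short user_state[3];

/* Hypothetical: process-wide state reserved for internal callers. */
static unsigned short internal_state[3];

/* Spread a 32-bit seed over the 48-bit erand48 state. */
static void
seed_state(unsigned short st[3], unsigned int seed)
{
	st[0] = (unsigned short) (seed & 0xFFFF);
	st[1] = (unsigned short) ((seed >> 16) & 0xFFFF);
	st[2] = 0x330E;				/* arbitrary fixed low word */
}

/* User level: reseeding touches user_state only. */
void
user_setseed(unsigned int seed)
{
	seed_state(user_state, seed);
}

double
user_random(void)
{
	return erand48(user_state);
}

/*
 * Internal level: seeded once at process start.  A real backend would
 * want an unpredictable seed; a fixed one is used here for illustration.
 */
void
internal_random_init(unsigned int seed)
{
	seed_state(internal_state, seed);
}

/* Drop-in wrapper, no harder to call than random(). Returns [0.0, 1.0). */
double
internal_random(void)
{
	return erand48(internal_state);
}

/* Example internal caller: decide whether to log this statement. */
int
sample_statement(double rate)
{
	assert(rate >= 0.0 && rate <= 1.0);
	return internal_random() < rate;
}
```

With this split, calling user_setseed() any number of times leaves the sequence returned by internal_random() unchanged, so a user cannot steer sample_statement(). Any debugging hook for seeding the internal state would have to be exposed separately and not be reachable from ordinary SQL.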

