Re: random() (was Re: New GUC to sample log queries) - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: random() (was Re: New GUC to sample log queries)
Date	December 29, 2018 21:07:15
Msg-id	17866.1546117635@sss.pgh.pa.us
In response to	Re: random() (was Re: New GUC to sample log queries) (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

I wrote:
> Thomas Munro <thomas.munro@enterprisedb.com> writes:
>> I was going to suggest that we might be able to use a single
>> not-visible-to-users number that is mixed into the existing recipe, so
>> that we only ever read urandom once for the cluster.

> Yeah, I was thinking along similar lines, but there's a problem:
> InitProcessGlobals runs before an EXEC_BACKEND child has reconnected
> to shared memory, so there's no cheap way to pass state to it.
> No doubt there are ways around that, but I'd just as soon avoid
> adding complexity here.  If we broke it somehow, the likely results
> would be silent failure of the per-process seed to be random, which
> might escape detection for a long time.

>> But it sounds
>> like it's not a problem, and it's probably better to just pass the
>> whole problem over to the OS.

> Yeah, that's what I'm thinking.

One final point on this --- I wondered whether initializing the
seed with pg_strong_random would be too costly time-wise.
It doesn't seem so; I measure the time to do it as ~25us with
/dev/urandom or ~80us with OpenSSL on my main development
workstation.  The total time to start a backend on that machine
is a few milliseconds depending on what you want to count.
I got numbers broadly in line with those results on half a dozen
other platforms (recent Fedora, Mac, various BSD on x86_64, ARM,
and PPC).
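
For illustration only, a measurement of this kind could be sketched roughly as below. This is not the harness used for the numbers above: the helper name avg_read_cost_us, the 64-byte buffer limit, and the open/read/close-per-iteration structure are all assumptions, and it times a raw read of /dev/urandom rather than a call to pg_strong_random itself.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): average cost, in
 * microseconds, of one open/read/close cycle that fetches "len" bytes
 * from "path".  Returns -1.0 on bad arguments or any I/O failure.
 */
static double
avg_read_cost_us(const char *path, size_t len, int iterations)
{
    unsigned char buf[64];
    struct timespec start, end;

    if (len == 0 || len > sizeof(buf) || iterations <= 0)
        return -1.0;
    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (int i = 0; i < iterations; i++)
    {
        size_t  got = 0;
        int     fd = open(path, O_RDONLY);

        if (fd < 0)
            return -1.0;

        /* A read may return fewer bytes than requested, so loop. */
        while (got < len)
        {
            ssize_t n = read(fd, buf + got, len - got);

            if (n <= 0)
            {
                close(fd);
                return -1.0;
            }
            got += (size_t) n;
        }
        close(fd);
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return ((double) (end.tv_sec - start.tv_sec) * 1e6 +
            (double) (end.tv_nsec - start.tv_nsec) / 1e3) / iterations;
}
```

Calling avg_read_cost_us("/dev/urandom", 8, 1000) gives a per-read figure that can be set against the few milliseconds of total backend startup time. The absolute value will depend heavily on the machine and kernel.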

So I'd judge that the time cost is not a reason not to apply this
patch, but we might want to reconsider why we're preferring
OpenSSL over direct use of /dev/urandom.

I also wondered whether we might do something like Thomas' suggestion
to try to cut the startup time.  In a fork() environment it'd be
pretty cheap to pass down a master seed value from the postmaster,
but with EXEC_BACKEND I think the only reliable way would be to
put the seed value in a file and have backends read it from there.
That's none too cheap either --- for grins, I hacked pg_strong_random.c
to read an ordinary file instead of /dev/urandom, and got runtimes
around 15us, so reading /dev/urandom is really not that much slower than
a plain file.  On the whole I don't find this line of thought attractive,
especially since it's not real clear to me how safe it would be to work
like this rather than have a fully independent seed for each process.
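
To make the rejected alternative concrete, here is a minimal sketch of what "mixing a master seed into the per-process recipe" might look like in the fork() case. Everything in it is hypothetical: the names mix64 and derive_child_seed, the choice of PID and start time as per-process inputs, and the use of the splitmix64 finalizer as the mixing step are assumptions for illustration, not anything from the patch, which seeds each process independently with pg_strong_random. Whether seeds derived this way are safe enough is exactly the open question raised above.

```c
#include <stdint.h>

/*
 * splitmix64 finalizer: a bijective 64-bit mixing step.  Used here only
 * as an illustrative mixer; it is not a cryptographic primitive.
 */
static uint64_t
mix64(uint64_t x)
{
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

/*
 * Hypothetical derivation of a per-process seed from a master seed that
 * the parent read once (and that a forked child inherits for free),
 * combined with values that differ between children.
 */
static uint64_t
derive_child_seed(uint64_t master_seed, uint64_t child_pid,
                  uint64_t start_time_us)
{
    uint64_t h = mix64(master_seed);

    h = mix64(h ^ child_pid);
    h = mix64(h ^ start_time_us);
    return h;
}
```

Because each mixing step is a bijection, two children with different PIDs but the same master seed and start time always get different seeds. That says nothing about unpredictability: anyone who learns the master seed can reproduce every child's seed, which is one reason a fully independent seed per process is easier to reason about.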

Anyway, I've run out of reasons why we might not want to do this,
so I'm going to go ahead and commit it.

            regards, tom lane
