Re: CPU costs of random_zipfian in pgbench - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: CPU costs of random_zipfian in pgbench
Date	February 17, 2019 16:09:27
Msg-id	6065.1550419767@sss.pgh.pa.us
In response to	Re: CPU costs of random_zipfian in pgbench (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses	Re: CPU costs of random_zipfian in pgbench Re: CPU costs of random_zipfian in pgbench Re: CPU costs of random_zipfian in pgbench
List	pgsql-hackers

Fabien COELHO <coelho@cri.ensmp.fr> writes:
>> I'm trying to use random_zipfian() for benchmarking of skewed data sets, 
>> and I ran head-first into an issue with rather excessive CPU costs. 

> If you want skewed but not especially zipfian, use exponential which is 
> quite cheap. Also zipfian with a > 1.0 parameter does not have to compute 
> the harmonic number, so it depends on the parameter.

Maybe we should drop support for parameter values < 1.0, then.  The idea
that pgbench is doing something so expensive as to require caching seems
flat-out insane from here.  That cannot be seen as anything but a foot-gun
for unwary users.  Under what circumstances would an informed user use
that random distribution rather than another far-cheaper-to-compute one?
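
[Editorial illustration, not part of the original message.] The cost
difference under discussion can be sketched roughly as follows. This is
a minimal Python sketch under stated assumptions, not pgbench's actual
C code: the function names generalized_harmonic and
truncated_exponential are hypothetical. It assumes that the expensive
step for zipfian parameters below 1.0 is a generalized harmonic number
summed over the whole value range, and it contrasts that with a skewed
draw obtained by inverse-transform sampling from a truncated
exponential.

```python
import math
import random


def generalized_harmonic(n: int, s: float) -> float:
    """Return H(n, s) = sum over i = 1..n of 1 / i**s.

    Every term has to be evaluated, so the cost grows linearly with n.
    """
    return math.fsum(1.0 / i ** s for i in range(1, n + 1))


def truncated_exponential(rng: random.Random, lo: int, hi: int, rate: float) -> int:
    """Draw a skewed integer in [lo, hi] in constant time.

    Assumes rate > 0. Values near lo are the most likely ones.
    """
    u = rng.random()           # uniform in [0, 1)
    cut = math.exp(-rate)      # probability mass cut off beyond 1.0
    # Invert the CDF of an exponential distribution truncated to [0, 1).
    x = -math.log(1.0 - u * (1.0 - cut)) / rate
    # Scale to the integer range; clamp to guard against rounding up to hi + 1.
    return min(hi, lo + int(x * (hi - lo + 1)))
```

In this sketch, each distinct (n, s) pair costs n floating-point power
evaluations before the first zipfian value can be produced, which is
the kind of work a cache would be meant to amortize. The exponential
draw needs one uniform variate, one exp and one log, whatever the size
of the range.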

> ... This is why I submitted a pseudo-random permutation 
> function, which alas does not get much momentum from committers.

TBH, I think pgbench is now much too complex; it does not need more
features, especially not ones that need large caveats in the docs.
(What exactly is the point of having zipfian at all?)

            regards, tom lane
