Thread: PgBench's \setrandom could be better

PgBench's \setrandom could be better

From
David Rowley
Date:
Hi,

I've just been looking into the TPC-H benchmark with intention to put together some scripts which can be run easily to output some run times for each of the 22 queries.

I've not had a great deal of exposure to pgbench yet, but I had thought that it might be able to help me run these queries with the randomly chosen inputs that are described in the TPC-H spec.

From the spec under 2.4.6.3:

1. DATE is the first of January of a randomly selected year within [1993 .. 1997]; 
2. DISCOUNT is randomly selected within [0.02 .. 0.09]; 
3. QUANTITY is randomly selected within [24 .. 25]. 

With pgbench I can do:

\setrandom qty 24 25
then throw a :qty into the query and have pgbench replace that with either 24 or 25. 

The problem is that this does not work for anything apart from integer types.

I also can't seem to do anything with the date type. Perhaps it would be nice to be able to do something like:

\setrandom date '1993-01-01' '1994-01-01' '1995-01-01' '1996-01-01' '1997-01-01'

But this would be ambiguous if I wanted to choose either from either 10 or 20 as:

\setrandom var 10 20 

would choose a random number between 10 and 20, not 10 or 20.

I would imagine that code could be added to allow the random to choose a floating point number, if one or more of the inputs had a decimal point in them, but I can't see around how to make it work with lists without changing the syntax.

Perhaps [between] could become a optional keyword in there so that we could do:

\setrandom var between 0 10

And for lists it could work like:

\setrandom var list 'a' 'b' 'c'

There should be no conflicts with the original syntax, as as far as I'm aware only an integer number can follow the variable name.

I was planning on putting something together to help me with this, but I thought I'd post here first so that I might actually end up with something useful at the end of it.

So my questions are:

1. Would anyone else find this useful?
2. Is my proposed syntax ok? Can you think of anything better or more flexible?

Regards

David Rowley

Re: PgBench's \setrandom could be better

From
Robert Haas
Date:
On Sat, Nov 15, 2014 at 5:51 PM, David Rowley <dgrowleyml@gmail.com> wrote:
> With pgbench I can do:
>
> \setrandom qty 24 25
> then throw a :qty into the query and have pgbench replace that with either
> 24 or 25.
>
> The problem is that this does not work for anything apart from integer
> types.

I've got a pending patch here - which I don't think anyone has
reviewed - that adds an expression syntax to pgbench:

https://commitfest.postgresql.org/action/patch_view?id=1581

Once that patch goes in, I think we should generalize it so that the
\set syntax can use not only operators but also functions, like abs().
We could also generalize it to work on data types other than integers,
such as strings or (like you're suggesting here) dates, so that you
could have a random_date() function.

In general I'm pretty skeptical about continuing to expend pgbench by
adding more backslash commands and complicating the syntax of the ones
we've already got.  We can certainly do it that way, but it's kind of
a mess.  I'd rather have a more general framework to build on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company