Re: pgbench - implement strict TPC-B benchmark - Mailing list pgsql-hackers

From Andres Freund
Subject Re: pgbench - implement strict TPC-B benchmark
Msg-id 20190805162416.epmdegr2amf66sv5@alap3.anarazel.de
In response to Re: pgbench - implement strict TPC-B benchmark  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: pgbench - implement strict TPC-B benchmark  (Fabien COELHO <coelho@cri.ensmp.fr>)
List pgsql-hackers
Hi,

On 2019-08-05 17:38:23 +0200, Fabien COELHO wrote:
> Which is a (somehow disappointing) * 3.3 speedup. The impact on the 3
> complex expressions tests is not measurable, though.

I don't know why that could be disappointing. We put in much more work
for much smaller gains in other places.


> Probably variable management should be reworked more deeply to cleanup the
> code.

Agreed.


> Questions:
>  - how likely is such a patch to pass? (IMHO not likely)

I don't see why. I didn't review the patch in any detail, but it didn't
look crazy on a quick skim. Increasing how much load can be simulated
using pgbench is something I personally find much more interesting than
adding capabilities that very few people will ever use.

FWIW, the areas I find current pgbench "most lacking" during development
work are:

1) Data load speed. The data creation is bottlenecked on fprintf in a
   single process. The index builds are done serially. The vacuum could
   be replaced by COPY FREEZE.  For a lot of meaningful tests one needs
   tens to thousands of GB of test data - creating that is pretty painful.

2) Lack of proper initialization integration for custom
   scripts. I.e. allow initialization (-i), vacuum, etc. to be steps
   defined in the custom script itself, rather than separately
   executed steps. --init-steps doesn't do anything for that.

3) pgbench overhead, although that's to a significant degree libpq's fault.

4) Ability to cancel pgbench and get approximate results. That currently
   works if the server kicks out the clients, but not when interrupting
   pgbench - which is just plain weird.  Obviously that doesn't matter
   for "proper" benchmark runs, but often during development, it's
   enough to run pgbench past some events (say the next checkpoint).
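
To make point 4 concrete, here is a minimal sketch of "interrupt and
still report approximate results". It is not pgbench code (pgbench is
C); every name in it (StopFlag, install_sigint_handler, run_benchmark,
report) is hypothetical, and do_transaction stands in for whatever a
client would execute. The idea is only that the interrupt handler sets
a flag, the run loop polls it, and the normal reporting path still runs.

```python
import signal
import time


class StopFlag:
    """Hypothetical holder for a flag that the SIGINT handler sets."""

    def __init__(self):
        self.flag = False


def install_sigint_handler(stop):
    """Make Ctrl-C set stop.flag instead of raising KeyboardInterrupt."""
    def handler(signum, frame):
        stop.flag = True
    signal.signal(signal.SIGINT, handler)


def run_benchmark(do_transaction, duration_s, stop):
    """Run do_transaction() until duration_s elapses or stop.flag is set.

    Returns (transactions_completed, elapsed_seconds, was_interrupted).
    """
    start = time.monotonic()
    done = 0
    while not stop.flag and time.monotonic() - start < duration_s:
        do_transaction()
        done += 1
    return done, time.monotonic() - start, stop.flag


def report(done, elapsed, interrupted):
    """Format results; an interrupted run is labelled as approximate."""
    tps = done / elapsed if elapsed > 0 else 0.0
    label = "approximate (interrupted)" if interrupted else "final"
    return f"{label}: {done} transactions in {elapsed:.2f}s, tps = {tps:.1f}"
```

Usage would be along the lines of creating a StopFlag, calling
install_sigint_handler(stop), and then printing
report(*run_benchmark(txn, 60.0, stop)); interrupting the run then
prints the numbers gathered so far rather than nothing.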


>  - what is its impact to overall performance when actual queries
>    are performed (IMHO very small).

Obviously not huge, but I'd not expect it to be unobservably small
either. Especially if somebody went and fixed some of the inefficiency
in libpq, but even without that. And even more so if somebody revived
the libpq batch work + the relevant pgbench patch, because that removes
a lot of the system/kernel overhead due to the crazy number of context
switches (obviously not realistic for all workloads, but it is for
plenty of Java workloads), while leaving the same number of variable
accesses etc.
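
The argument above can be put as a toy cost model. This is only a
sketch: the function name and all the microsecond figures in the usage
note are made up for illustration, not measurements of pgbench or
libpq. It shows why batching, which cuts round trips but not
client-side variable handling, makes that client-side work a larger
share of each transaction.

```python
def client_overhead_share(client_us, roundtrip_us, server_us,
                          statements, batch_size):
    """Fraction of per-transaction time spent in client-side script work.

    client_us:    assumed client cost per statement (variable lookups etc.)
    roundtrip_us: assumed cost per round trip (syscalls, context switches)
    server_us:    assumed server execution cost per statement
    statements:   statements per transaction
    batch_size:   statements sent per round trip (1 means no batching)
    """
    roundtrips = -(-statements // batch_size)  # ceiling division
    client = client_us * statements
    total = client + roundtrip_us * roundtrips + server_us * statements
    return client / total
```

For example, with the purely hypothetical inputs client_us=1,
roundtrip_us=50, server_us=20 and statements=7, comparing batch_size=1
against batch_size=7 gives a larger client share in the batched case,
because the round-trip term shrinks while the client term stays the
same.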

Greetings,

Andres Freund


