[oops, resent because stalled, wrong From!]
Hello Noah,
>> Thread create time seems to be expensive as well, maybe up to 0.1
>> seconds under some conditions (?). Under --rate, this create delay
>> means that throttling is lagging behind schedule by about that time,
>> so all the first transactions are trying to catch up.
>
> threadRun() already initializes throttle_trigger with a fresh timestamp.
> Please detail how the problem remains despite that.
Indeed, I did this kludge because I could not rely on the "before fork" start
time as it was (possibly) creating a "rush" at the beginning of the run under
--rate.
The version I submitted takes the start time after the thread is created and
uses it directly for throttling. The start time is thus taken once per thread,
instead of being retaken because the first one cannot be relied on.
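To make the idea concrete, here is a minimal sketch, not the submitted patch.
The struct and function names are hypothetical and do not match pgbench's own,
and the fixed interval is an assumption made to keep the example short. The
point is only that the timestamp is taken once, inside the thread, and seeds
both the start time and the throttling schedule.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical per-thread state; names do not match pgbench's structs. */
typedef struct
{
    int64_t start_us;             /* taken once, after the thread exists */
    int64_t throttle_trigger_us;  /* time at which the next transaction is due */
} ThreadClock;

/* Monotonic clock in microseconds. */
static int64_t
clock_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* To be called first thing inside the thread, not before creating it. */
static void
thread_clock_init(ThreadClock *tc)
{
    tc->start_us = clock_us();
    /* Reuse the same timestamp, so the schedule starts with no backlog. */
    tc->throttle_trigger_us = tc->start_us;
}

/*
 * Advance the schedule by one slot and return how long to sleep before the
 * next transaction, or 0 if we are already late. A fixed interval is an
 * assumption of this sketch.
 */
static int64_t
throttle_next(ThreadClock *tc, int64_t interval_us, int64_t now)
{
    int64_t wait;

    tc->throttle_trigger_us += interval_us;
    wait = tc->throttle_trigger_us - now;
    return wait > 0 ? wait : 0;
}
```

With a start time taken before thread creation, the first calls to
throttle_next() would all return 0 until the creation delay is absorbed, which
is the "rush" mentioned above.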
> [...]
Fine detailed analysis!
> Opinions, other ideas?
I do not think that there is a clean and simple way to take the start/stop
periods into account when computing the overall performance of a run. The TPC-C
benchmark says to ignore the warmup/closure periods, whatever they are, and
to take measurements only within the steady state. However, the full graph must
be provided when the benchmark results are reported.
About better measures: if I could rely on having threads, I would simply
synchronise the threads at the beginning so that they actually start after they
are all created, and one thread would decide when to stop and set a shared
volatile variable to stop all transactions more or less at once. In this case,
the thread start time would be taken just after the synchronization, and maybe
taking it in thread 0 only would be enough.
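Roughly, and only as a sketch under assumptions: this uses POSIX
pthread_barrier_wait() for the synchronized start and a C11 atomic_bool in
place of the plain volatile variable. NTHREADS, the Worker struct, and
run_synchronized() are hypothetical names, the "transaction" is a placeholder
counter increment, and error handling is reduced to assert().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

#define NTHREADS 4                  /* hypothetical worker count */

typedef struct
{
    int     id;
    int64_t ntx;                    /* transactions done by this worker */
} Worker;

static pthread_barrier_t start_barrier;
static atomic_bool stop_requested;  /* stands in for the shared volatile */
static int64_t run_start_us;        /* written by thread 0 only */
static int64_t run_duration_us;     /* set before the threads are created */

static int64_t
now_us(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t) ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

static void *
worker_main(void *arg)
{
    Worker *w = arg;

    /* Nobody starts before every worker has been created. */
    pthread_barrier_wait(&start_barrier);
    if (w->id == 0)
        run_start_us = now_us();    /* start time taken after the sync */

    while (!atomic_load(&stop_requested))
    {
        w->ntx++;                   /* placeholder for one transaction */
        /* Thread 0 alone decides when the run is over. */
        if (w->id == 0 && now_us() - run_start_us >= run_duration_us)
            atomic_store(&stop_requested, true);
    }
    return NULL;
}

/* Run the workers for about duration_us; return the total transaction count. */
static int64_t
run_synchronized(int64_t duration_us, int64_t *elapsed_us)
{
    pthread_t threads[NTHREADS];
    Worker    workers[NTHREADS];
    int64_t   total = 0;
    int       rc;

    run_duration_us = duration_us;
    atomic_store(&stop_requested, false);
    rc = pthread_barrier_init(&start_barrier, NULL, NTHREADS);
    assert(rc == 0);

    for (int i = 0; i < NTHREADS; i++)
    {
        workers[i].id = i;
        workers[i].ntx = 0;
        rc = pthread_create(&threads[i], NULL, worker_main, &workers[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < NTHREADS; i++)
    {
        pthread_join(threads[i], NULL);
        total += workers[i].ntx;
    }
    /* Safe to read run_start_us here: thread 0 has been joined. */
    *elapsed_us = now_us() - run_start_us;
    pthread_barrier_destroy(&start_barrier);
    return total;
}
```

Calling run_synchronized(50000, &elapsed) would run the workers for about 50 ms.
The thread creation time falls before the barrier, so it is excluded from the
measured period.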
Note that this is pretty standard stuff with threads. ISTM that it would solve
most of the issues, *but* this is not possible with the "thread fork emulation"
implemented by pgbench, which really means no threads at all.
A possible compromise would be to do just that when actual threads are used,
and leave it more or less as it is when fork emulation is on...
--
Fabien.