Re: pgbench throttling latency limit - Mailing list pgsql-hackers
From | Jan Wieck |
---|---|
Subject | Re: pgbench throttling latency limit |
Date | |
Msg-id | 54109D30.2090202@wi3ck.info |
In response to | Re: pgbench throttling latency limit (Heikki Linnakangas <hlinnakangas@vmware.com>) |
List | pgsql-hackers |
On 09/10/2014 11:28 AM, Heikki Linnakangas wrote:
> On 09/10/2014 05:57 PM, Fabien COELHO wrote:
>>
>> Hello Heikki,
>>
>>> I looked closer at this, and per Jan's comments, realized that we don't
>>> log the lag time in the per-transaction log file. I think that's a serious
>>> omission; when --rate is used, the schedule lag time is important information
>>> to make sense of the result. I think we have to apply the attached patch, and
>>> backpatch to 9.4. (The documentation on the log file format needs updating)
>>
>> Indeed. I think that people do not like it to change. I remember that I
>> suggested changing timestamps to "xxxx.yyyyyy" instead of the unreadable
>> "xxxx yyy", and was told not to, because some people have tools which
>> process the output, so the format MUST NOT CHANGE. So my behavior is now to
>> avoid touching anything in this area.
>>
>> I'm fine if you do it, though :-) However, I have no time to take a precise
>> look at your patch to cross-check it before Friday.
>
> This is different from changing "xxx yyy" to "xxx.yyy" in two ways.
> First, this adds new information to the log that just isn't there
> otherwise; it's not just changing the format for the sake of it. Second,
> in this case it's easy to write a parser for the log format so that it
> works with the old and new formats. Many such programs would probably
> ignore any unexpected extra fields, as a matter of lazy programming,
> while changing the separator from space to a dot would surely require
> changing every parsing program.
>
> We could leave out the lag fields, though, when --rate is not used.
> Without --rate, the lag is always zero anyway. That would keep the file
> format unchanged when you don't use the new --rate feature. I'm not
> sure if that would be better or worse for programs that might want to
> parse the information.

We could also leave the default output format as is and introduce
another option with a % style format string.
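
To illustrate the point quoted above that a parser can accept both the old and the new log format, here is a minimal sketch in C. It assumes a space-separated record of six numeric fields followed by an optional seventh field holding the schedule lag. The struct, the field names and their order, and the function `parse_log_line` are hypothetical, chosen for illustration and not taken from the patch under discussion.

```c
#include <stdio.h>

/* One per-transaction log record.
 * Field names and order are assumptions made for this sketch. */
typedef struct
{
    int       client_id;
    long long xact_no;
    long long latency_us;
    int       file_no;
    long long epoch_sec;
    long long epoch_us;
    int       has_lag;   /* 1 if the optional lag field was present */
    long long lag_us;    /* 0 when the lag field is absent */
} LogRecord;

/* Parse one log line.
 * Returns 1 on success, 0 if fewer than six numeric fields were found.
 * A seventh field, if present, is stored as the schedule lag;
 * anything after it is ignored. */
int
parse_log_line(const char *line, LogRecord *rec)
{
    long long lag = 0;
    int n = sscanf(line, "%d %lld %lld %d %lld %lld %lld",
                   &rec->client_id, &rec->xact_no, &rec->latency_us,
                   &rec->file_no, &rec->epoch_sec, &rec->epoch_us, &lag);

    if (n < 6)
        return 0;

    rec->has_lag = (n >= 7);
    rec->lag_us = rec->has_lag ? lag : 0;
    return 1;
}
```

Because `sscanf` reports how many fields it converted, the same call handles both layouts: a return value of 6 means an old-format line, 7 means the lag field was present.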

Jan

>
>>> Also, this is bizarre:
>>>
>>>> int64 wait = (int64) (throttle_delay *
>>>>     1.00055271703 * -log(getrand(thread, 1, 10000) / 10000.0));
>>>
>>> We're using getrand() to generate a uniformly distributed random value
>>> between 1 and 10000, and then convert it to a double between 0.0 and 1.0.
>>
>> The reason for this conversion is to have randomness but still avoid
>> extreme multiplier values. The idea is to avoid a very large
>> multiplier which would generate (even if not often) 0 tps when 100
>> tps is required. The 10000 granularity is basically arbitrary, but the
>> multiplier stays bounded (max 9.2, so the minimum possible tps would be
>> around 11 for a target of 100 tps, barring issues from the database in
>> processing the transactions).
>>
>> So although this feature can be discussed and amended, it is fully
>> intentional. I think that it makes sense, so I would prefer to keep it as is.
>> Maybe the comments could be updated to be clearer.
>
> Ok, yeah, the comments indeed didn't mention anything about that. I
> don't think such clamping is necessary. With that 9.2x clamp on the
> multiplier, the probability that any given transaction hits it is about
> 1/10000. And a delay 9.2 times the average is still quite reasonable,
> IMHO. The maximum delay on my laptop, when pg_erand48() returns DBL_MIN,
> seems to be about 700x the average, which is still reasonable if you
> run a decent number of transactions. And of course, the probability of
> hitting such an extreme value is minuscule, so if you're just doing a
> few quick test runs with a small total number of transactions, you won't
> hit that.
>
> - Heikki
>

--
Jan Wieck
Senior Software Engineer
http://slony.info