Re: pgbench progress report improvements - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: pgbench progress report improvements |
Date | |
Msg-id | alpine.DEB.2.02.1309220850080.18614@sto Whole thread Raw |
In response to | Re: pgbench progress report improvements (Noah Misch <noah@leadboat.com>) |
Responses |
Re: pgbench progress report improvements
|
List | pgsql-hackers |
Dear Noah, Thanks for your answers and remarks, [...] I'll split some part of the patch where there is no coupling, but I do not want to submit conflicting patches. > Those benefits aren't compelling enough to counterbalance the risk of > gettimeofday() overhead affecting results. (Other opinions welcome.) Yep. If I really leave gettimeofday out, I cannot measure the stddev, so I'll end up with: --rate => gettimeofday, mean (& stddev) measured, because they cannot be derived otherwise. no --rate => no (or less) gettimeofday, mean computed from total time and no stddev report because it cannot be computed. That is annoying, because I do want the standard deviation, and this mean one "if"s complexity here and there. ISTM that when running under a target time, the (hopefully very small) overhead is only one gettimeofday call, because the other one is taken anyway to check whether the thread should stop. Or I can add a yet another option, say --stddev, to ask for standard deviation, which will imply the additional gettimeofday call... > For a tool like pgbench that requires considerable skill to use > effectively, changing a default only helps slightly. It doesn't take > much of a risk to make us better off leaving the default unchanged. I can put a 0 default... but even very experienced people will be bitten over and over. Why should I care? ISTM that the default should be the safe option, and experienced user can request "-quiet" if they want it. >> [...] >> I tried to preserve the row-counting behavior because I thought that >> someone would object otherwise, but I would be really in favor of >> dropping the row-counting report behavior altogether and only keep the >> time triggered report. > > I would be fine with dropping the row-counting behavior. But why subject this > distant side issue to its third round of bikeshedding in 18 months? I was not involved:-) The 100000 behavior is the initial & old version, and only applies to initialization. Someone found it too verbose when scaling, and I agree, so made a quick patch which preserves the old behavior (someone must have said: whatever, do not change the default!) but allowed to switch to a less noisy version, hence the -quiet which is not really quiet. This would be the defective result of a compromise:-) If I follow your request not to change any default, I cannot merge cleanly the -i & bench behaviors, as currenty -i does have a default progress report and its own -quiet, and benchmarking does not. The current behavior is inconsistent. I would prefer something consistent, preferably always show a progress report, and -quiet hides it (fully), or if you really insist no progress report, and --progress shows it, and the -quiet option is removed. >>>> - Take thread start time at the beginning of the thread. >>> >>> That theory is sound, but I would like at least one report reproducing that >>> behavior and finding that this change improved it. > > [...] so I'm inclined to leave it alone. I really spent *hours* debugging stupid measures on the previous round of pgbench changes, when adding the throttling stuff. Having the start time taken when the thread really starts is just sanity, and I needed that just to rule out that it was the source of the "strange" measures. -j 800 vs -j 100 : ITM that if I you create more threads, the time delay incurred is cumulative, so the strangeness of the result should worsen. 800 threads ~ possibly 10 seconds of creation time, to be compared to a few seconds of run time. >>> Shouldn't it just be: >>> >>> int64 wait = (int64) (throttle_delay * >>> -log(1 - pg_erand48(thread->random_state))); >> [...] > Ah; that makes sense. Then I understand why you want to remove the bias, but > why do you also increase the upper bound? Because the bias was significantly larger for 1000 (about 0.5%), so this also contributed to reduce said bias, and 9.2 times the average target seems as reasonnable a bound as 6.9. >> It is also printed without --rate. There is a "if" above because there is >> one report with "lag" (under --rate), and one without. > > The code I quoted is for the final report in printResults(), and that only > shows latency mean/stddev when using --rate. The progress reporting in > threadRun() does have two variations as you describe. Indeed, I took it for the progress report. I'll check. It must be consistent whether under --rate or not. -- Fabien.
pgsql-hackers by date: