Thread: Fix pgbench --progress report under (very) low rate

Fix pgbench --progress report under (very) low rate

From
Fabien COELHO
Date:
When running with low rate, the --progress is only printed when there is 
some activity, which makes it quite irregular, including some catching up 
with stupid tps figures.

Shame on me for this "feature" (aka bug) in the first place.

This patch fixes this behavior by considering the next report time as a 
target to meet as well as transaction schedule times.

Before the patch:
 sh> ./pgbench -R 0.5 -T 10 -P 1 -S progress: 1.7 s, 0.6 tps, lat 6.028 ms stddev -nan, lag 1.883 ms progress: 2.2 s,
2.3tps, lat 2.059 ms stddev -nan, lag 0.530 ms progress: 7.3 s, 0.4 tps, lat 2.944 ms stddev 1.192, lag 2.624 ms
progress:7.3 s, 1402.5 tps, lat 5.115 ms stddev 0.000, lag 0.000 ms progress: 7.3 s, 0.0 tps, lat -nan ms stddev -nan,
laginf ms progress: 7.3 s, 335.2 tps, lat 3.106 ms stddev 0.000, lag 0.000 ms progress: 8.8 s, 0.0 tps, lat -nan ms
stddev-nan, lag inf ms progress: 8.8 s, 307.6 tps, lat 4.855 ms stddev -nan, lag 0.000 ms progress: 10.0 s, 0.0 tps,
lat-nan ms stddev -nan, lag -nan ms
 

After the patch:
 sh> ./pgbench -R 0.5 -T 10 -P 1 -S progress: 1.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms progress: 2.0 s, 1.0
tps,lat 5.980 ms stddev 0.000, lag 0.733 ms progress: 3.0 s, 1.0 tps, lat 1.905 ms stddev 0.000, lag 0.935 ms progress:
4.0s, 1.0 tps, lat 3.966 ms stddev 0.000, lag 0.623 ms progress: 5.0 s, 2.0 tps, lat 2.527 ms stddev 1.579, lag 0.512
msprogress: 6.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms progress: 7.0 s, 0.0 tps, lat -nan ms stddev -nan, lag
-nanms progress: 8.0 s, 1.0 tps, lat 1.750 ms stddev 0.000, lag 0.767 ms progress: 9.0 s, 0.0 tps, lat -nan ms stddev
-nan,lag -nan ms progress: 10.0 s, 2.0 tps, lat 2.403 ms stddev 1.386, lag 0.357 ms
 

To answer a question before it is asked: I run low rates because I'm 
looking at latency (rather than throughput) under different conditions. 
For instance with the above tests, the latency is about 3 ms, but it 
varies with the tps: (0.5 tps => 3 ms, 10 tps => 1 ms, 50 tps => 0.8 ms,
100 tps => 0.5 ms, 200 tps => 0.75 ms, 1000 tps => 0.5 ms...).

-- 
Fabien.

Re: Fix pgbench --progress report under (very) low rate

From
Robert Haas
Date:
On Sun, Mar 22, 2015 at 6:12 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote:
> When running with low rate, the --progress is only printed when there is
> some activity, which makes it quite irregular, including some catching up
> with stupid tps figures.
>
> Shame on me for this "feature" (aka bug) in the first place.
>
> This patch fixes this behavior by considering the next report time as a
> target to meet as well as transaction schedule times.
>
> Before the patch:
>
>  sh> ./pgbench -R 0.5 -T 10 -P 1 -S
>  progress: 1.7 s, 0.6 tps, lat 6.028 ms stddev -nan, lag 1.883 ms
>  progress: 2.2 s, 2.3 tps, lat 2.059 ms stddev -nan, lag 0.530 ms
>  progress: 7.3 s, 0.4 tps, lat 2.944 ms stddev 1.192, lag 2.624 ms
>  progress: 7.3 s, 1402.5 tps, lat 5.115 ms stddev 0.000, lag 0.000 ms
>  progress: 7.3 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
>  progress: 7.3 s, 335.2 tps, lat 3.106 ms stddev 0.000, lag 0.000 ms
>  progress: 8.8 s, 0.0 tps, lat -nan ms stddev -nan, lag inf ms
>  progress: 8.8 s, 307.6 tps, lat 4.855 ms stddev -nan, lag 0.000 ms
>  progress: 10.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
>
> After the patch:
>
>  sh> ./pgbench -R 0.5 -T 10 -P 1 -S
>  progress: 1.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
>  progress: 2.0 s, 1.0 tps, lat 5.980 ms stddev 0.000, lag 0.733 ms
>  progress: 3.0 s, 1.0 tps, lat 1.905 ms stddev 0.000, lag 0.935 ms
>  progress: 4.0 s, 1.0 tps, lat 3.966 ms stddev 0.000, lag 0.623 ms
>  progress: 5.0 s, 2.0 tps, lat 2.527 ms stddev 1.579, lag 0.512 ms
>  progress: 6.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
>  progress: 7.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
>  progress: 8.0 s, 1.0 tps, lat 1.750 ms stddev 0.000, lag 0.767 ms
>  progress: 9.0 s, 0.0 tps, lat -nan ms stddev -nan, lag -nan ms
>  progress: 10.0 s, 2.0 tps, lat 2.403 ms stddev 1.386, lag 0.357 ms

I haven't had time to really review the code here (except to notice
that you have a typo: "nedded") but the idea of it seems good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fix pgbench --progress report under (very) low rate

From
Fabien COELHO
Date:
> I haven't had time to really review the code here (except to notice
> that you have a typo: "nedded") but the idea of it seems good.

v3 rebase (after pgbench moved to src/bin) and minor style tweaking.

-- 
Fabien.

Re: Fix pgbench --progress report under (very) low rate

From
Fabien COELHO
Date:
> v3 rebase (after pgbench moved to src/bin) and minor style tweaking.

v4 adds a fix to another progress timing issue:

Currently if pgbench/postgres get stuck somewhere, the report catches up 
by repeating progresses several time in a row, which looks like that:
  progress: 10.0 s ...  progress: 11.0 s ... stuck...  progress: 14.2 s catchup for 11.0 -> 14.2  progress: 14.2 s
stupiddata  progress: 14.2 s stupid data  progress: 15.0 s ...  progress: 16.0 s ...
 

The correction removes the "stupid data" lines which compute a reports on 
a very short time, including absurd tps figures.

Yet again, shame on me in the first place for this behavior.

-- 
Fabien.

Re: Fix pgbench --progress report under (very) low rate

From
Heikki Linnakangas
Date:
On 06/15/2015 09:12 PM, Fabien COELHO wrote:
>
>> v3 rebase (after pgbench moved to src/bin) and minor style tweaking.
>
> v4 adds a fix to another progress timing issue:
>
> Currently if pgbench/postgres get stuck somewhere, the report catches up
> by repeating progresses several time in a row, which looks like that:
>
>     progress: 10.0 s ...
>     progress: 11.0 s ... stuck...
>     progress: 14.2 s catchup for 11.0 -> 14.2
>     progress: 14.2 s stupid data
>     progress: 14.2 s stupid data
>     progress: 15.0 s ...
>     progress: 16.0 s ...
>
> The correction removes the "stupid data" lines which compute a reports on
> a very short time, including absurd tps figures.
>
> Yet again, shame on me in the first place for this behavior.

Thanks, applied. I chose to also backpatch this, although arguably this 
is a change in behaviour that would not be good to change in a minor 
version. However, progress reports are a very user-facing feature, it's 
not going to break anyone's scripts.

- Heikki