Re: checkpointer continuous flushing - V16 - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: checkpointer continuous flushing - V16
Msg-id alpine.DEB.2.10.1603011644380.18133@sto
In response to Re: checkpointer continuous flushing - V16  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
Hello Tomas,

> One of the goals of this thread (as I understand it) was to make the overall 
> behavior smoother - eliminate sudden drops in transaction rate due to bursts 
> of random I/O etc.
>
> One way to look at this is in terms of how much the tps fluctuates, so let's 
> see some charts. I've collected per-second tps measurements (using the 
> aggregation built into pgbench) but looking at that directly is pretty 
> pointless because it's very difficult to compare two noisy lines jumping up 
> and down.
>
> So instead let's see CDF of the per-second tps measurements. I.e. we have 
> 3600 tps measurements, and given a tps value the question is what percentage 
> of the measurements is below this value.
>
>    y = Probability(tps <= x)
>
> We prefer higher values, and the ideal behavior would be that we get exactly 
> the same tps every second. Thus an ideal CDF line would be a step line. Of 
> course, that's rarely the case in practice. But comparing two CDF curves is 
> easy - the line more to the right is better, at least for tps measurements, 
> where we prefer higher values.

Very nice and interesting graphs!

Alas, they are not easy to interpret for the HDD: some parts of the 
distribution are better and others worse, and the lines cross one another, 
so how it fares overall is unclear.
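
To make the "lines cross" observation concrete, here is a minimal sketch of 
the empirical CDF described above, y = Probability(tps <= x), plus a check 
for whether two runs' curves cross. The function names `ecdf` and 
`curves_cross` are hypothetical, and the inputs are assumed to be plain lists 
of per-second tps values already extracted from the pgbench aggregated log.

```python
from bisect import bisect_right


def ecdf(samples):
    """Return F where F(x) is the fraction of samples <= x."""
    ordered = sorted(samples)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x
        return bisect_right(ordered, x) / n

    return F


def curves_cross(a, b):
    """True if neither empirical CDF is on or to the right of the other everywhere."""
    fa, fb = ecdf(a), ecdf(b)
    # The step functions only change at observed values, so checking those suffices.
    points = sorted(set(a) | set(b))
    # A lower CDF value at x means less mass at or below x, i.e. a curve further right.
    a_right_somewhere = any(fa(x) < fb(x) for x in points)
    b_right_somewhere = any(fb(x) < fa(x) for x in points)
    return a_right_somewhere and b_right_somewhere
```

When `curves_cross` returns true, the "line more to the right is better" rule 
gives no overall winner, which is the situation with the HDD charts.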

Maybe a simple indication would be to compute the standard deviation of 
the per-second tps? The median may be interesting as well.
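
Something along these lines, as a rough sketch only: it assumes the 
per-second tps values are already in a list, `summarize_tps` is a 
hypothetical name, and the choice of the population standard deviation and 
the extra coefficient of variation (stdev / mean) are my assumptions, added 
so that runs with different average tps can be compared.

```python
import statistics


def summarize_tps(samples):
    """Summarize per-second tps measurements with a few scalar indicators."""
    mean = statistics.fmean(samples)
    # Population standard deviation: the samples are treated as the whole run.
    stdev = statistics.pstdev(samples)
    return {
        "mean": mean,
        "median": statistics.median(samples),
        "stdev": stdev,
        # Relative spread; undefined when the mean is zero.
        "cv": stdev / mean if mean else float("nan"),
    }
```

A smaller stdev (or cv) at a similar median would suggest smoother behavior, 
although a single number hides where in the distribution the variation is.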

> I do have some more data, but those are the most interesting charts. The rest 
> usually shows about the same thing (or nothing).
>
> Overall, I'm not quite sure the patches actually achieve the intended goals. 
> On the 10k SAS drives I got better performance, but apparently much more 
> variable behavior. On SSDs, I get a bit worse results.

Indeed.

-- 
Fabien.


