Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: checkpointer continuous flushing
Msg-id alpine.DEB.2.10.1603180856300.31871@sto
In response to Re: checkpointer continuous flushing  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: checkpointer continuous flushing
List pgsql-hackers
Hello Tomas,

> But I do think it's a very useful tool when it comes to measuring the 
> consistency of behavior over time, assuming you're asking questions 
> about the intervals and not the original transactions.

For a throttled run, I think it is better to check whether or not the 
system could handle the load "as expected", i.e. with reasonable latency, 
so I am interested in the "original transactions" as scheduled by 
the client, and whether they were processed efficiently; these must then 
be aggregated by interval to get some statistics.

> For example, had there been intervals with vastly different transaction 
> rates, we'd see that on the tps charts (i.e. the chart would be much more 
> gradual or wobbly, just like the "unpatched" one). Or if there were intervals 
> with much higher variance of latencies, we'd see that on the STDDEV chart.

On HDDs what happens is that transactions are "blocked/frozen": the tps 
is very low and the latency very high, but with few transactions (even 1 or 
0 at a time) whose latencies are all very bad yet close to one another, 
the resulting stddev may be quite small anyway.
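
To illustrate with made-up numbers (a minimal Python sketch, not 
measurements from this benchmark): a stalled interval with three uniformly 
slow transactions can show a smaller stddev than a healthy interval, even 
though its mean latency is about a thousand times worse. The latency 
values and the summarize() helper are assumptions for illustration only.

```python
from statistics import mean, pstdev

# Hypothetical latencies in milliseconds, invented for illustration only.
healthy_interval = [2.0, 3.5, 1.8, 12.0, 2.2, 4.1, 2.9, 8.7]
stalled_interval = [4100.0, 4102.0, 4101.0]  # few transactions, all slow


def summarize(latencies_ms):
    """Return (count, mean, population stddev) for one interval."""
    return len(latencies_ms), mean(latencies_ms), pstdev(latencies_ms)


healthy = summarize(healthy_interval)
stalled = summarize(stalled_interval)

print("healthy: n=%d mean=%.2f stddev=%.2f" % healthy)
print("stalled: n=%d mean=%.2f stddev=%.2f" % stalled)
```

So a flat STDDEV chart alone does not rule out a frozen system; it has to 
be read together with the per-interval transaction count and mean latency.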

> I'll consider repeating the benchmark and logging some reasonable sample of 
> transactions

Beware that this measure is skewed, because on HDDs when the system is 
stuck, it is stuck on very few transactions which are waiting, and they
would seldom show in the statistics as there are very few of them. That is 
why I'm interested in those that could not make it, hence my interest in 
the --latency-limit option, which reports just that.
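
A rough sketch of the skew, again in Python with invented numbers: in a 
hypothetical 10-second run where the system is stuck for one second, the 
slow transactions are a negligible share of all transactions, although the 
stall covers 10% of the run. The interval tuples below are assumptions, 
not data from the benchmark.

```python
# Hypothetical 10-second run, numbers invented for illustration only.
# Each entry is (interval_length_s, completed_transactions, slow_transactions).
intervals = [(1.0, 4000, 0)] * 9 + [(1.0, 2, 2)]

total_tx = sum(n for _, n, _ in intervals)
slow_tx = sum(s for _, _, s in intervals)
total_time = sum(length for length, _, _ in intervals)
stalled_time = sum(length for length, _, s in intervals if s > 0)

# Share of slow transactions: what a sample of transactions would reveal.
slow_tx_share = slow_tx / total_tx
# Share of time spent stalled: what the clients actually experienced.
stalled_time_share = stalled_time / total_time

print("slow transactions: %.4f%% of all transactions" % (100 * slow_tx_share))
print("stalled time:      %.1f%% of the run" % (100 * stalled_time_share))
```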

>>> So I don't think this would make any measurable difference in practice.
>> 
>> I think that it may show that 25% of the time the system could not
>> match the target tps, even if it can handle much more on average, so
>> the tps achieved when discarding late transactions would be under
>> 4000 tps.
>
> You mean the 'throttled-tps' chart?

Yes.

> Yes, that one shows that without the patches, there's a lot of intervals 
> where the tps was much lower - presumably due to a lot of slow 
> transactions.

Yep. That is what is measured with the latency limit option, by counting 
the dropped transactions that were not processed in a timely manner.
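
The kind of per-interval accounting I have in mind can be sketched as 
follows. This is a simplified Python model, not pgbench's actual code: the 
function name, the (scheduled_start_s, latency_ms) record layout and the 
sample values are all assumptions. A latency of None stands for a 
transaction that was never processed.

```python
from collections import defaultdict


def late_fraction_by_interval(transactions, latency_limit_ms, interval_s=1.0):
    """Group transactions by scheduled start time and return, per interval,
    the fraction that exceeded the latency limit or were never processed.

    transactions: iterable of (scheduled_start_s, latency_ms or None).
    """
    scheduled = defaultdict(int)
    late = defaultdict(int)
    for start_s, latency_ms in transactions:
        bucket = int(start_s // interval_s)
        scheduled[bucket] += 1
        if latency_ms is None or latency_ms > latency_limit_ms:
            late[bucket] += 1
    return {b: late[b] / scheduled[b] for b in sorted(scheduled)}


# Hypothetical sample: second 0 is healthy, second 1 contains a stall.
sample = [
    (0.1, 3.0), (0.4, 5.0), (0.9, 2.0),
    (1.2, 4000.0), (1.5, None), (1.8, 6.0),
]
result = late_fraction_by_interval(sample, latency_limit_ms=100.0)
for bucket, fraction in result.items():
    print("interval %d: %.0f%% late" % (bucket, 100 * fraction))
```

Because the count is taken against the transactions as scheduled, a stalled 
interval shows up as a high late fraction even when very few transactions 
actually completed in it.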

-- 
Fabien.


