On Fri, 2005-03-04 at 20:10 -0500, Greg Stark wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
>
> > Amdahl's Law tells me that looking at the checkpoints is the next best
> > action for tuning, since they add considerably to the average response
> > time. Looking at the oprofile for the run as a whole is missing out the
> > delayed transaction behaviour that occurs during checkpoints.
>
> Even aside from the effect it has on average response time, I would point out
> that many applications are governed by the worst case more than the average
> throughput.
>
> For a web server, for example (or any OLTP application in general), it doesn't
> matter if the database can handle x transactions/s on average. What matters is
> that 100% of the time the latency is low enough to keep pace with the actual
> rate of requests. If every 30m the latency suddenly spikes beyond that, even
> for only a minute, then the server will fall behind on the requests. The user
> will effectively see a completely unresponsive web server.
>
> So I would really urge you to focus your attention on the maximum latency
> figure. It's at least as important as, if not *more* important than, the
> average throughput number.

Sorry Greg, clearly my English was poor.

The checkpoints are the source of the peak latency on transactions, so
we are in complete agreement.

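
To make the point about peak latency concrete, here is a rough sketch of how a short stall turns into a long backlog. It is a toy queue model, not a measurement: the function name, the rates, and the assumption that the server completes no transactions at all during the stall are all made up for illustration. A real checkpoint slows transactions rather than halting them.

```python
def simulate_backlog(arrival_rate, service_rate, stall_start, stall_len, duration):
    """Return the number of queued requests at the end of each second.

    Hypothetical model: requests arrive at a constant rate, and the server
    completes `service_rate` requests per second except during the stall,
    when it is assumed to complete none.
    """
    backlog = 0.0
    history = []
    for t in range(duration):
        in_stall = stall_start <= t < stall_start + stall_len
        capacity = 0.0 if in_stall else service_rate
        # Queue grows by arrivals, shrinks by completed work, never below zero.
        backlog = max(0.0, backlog + arrival_rate - capacity)
        history.append(backlog)
    return history


# Illustrative numbers only: 80 req/s arriving, 100 req/s capacity,
# one 60-second stall starting at t = 10.
history = simulate_backlog(arrival_rate=80, service_rate=100,
                           stall_start=10, stall_len=60, duration=400)
peak = max(history)
# First second after the stall ends (t = 70) at which the queue is empty again.
recovery = next(t for t, b in enumerate(history) if t >= 70 and b == 0.0)
print(peak, recovery)  # 4800.0 309
```

With these numbers the server has 20% headroom on average, yet a one-minute stall leaves users queued for a further four minutes, which is why the average figure alone hides the problem.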
> PS That's why I was pushing before for the idea that the server should try to
> spread the I/O from one checkpoint out over more or less the time interval
> between checkpoints. If it's been 30m since the last checkpoint then you have
> about 30m to do the I/O for this checkpoint. (Though I would suggest a safety
> factor of aiming to be finished within 50% of the time.)

I don't want to fix it before I know what the issue is.

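
For reference, the arithmetic behind the pacing idea quoted above can be sketched as follows. This is only my reading of the suggestion, assuming a known count of dirty pages and a steady write rate; the function and parameter names are hypothetical and nothing here reflects what the server currently does.

```python
def checkpoint_write_rate(dirty_pages, interval_s, safety_factor=0.5):
    """Pages per second needed to finish the checkpoint I/O within
    `safety_factor` of the interval between checkpoints.

    Hypothetical helper: assumes the dirty page count is known up front
    and that writes can be issued at a constant rate.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    if not 0 < safety_factor <= 1:
        raise ValueError("safety_factor must be in (0, 1]")
    return dirty_pages / (interval_s * safety_factor)


# Illustrative numbers only: 90,000 dirty pages, checkpoints 30 minutes apart,
# aiming to finish within 50% of that time (15 minutes).
rate = checkpoint_write_rate(dirty_pages=90_000, interval_s=30 * 60)
print(rate)  # 100.0
```

Whether a steady 100 pages/s would actually remove the latency peak is exactly what I want to establish first.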
Best Regards, Simon Riggs