Gregory Stark wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> Now that the checkpoints are spread out more, the response times are very
>> smooth.
>
> So obviously the reason the results are so dramatic is that the checkpoints
> used to push the i/o bandwidth demand up over 100%. By spreading it out you
> can see in the io charts that even during the checkpoint the i/o busy rate
> stays just under 100% except for a few data points.
>
> If I understand it right Greg Smith's concern is that in a busier system where
> even *with* the load distributed checkpoint the i/o bandwidth demand during
> the checkpoint was *still* being pushed over 100% then spreading out the load
> would only exacerbate the problem by extending the outage.
>
> To that end it seems like what would be useful is a pair of tests with and
> without the patch with about 10% larger warehouse size (~ 115) which would
> push the i/o bandwidth demand up to about that level.

I still don't see how spreading the writes could make things worse, but
running more tests is easy. I'll schedule tests with more warehouses
over the weekend.

> It might even make sense to run a test with an outright overloaded system to see if
> the patch doesn't exacerbate the condition. Something with a warehouse size of
> maybe 150. I would expect it to fail the TPCC constraints either way but what
> would be interesting to know is whether it fails by a larger margin with the
> LDC behaviour or a smaller margin.

I'll do that as well, though experiences with tests like that in the
past have been that it's hard to get repeatable results that way.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com