>>> synchronous_commit = off does make a significant difference.
>>
>> Sure, but I had thought about that and kept this one...
>
> But why are you then saying this is fundamentally limited to 160
> xacts/sec?
I'm just saying that the tested load generates mostly random IOs (probably
over 1 page per transaction on average), and random IOs are very slow on an
HDD, so I do not expect a great tps.
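A back-of-envelope estimate of why an HDD caps this load in the low hundreds of tps; the seek time and pages-per-transaction figures below are assumed typical values, not measurements from the thread:

```python
# Rough estimate of random-IO-bound tps on a spinning disk.
# Assumed figures (not from the thread): a 7200 rpm HDD needs
# ~5-10 ms per random access (seek + rotational latency).
seek_ms = 6.0                      # assumed average random access time
ios_per_sec = 1000.0 / seek_ms     # ~167 random IOs per second
pages_per_xact = 1.2               # "over 1 page per transaction on average"
tps = ios_per_sec / pages_per_xact
print(round(tps))                  # lands in the low hundreds
```

With these assumptions the disk alone bounds throughput around 140-170 xacts/sec, which is consistent with the ~160 xacts/sec discussed upthread.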
>> I think I found one possible culprit: I automatically wrote 300 seconds for
>> checkpoint_timeout, instead of 30 seconds in your settings. I'll have to
>> rerun the tests with this (unreasonable) figure to check whether I really
>> get a regression.
>
> I've not seen meaningful changes in the size of the regression between 30/300s.
At 300 seconds (5 minutes) the checkpoint of the accumulated writes takes
15-25 minutes, during which the database is mostly offline, and there is no
clear difference with/without sort+flush.
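For reference, the settings under discussion as a postgresql.conf fragment (values taken from the thread; everything else at defaults):

```
checkpoint_timeout = 30s     # value from the original settings
# checkpoint_timeout = 300s  # value accidentally tested instead
synchronous_commit = off     # makes a significant difference here
```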
>> Other tests I ran with "reasonable" settings on a large (scale=800) db did
>> not show any significant performance regression, up to now.
>
> Try running it so that the data set nearly, but not entirely, fits into
> the OS page cache, while definitely not fitting into shared_buffers.
> scale=800 just worked for that on my hardware, no idea how it is on yours.
> That seems to be the point where the effect is the worst.
I have 16GB of memory on the tested host, the same as your hardware I
think, so I use scale 800 => 12GB at the beginning of the run. I'm not sure
it fits the bill, as I think it fits in memory, so the load is mostly
writes with no/very few reads. I'll also try with scale 1000.
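A quick sizing check for the two scales against 16GB of RAM; the ~15 MB-per-scale-unit figure is a rough rule of thumb for a freshly initialized pgbench database (the accounts table dominates), not an exact number:

```python
# Rough pgbench database size per scale unit at the start of a run.
# Assumed rule of thumb: ~15 MB per scale unit (varies with fillfactor etc.).
mb_per_scale = 15
ram_gb = 16
for scale in (800, 1000):
    size_gb = scale * mb_per_scale / 1024.0
    print(f"scale={scale}: ~{size_gb:.1f} GB (RAM {ram_gb} GB)")
```

Scale 800 comes out near 12 GB, matching the figure above and fitting in the page cache; scale 1000 pushes the data set close to total RAM, which is the regime Andres suggests testing.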
--
Fabien.