On Tue, Mar 13, 2012 at 11:18 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Tue, Mar 13, 2012 at 6:44 PM, Josh Berkus <josh@agliodbs.com> wrote:
>>> That's a speedup of nearly a factor of two, so clearly fsync-related
>>> stalls are a big problem here, even with wal_buffers cranked up
>>> through the ceiling.
>>
>> Hmmmm. Do you have any ability to test on XFS?
>
> It seems I do.
>
> XFS, with fsync = on:
> tps = 14746.687499 (including connections establishing)
> XFS, with fsync = off:
> tps = 25121.876560 (including connections establishing)
>
> No real dramatic difference there compared with the ext4 numbers; maybe a bit slower, if anything.
>
> On further thought, it may be that this is just a simple case of too
> many checkpoints. With fsync=off, we don't have to actually write all
> that dirty data back to disk. I'm going to try cranking up
> checkpoint_segments and see what happens.
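(For reference, the fsync and checkpoint_segments variations between runs are just GUC changes; roughly like the sketch below, where $PGDATA, the exact values, and reload-vs-restart are illustrative rather than precisely what's on the test box.)

# Illustrative only: the two settings being varied between runs.
# Appending works because the last duplicate entry in postgresql.conf wins.
cat >> $PGDATA/postgresql.conf <<EOF
checkpoint_segments = 300   # or 3000
fsync = on                  # or off
EOF
pg_ctl -D $PGDATA reload    # both GUCs can take effect on reload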
OK, this is bizarre. I wiped out my XFS filesystem and put ext4 back,
and look at this:
tps = 19105.740878 (including connections establishing)
tps = 19687.674409 (including connections establishing)
That's a jump of nearly a third from before, and I'm not sure what
changed; nothing, as far as I know. I drop and recreate the database
after every test run, so I don't see why this should be so much better,
unless ext4 somehow degrades over time (even though the FS is nearly
empty and the old database gets dropped each time).
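(For what it's worth, each data point comes from something along these lines; the scale factor, client count, and run length below are placeholders, not necessarily the exact values used for the numbers above.)

dropdb pgbench 2>/dev/null || true    # fine if it doesn't exist yet
createdb pgbench
pgbench -i -s 100 pgbench             # rebuild the tables from scratch
pgbench -c 16 -j 16 -T 300 pgbench    # prints the tps lines quoted above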
Then I tried it with checkpoint_segments=3000 rather than 300.
tps = 26750.190469 (including connections establishing)
Hmm, what happens with checkpoint_segments=3000 and fsync=off?
tps = 30395.583366 (including connections establishing)
Hmm, and what if I set checkpoint_segments=300 and fsync=off?
tps = 26029.160919 (including connections establishing)
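(One way to check how often checkpoints are actually firing in each of these configurations, with the path again just illustrative:)

echo "log_checkpoints = on" >> $PGDATA/postgresql.conf
pg_ctl -D $PGDATA reload
# ...and compare timed vs. WAL-forced checkpoints before and after a run:
psql -c "SELECT checkpoints_timed, checkpoints_req FROM pg_stat_bgwriter;"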
Not sure what to make of all this yet.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company