Re: checkpointer continuous flushing - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: checkpointer continuous flushing
Msg-id alpine.DEB.2.10.1601140956310.28426@sto
In response to Re: checkpointer continuous flushing  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hello Andres,

>> Argh! This is a key point: the sort/flush is designed to help HDDs, and
>> would have limited effect on SSDs, and it seems that you are showing that
>> the effect is in fact negative on SSDs, too bad:-(
>
> As you quoted, I could reproduce the slowdown both with SSDs *and* with
> rotating disks.

Ok, once again I misunderstood. So you have a regression on HDD with the
settings you pointed out; I can try that.

>> On SSDs, the Linux IO scheduler works quite well, so this is a place where I
>> would consider simply disabling flushing and/or sorting.
>
> Not my experience. In different scenarios, primarily with a large
> shared_buffers fitting the whole hot working set, the patch
> significantly improves performance.

Good! That would be what I expected, but I have no way to test that.

>>> postgres-ckpt14 \
>>>       -D /srv/temp/pgdev-dev-800/ \
>>>       -c maintenance_work_mem=2GB \
>>>       -c fsync=on \
>>>       -c synchronous_commit=off \
>>
>> I'm not sure I like this one. I guess the intention is to focus on
>> checkpointer writes and reduce the impact of WAL writes. Why not.
>
> Not sure what you mean? s_c = off is *very* frequent in the field.

Too bad, because for me it really disables the D of ACID...

I think that this setting would not issue the "sync" calls on the WAL
file, which means that the impact of WAL writing is somewhat reduced and
random writes (more or less one for each transaction) are turned into
sequential writes by the IO scheduler.
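
The trade-off described above can be shown with a toy model. This is only a sketch and not PostgreSQL's WAL code: the function `append_records` and its `sync_each` parameter are hypothetical names, and a single deferred flush per batch is an assumption made for illustration. It counts how many `fsync` calls are needed when every commit record is synced, compared with when syncing is deferred.

```python
import os
import tempfile


def append_records(path, records, sync_each):
    """Append records to a log file and return the number of fsync calls made."""
    fsyncs = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        for rec in records:
            os.write(fd, rec)
            if sync_each:
                os.fsync(fd)  # wait until this record reaches stable storage
                fsyncs += 1
        if not sync_each:
            os.fsync(fd)  # assumption: one deferred flush for the whole batch
            fsyncs += 1
    finally:
        os.close(fd)
    return fsyncs


records = [b"commit %d\n" % i for i in range(100)]
with tempfile.TemporaryDirectory() as tmp:
    n_sync = append_records(os.path.join(tmp, "sync.log"), records, True)
    n_deferred = append_records(os.path.join(tmp, "deferred.log"), records, False)

print(n_sync, n_deferred)
```

With deferred syncing, records written since the last flush can be lost in a crash, which is the durability concern raised above.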

>>> My laptop 1 EVO 840, 1 i7-4800MQ, 16GB ram:
>>> master:
>>> scaling factor: 800
>>
>> The DB is probably about 12GB, so it fits in memory in the end, meaning that
>> there should be only write activity after some time? So this is not really
>> the case where it does not fit in memory, but it is large enough to get
>> mostly random IOs both in read & write, so why not.
>
> Doesn't really fit into ram - shared buffers uses some space (which will
> be double buffered) and the xlog will use some more.

Hmmm. My understanding is that you are really using about 6GB of shared 
buffer data in a run, plus some write only stuff...

xlog is flushed/synced constantly and never read again, I would be surprised
if it had a significant memory impact.
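
As a rough check on the sizes quoted above, here is a small sketch. It assumes about 15 MB of data per pgbench scale unit, which is a common rule of thumb and not a figure from this thread. The helper names and the 6 GB `shared_buffers` value are also illustrative assumptions (the latter taken from the "about 6GB" guess above); the worst case modelled is that every page in shared buffers is also held in the OS page cache.

```python
MB_PER_SCALE_UNIT = 15.0  # assumption: rough rule of thumb, not from this thread


def estimate_pgbench_db_gb(scale, mb_per_scale_unit=MB_PER_SCALE_UNIT):
    """Approximate pgbench database size in GB for a given scale factor."""
    return scale * mb_per_scale_unit / 1000.0


def fits_in_ram(db_gb, ram_gb, shared_buffers_gb=0.0):
    """Worst case: pages in shared buffers are also cached by the OS
    (double buffering), so that much RAM is not available to the page cache."""
    return db_gb <= ram_gb - shared_buffers_gb


db_gb = estimate_pgbench_db_gb(800)
print(db_gb)                                              # about 12 GB
print(fits_in_ram(db_gb, 16.0))                           # ignoring double buffering
print(fits_in_ram(db_gb, 16.0, shared_buffers_gb=6.0))    # with a hypothetical 6 GB
```

Under these assumptions the data set fits in 16 GB when double buffering is ignored, and no longer fits once a few GB of shared buffers are counted twice, which matches both sides of the exchange above.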

>>> ckpt-14 (flushing by backends disabled):
>>
>> Is this comment referring to "synchronous_commit = off"?
>> I guess this is the same on master above, even if not written?
>
> No, what I mean by that is that I didn't activate flushing writes in
> backends -

I'm not sure that I understand. What is the actual corresponding directive 
in the configuration file?

>>> As you can see there's roughly a 30% performance regression on the
>>> slower SSD and a ~9% on the faster one. HDD results are similar (but I
>>> can't repeat on the laptop right now since the 2nd hdd is now an SSD).
>>
>> Ok, that is what I would have expected, the larger the database, the smaller
>> the impact of sorting & flushing on SSDs.
>
> Again: "HDD results are similar". I primarily tested on a raid10
> of 4 disks, and a raid0 of 20 disks.

I guess similar but with a much lower tps. Anyway I can try that.

-- 
Fabien.


