Re: posix_fadvise() and pg_receivexlog - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: posix_fadvise() and pg_receivexlog
Date
Msg-id 53E33297.2020706@vmware.com
In response to Re: posix_fadvise() and pg_receivexlog  (Mitsumasa KONDO <kondo.mitsumasa@gmail.com>)
List pgsql-hackers
On 08/07/2014 10:10 AM, Mitsumasa KONDO wrote:
> 2014-08-07 13:47 GMT+09:00 Fujii Masao <masao.fujii@gmail.com>:
>
>> On Thu, Aug 7, 2014 at 3:59 AM, Heikki Linnakangas
>> <hlinnakangas@vmware.com> wrote:
>>> On 08/06/2014 08:39 PM, Fujii Masao wrote:
>>>> The WAL files that pg_receivexlog writes will generally not be re-read
>>>> soon, so we can advise the OS to release any cached pages when a WAL
>>>> file is closed. I feel inclined to change pg_receivexlog that way.
>>>> Thoughts?
>>>
>>>
>>> -1. The OS should be smart enough to not thrash the cache by files that
>>> are written sequentially and never read.
>>
> The OS's buffer strategy is optimized for the general case. Have you
> forgotten the OS hackers' discussion over the last half year?
>
>> Yep, the OS should be so smart, but I'm not sure if it actually is. Maybe
>> not, so I was thinking that posix_fadvise is called when the server
>> closes WAL file.
>
> That's right.

Well, I'd like to hear someone from the field complaining that
pg_receivexlog is thrashing the cache and thus reducing the performance
of some other process. Or at least a synthetic test case that
demonstrates that happening.
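
For concreteness, the change proposed upthread would amount to something like the sketch below. This is only an illustration, not pg_receivexlog code: the helper name close_segment_drop_cache() is made up, and it assumes a platform that provides posix_fadvise(). The file is flushed first on the assumption that the hint only helps for pages that are already clean.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not pg_receivexlog code: flush a finished WAL
 * segment, hint that its cached pages are no longer needed, and close it.
 * Returns 0 on success, -1 if fsync() or close() fails.
 */
int
close_segment_drop_cache(int fd)
{
    int         rc = 0;

    assert(fd >= 0);

    /* Flush first, so the pages are clean before the hint is given. */
    if (fsync(fd) != 0)
        rc = -1;

    /*
     * offset = 0 and len = 0 cover the whole file.  posix_fadvise() returns
     * an error number directly; it is only a hint, so a failure is ignored.
     */
    (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);

    if (close(fd) != 0)
        rc = -1;
    return rc;
}
```

Whether this actually helps any other process is exactly what a test case would have to show.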

> By the way, does the pg_receivexlog process call fsync() on every WAL commit?

It fsyncs each file once it has finished writing it, i.e. each WAL file is
fsync'd once.
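
Roughly, the pattern is "write the whole file, then fsync once". A minimal sketch of that pattern, with a made-up function name and a plain buffer standing in for the WAL stream (this is not the actual pg_receivexlog code):

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes of data to the file at path and
 * fsync() it exactly once, after the last write.
 * Returns 0 on success, -1 on error.
 */
int
write_segment_fsync_once(const char *path, const char *data, size_t len)
{
    size_t      done = 0;
    int         fd;

    assert(path != NULL && data != NULL);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* write() may be partial or interrupted, so loop until all is written. */
    while (done < len)
    {
        ssize_t     n = write(fd, data + done, len - done);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;
            close(fd);
            return -1;
        }
        done += (size_t) n;
    }

    /* The single fsync for this file. */
    if (fsync(fd) != 0)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}
```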

> If yes, I think we need an option to fsync() less often, or not at all,
> for better performance. That is common in NoSQL stores.
> If no, we need an fsync() option for better reliability and data
> integrity.

Hmm. An fsync=off style option might make sense, although I doubt the 
one fsync at end of file is causing a performance problem for anyone in 
practice. Haven't heard any complaints, anyway.

An option to fsync after every commit record might make sense if you use 
pg_receivexlog with synchronous replication. Doing that would require 
parsing the WAL, though, to see where the commit records are. But then 
again, the fsync's wouldn't need to correspond to commit records. We 
could fsync just before we go to sleep to wait for more WAL to be received.
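
To illustrate the "fsync just before we go to sleep" idea, here is a rough sketch of such a loop. It is not pg_receivexlog code; the function name is invented, a plain file descriptor stands in for the replication connection, and poll() with a zero timeout is just one way to ask "is more data available right now?".

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical receive loop: copy bytes from in_fd into the file at
 * out_path until EOF, calling fsync() only when no more input is
 * immediately available, i.e. just before blocking.
 * Returns the number of fsync() calls made, or -1 on error.
 */
int
receive_fsync_before_sleep(int in_fd, const char *out_path)
{
    char        buf[8192];
    bool        dirty = false;  /* data written since the last fsync? */
    int         nsyncs = 0;
    int         out_fd;

    assert(out_path != NULL);

    out_fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out_fd < 0)
        return -1;

    for (;;)
    {
        struct pollfd pfd = {.fd = in_fd, .events = POLLIN};
        ssize_t     n;
        ssize_t     off = 0;
        int         ready = poll(&pfd, 1, 0);   /* zero timeout: no blocking */

        if (ready < 0)
            goto fail;          /* sketch: any poll() error is fatal */
        if (ready == 0)
        {
            /* Nothing pending: flush what we have, then sleep for input. */
            if (dirty)
            {
                if (fsync(out_fd) != 0)
                    goto fail;
                dirty = false;
                nsyncs++;
            }
            if (poll(&pfd, 1, -1) < 0)
                goto fail;
        }

        n = read(in_fd, buf, sizeof(buf));
        if (n < 0)
            goto fail;
        if (n == 0)
            break;              /* EOF */

        while (off < n)
        {
            ssize_t     w = write(out_fd, buf + off, (size_t) (n - off));

            if (w < 0)
                goto fail;
            off += w;
        }
        dirty = true;
    }

    /* Flush whatever is left at end of stream. */
    if (dirty)
    {
        if (fsync(out_fd) != 0)
            goto fail;
        nsyncs++;
    }
    return close(out_fd) == 0 ? nsyncs : -1;

fail:
    close(out_fd);
    return -1;
}
```

With this shape no WAL parsing is needed: the number of fsyncs follows how often the receiver runs out of input, not where the commit records are.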

- Heikki



