Re: WALWriter active during recovery - Mailing list pgsql-hackers

From didier
Subject Re: WALWriter active during recovery
Date
Msg-id CAJRYxuJFZ_a8NF0wOwu08JeMFndC-_=3xEN-v5w-qE=gQXnojg@mail.gmail.com
In response to Re: WALWriter active during recovery  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: WALWriter active during recovery  (Simon Riggs <simon@2ndQuadrant.com>)
Re: WALWriter active during recovery  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
Hi,

On Tue, Dec 16, 2014 at 6:07 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 16 December 2014 at 14:12, Heikki Linnakangas
> <hlinnakangas@vmware.com> wrote:
>> On 12/15/2014 08:51 PM, Simon Riggs wrote:
>>>
>>> Currently, WALReceiver writes and fsyncs data it receives. Clearly,
>>> while we are waiting for an fsync we aren't doing any other useful
>>> work.
>>>
>>> Following patch starts WALWriter during recovery and makes it
>>> responsible for fsyncing data, allowing WALReceiver to progress other
>>> useful actions.
On many Linux systems it may not help that much (kernels 2.6.32 and 3.2 are bad;
3.13 is better, but concurrent writes still slow the fsync).

If an fsync is in progress, WALReceiver will:
1- slow the fsync down, because its writes to the same file get picked up by the fsync
2- stall until the fsync finishes.

From strace'ing a test program that simulates this pattern
(two processes: one writes to a file, the second fsyncs it):

20279 11:51:24.037108 fsync(5 <unfinished ...>
20278 11:51:24.053524 <... nanosleep resumed> NULL) = 0 <0.020281>
20278 11:51:24.053691 lseek(3, 1383612416, SEEK_SET) = 1383612416 <0.000119>
20278 11:51:24.053965 write(3, "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8192) = 8192 <0.000111>
20278 11:51:24.054190 nanosleep({0, 20000000}, NULL) = 0 <0.020243>
....
20278 11:51:24.404386 lseek(3, 194772992, SEEK_SET <unfinished ...>
20279 11:51:24.754123 <... fsync resumed> ) = 0 <0.716971>
20279 11:51:24.754202 close(5 <unfinished ...>
20278 11:51:24.754232 <... lseek resumed> ) = 194772992 <0.349825>

Yes, that's a ~350ms lseek...
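
For anyone who wants to try this themselves, here is a rough sketch of the pattern traced above. It is not the test program that produced the trace, just an approximation under a few assumptions: the names writer, syncer and run_demo are made up for illustration, the file is a small temporary file, and the 8192-byte writes and 20ms pause are simply copied from the strace output. Whether a stall shows up will depend on the kernel, the filesystem and the storage.

```python
import os
import random
import tempfile
import time

BLOCK = 8192  # same write size as in the trace (assumption: any size would do)


def writer(path, file_size, deadline, pause_s=0.02):
    """Write blocks at random offsets; return the slowest lseek+write pair in seconds."""
    fd = os.open(path, os.O_WRONLY)
    buf = b"a" * BLOCK
    worst = 0.0
    try:
        while time.monotonic() < deadline:
            offset = random.randrange(file_size // BLOCK) * BLOCK
            start = time.monotonic()
            os.lseek(fd, offset, os.SEEK_SET)
            os.write(fd, buf)
            worst = max(worst, time.monotonic() - start)
            time.sleep(pause_s)  # mimics the nanosleep({0, 20000000}) in the trace
    finally:
        os.close(fd)
    return worst


def syncer(path, deadline):
    """Repeatedly fsync the same file through a separate descriptor."""
    fd = os.open(path, os.O_WRONLY)
    try:
        while time.monotonic() < deadline:
            os.fsync(fd)
    finally:
        os.close(fd)


def run_demo(duration_s=0.5, file_size=64 * BLOCK):
    """Fork a syncer child, run the writer in the parent, return the worst latency."""
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, file_size)
        os.close(fd)
        deadline = time.monotonic() + duration_s
        pid = os.fork()
        if pid == 0:
            # Child process: only fsyncs, then exits without running parent cleanup.
            try:
                syncer(path, deadline)
            finally:
                os._exit(0)
        worst = writer(path, file_size, deadline)
        os.waitpid(pid, 0)
        return worst
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print("slowest lseek+write: %.6f s" % run_demo())
```

To get closer to the numbers above you would probably need a much larger file (the offsets in the trace go past 1GB), a longer run, and strace -T on both processes to see which individual syscall blocks.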

>>
>>
>> What other useful actions can WAL receiver do while it's waiting? It doesn't
>> do much else than receive WAL, and fsync it to disk.
>
> So now it will only need to do one of those two things.
>

Regards
Didier


