Re: WAL prefetch - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: WAL prefetch
Date
Msg-id 19b3c454-ca4c-bda5-6521-2f893f4451a9@postgrespro.ru
In response to Re: WAL prefetch  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: WAL prefetch
List pgsql-hackers

On 22.06.2018 11:35, Konstantin Knizhnik wrote:
>
>
> On 21.06.2018 19:57, Tomas Vondra wrote:
>>
>>
>> On 06/21/2018 04:01 PM, Konstantin Knizhnik wrote:
>>> I continue my experiments with WAL prefetch.
>>> I have embedded prefetch in Postgres: now walprefetcher is started 
>>> together with startup process and is able to help it to speedup 
>>> recovery.
>>> The patch is attached.
>>>
>>> Unfortunately result is negative (at least at my desktop: SSD, 16Gb 
>>> RAM). Recovery with prefetch is 3 times slower than without it.
>>> What I am doing:
>>>
>>> Configuration:
>>>      max_wal_size=min_wal_size=10Gb,
>>>      shared_buffers = 1Gb
>>> Database:
>>>       pgbench -i -s 1000
>>> Test:
>>>       pgbench -c 10 -M prepared -N -T 100 -P 1
>>>       pkill postgres
>>>       echo 3 > /proc/sys/vm/drop_caches
>>>       time pg_ctl -t 1000 -D pgsql -l logfile start
>>>
>>> Without prefetch it is 19 seconds (recovered about 4Gb of WAL), with 
>>> prefetch it is about one minute. About 400k blocks are prefetched.
>>> CPU usage is small (<20%), both processes are in "Ds" state.
>>>
>>
>> Based on a quick test, my guess is that the patch is broken in 
>> several ways. Firstly, with the patch attached (and 
>> wal_prefetch_enabled=on, which I think is needed to enable the 
>> prefetch) I can't even restart the server, because pg_ctl restart 
>> just hangs (the walprefetcher process gets stuck in WaitForWAL, IIRC).
>>
>> I have added an elog(LOG,...) to walprefetcher.c, right before the 
>> FilePrefetch call, and (a) I don't see any actual prefetch calls 
>> during recovery but (b) I do see the prefetch happening during the 
>> pgbench. That seems a bit ... wrong?
>>
>> Furthermore, you've added an extra
>>
>>     signal_child(BgWriterPID, SIGHUP);
>>
>> to SIGHUP_handler, which seems like a bug too. I don't have time to 
>> investigate/debug this further.
>>
>> regards
>
> Sorry, updated version of the patch is attached.
> Please also notice that you can check number of prefetched pages using 
> pg_stat_activity() - it is reported for walprefetcher process.
> Concerning the fact that you do not see prefetches at recovery time: 
> please check that min_wal_size and max_wal_size are large enough and 
> pgbench (or whatever else)
> committed large enough changes so that recovery will take some time.
>
>

I have improved my WAL prefetch patch. The main reason recovery was 
slower with prefetch enabled was that the prefetcher did not take into 
account initialized pages (XLOG_HEAP_INIT_PAGE) and did not remember 
(cache) full-page writes.
The main differences in the new version of the patch:

1. Use effective_cache_size as the size of the cache of prefetched blocks
2. Do not prefetch blocks already present in shared buffers
3. Do not prefetch blocks for RM_HEAP_ID records with the 
XLOG_HEAP_INIT_PAGE bit set
4. Remember new/FPW pages in the prefetch cache, to avoid prefetching 
them for subsequent WAL records.
5. Add min/max prefetch lead parameters to make it possible to 
synchronize speed of prefetch with speed of replay.
6. Increase size of open file cache to avoid redundant open/close 
operations.
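
To make the filtering rules in items 1-4 concrete, here is a minimal 
sketch of the decision "should this block reference be prefetched?". 
It is not code from the patch: BlockId, BlockRef, PrefetchCache, 
CACHE_SLOTS and should_prefetch are hypothetical names, the cache is a 
simple direct-mapped table rather than anything sized from 
effective_cache_size, and the shared-buffers lookup is reduced to a 
flag supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical block identifier; a real buffer tag carries more fields. */
typedef struct {
    uint32_t rel;    /* relation identifier */
    uint32_t block;  /* block number within the relation */
} BlockId;

/* Assumption: fixed size here; the patch sizes its cache from
 * effective_cache_size. */
#define CACHE_SLOTS 1024

/* Direct-mapped cache of blocks that need no (further) prefetch. */
typedef struct {
    BlockId slots[CACHE_SLOTS];
    bool    used[CACHE_SLOTS];
} PrefetchCache;

/* Hypothetical summary of one block reference found in a WAL record. */
typedef struct {
    BlockId id;
    bool    init_page;          /* record has XLOG_HEAP_INIT_PAGE set */
    bool    has_fpw;            /* record carries a full-page image */
    bool    in_shared_buffers;  /* block is already in shared buffers */
} BlockRef;

static uint32_t slot_for(BlockId id)
{
    /* Multiplicative hash of the relation id mixed with the block number. */
    return ((id.rel * 2654435761u) ^ id.block) % CACHE_SLOTS;
}

static void cache_init(PrefetchCache *c)
{
    memset(c, 0, sizeof *c);
}

static bool cache_contains(const PrefetchCache *c, BlockId id)
{
    uint32_t s = slot_for(id);
    return c->used[s] && c->slots[s].rel == id.rel
                      && c->slots[s].block == id.block;
}

static void cache_remember(PrefetchCache *c, BlockId id)
{
    uint32_t s = slot_for(id);   /* a colliding entry is simply overwritten */
    c->slots[s] = id;
    c->used[s] = true;
}

/* Returns true if the caller should issue a prefetch for this block. */
static bool should_prefetch(PrefetchCache *c, const BlockRef *ref)
{
    assert(c != NULL && ref != NULL);

    /* Items 3 and 4: replay rebuilds the page without reading the old
     * contents, so skip the prefetch and remember the block so that
     * later records touching it are skipped too. */
    if (ref->init_page || ref->has_fpw) {
        cache_remember(c, ref->id);
        return false;
    }
    /* Item 2: nothing to fetch if the block is already in shared buffers. */
    if (ref->in_shared_buffers)
        return false;
    /* Item 1: block was prefetched (or initialized) recently. */
    if (cache_contains(c, ref->id))
        return false;

    cache_remember(c, ref->id);
    return true;
}
```

With this sketch, the first plain reference to a block returns true and 
a repeated reference returns false; a reference that follows an 
init-page or full-page-image record for the same block also returns 
false, which is the case the previous version of the patch missed.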






-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Attachment
