Re: automatic analyze: readahead - add "IO read time" log message - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: automatic analyze: readahead - add "IO read time" log message
Date
Msg-id ef69691a-8ba4-18af-6d0f-adbacb43967a@enterprisedb.com
Whole thread Raw
In response to Re: automatic analyze: readahead - add "IO read time" log message  (Stephen Frost <sfrost@snowman.net>)
Responses Re: automatic analyze: readahead - add "IO read time" log message
List pgsql-hackers
On 11/9/20 7:06 PM, Stephen Frost wrote:
> Greetings,
> 
> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>> On 11/4/20 5:02 PM, Stephen Frost wrote:
>>> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>>>>> If you highlight "738754560" in the output it appears to duplicate the
>>>>> syscalls issued until it preads() - in case of "738754560" offset it was
>>>>> asked for 3 times. Also I wouldn't  imagine in wildest dreams that
>>>>> posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall.
>>>>
>>>> IMHO that'a a bug in the patch, which always tries to prefetch all
>>>> "future" blocks, including those that were already prefetched. It
>>>> probably needs to do something like bitmap heap scan where we track
>>>> what was already prefetched and only issue the new blocks.
>>>
>>> Updated patch attached which:
>>>
>>> - Starts out by pre-fetching the first effective_io_concurrency number
>>>   of blocks we are going to want, hopefully making it so the kernel will
>>>   trust our fadvise's over its own read-ahead, right from the start.
>>> - Makes sure the prefetch iterator is pushed forward whenever the
>>>   regular interator is moved forward.
>>> - After each page read, issues a prefetch, similar to BitmapHeapScan, to
>>>   hopefully avoiding having the prefetching get in the way of the
>>>   regular i/o.
>>> - Added some comments, ran pgindent, added a commit message.
>>
>> Nice, that was quick ;-)
> 
> :)
> 
>>> I do think we should also include patch that Jakub wrote previously
>>> which adds information about the read rate of ANALYZE.
>>
>> +1
> 
> Attached is an updated patch which updates the documentation and
> integrates Jakub's initial work on improving the logging around
> auto-analyze (and I made the logging in auto-vacuum more-or-less match
> it).
> 

Thanks. I'll do some testing/benchmarking once my machines are free, in
a couple days perhaps. But as I said before, I don't expect this to
behave very differently from other places that already do prefetching.

>>> I'll look at integrating that into this patch and then look at a new
>>> patch to do something similar for VACUUM in a bit.
>>
>> +1
> 
> I spent some time looking into this but it's a bit complicated..  For
> some sound reasons, VACUUM will avoid skipping through a table when
> there's only a few pages that it could skip (not skipping allows us to
> move forward the relfrozenxid).  That said, perhaps we could start doing
> prefetching once we've decided that we're skipping.  We'd need to think
> about if we have to worry about the VM changing between the pre-fetching
> and the time when we're actually going to ask for the page..  I don't
> *think* that's an issue because only VACUUM would be changing the pages
> to be all-frozen or all-visible, and so if we see a page that isn't one
> of those then we're going to want to visit that page and that's not
> going to change, and we probably don't need to care about a page that
> used to be all-frozen and now isn't during this run- but if the prefetch
> went ahead and got page 10, and now page 8 is not-all-frozen and the
> actual scan is at page 5, then maybe it wants page 8 next and that isn't
> what we pre-fetched...
> 
> Anyhow, all-in-all, definitely more complicated and probably best
> considered and discussed independently>

+1

FWIW I wonder if this should be tracked separately in the CF app, as
it's very different from the original "add some logging" patch, which
makes the CF entry rather misleading.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: automatic analyze: readahead - add "IO read time" log message
Next
From: "David G. Johnston"
Date:
Subject: Re: Disable WAL logging to speed up data loading