Hi,
On 11/4/20 5:02 PM, Stephen Frost wrote:
> Greetings,
>
> * Tomas Vondra (tomas.vondra@2ndquadrant.com) wrote:
>>> If you highlight "738754560" in the output it appears to duplicate the
>>> syscalls issued until it preads() - in case of "738754560" offset it was
>>> asked for 3 times. Also I wouldn't imagine in wildest dreams that
>>> posix_fadvise(POSIX_FADV_WILLNEED) is such a cheap syscall.
>>
>> IMHO that'a a bug in the patch, which always tries to prefetch all
>> "future" blocks, including those that were already prefetched. It
>> probably needs to do something like bitmap heap scan where we track
>> what was already prefetched and only issue the new blocks.
>
> Updated patch attached which:
>
> - Starts out by pre-fetching the first effective_io_concurrency number
> of blocks we are going to want, hopefully making it so the kernel will
> trust our fadvise's over its own read-ahead, right from the start.
> - Makes sure the prefetch iterator is pushed forward whenever the
> regular interator is moved forward.
> - After each page read, issues a prefetch, similar to BitmapHeapScan, to
> hopefully avoiding having the prefetching get in the way of the
> regular i/o.
> - Added some comments, ran pgindent, added a commit message.
>
Nice, that was quick ;-)
> I do think we should also include patch that Jakub wrote previously
> which adds information about the read rate of ANALYZE.
>
+1
> I'll look at integrating that into this patch and then look at a new
> patch to do something similar for VACUUM in a bit.
>
+1
> If you're doing further benchmarking of ANALYZE though, this would
> probably be the better patch to use. Certainly improved performance
> here quite a bit with effective_io_concurrency set to 16.
>
Yeah. I'd expect this to be heavily dependent on hardware.
regards
--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services