On 06/11/2023 19:05, Tomas Vondra wrote:
> As for the path forward, I think the prefetching is demonstrably
> beneficial. There are cases where it can't help or may even harm
> performance. I think success depends on three areas:
>
> (a) reducing the costs of prefetching - For example, right now we
> build the index tuples twice (once for the prefetch, once for the
> insert), but maybe there's a way to do that only once? There are also
> predicate indexes, and so on.
>
> (b) being smarter about when to prefetch - For example, if we only
> have one "prefetchable" index, it's somewhat pointless to prefetch
> (for single-row cases). And so on.
>
> (c) not prefetching when already cached - This is somewhat related to
> the previous case, but perhaps it'd be cheaper to first check whether
> the data is already cached. For shared buffers that should not be
> difficult; for the page cache we could use preadv2 with the RWF_NOWAIT
> flag. The question is whether this is cheap enough to be cheaper than
> just doing posix_fadvise (which however only deals with the page cache).
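
(As an aside, the RWF_NOWAIT probe mentioned in (c) could look roughly
like the sketch below. This is only my illustration of the idea, not
code from the patch, and a real version would need to handle partial
reads and errors more carefully.)

#define _GNU_SOURCE
#include <sys/uio.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Return 1 if the given range of the file can be served from the OS
 * page cache, 0 if reading it would block on I/O, -1 on error.
 * Illustrative helper only.
 */
static int
range_is_cached(int fd, off_t offset, size_t len)
{
    char       *buf = malloc(len);
    struct iovec iov = { .iov_base = buf, .iov_len = len };
    ssize_t     nread;

    if (buf == NULL)
        return -1;

    /* RWF_NOWAIT makes the read fail with EAGAIN instead of waiting */
    nread = preadv2(fd, &iov, 1, offset, RWF_NOWAIT);
    free(buf);

    if (nread >= 0)
        return 1;   /* the data (or at least a prefix of it) was cached */
    if (errno == EAGAIN)
        return 0;   /* not cached; a normal read would block on I/O */
    return -1;
}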

I don't like the approach the patch takes. It duplicates the
tree-descend code, and it also duplicates the work of descending the
tree at runtime. And it only addresses index insertion; there are a lot
of places that could benefit from prefetching or async execution like
this.

I think we should treat this as async execution rather than
prefetching. We don't have the general infrastructure for writing async
code, but if we did, this would be much simpler. In an async
programming model, as in many other languages like Rust, Python or
JavaScript, there would be no separate prefetching function. Instead,
aminsert() would return a future that can pause execution if it needs
to do I/O. Something like this:

aminsert_futures = NIL;

/* create a future for each index insert */
for (<all indexes>)
{
    aminsert_futures = lappend(aminsert_futures, aminsert(...));
}

/* wait for all the futures to finish */
await aminsert_futures;

The async-aware aminsert function would run to completion quickly if
all the pages are already in cache. On a cache miss, it would start an
async I/O read for the page and yield to the other insertions until the
I/O completes.
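
To make that a bit more concrete, the callee side could be shaped
roughly like the sketch below. Every type and helper in it is invented
for illustration; the point is just the resumable state machine that
yields whenever the next page would need I/O:

typedef enum { INSERT_DONE, INSERT_WOULDBLOCK } AsyncInsertStatus;

/*
 * One step of a hypothetical async-aware btree insert. The "future"
 * keeps the descent state, so the function can be re-entered after the
 * I/O it started has completed. choose_next_block(), try_read_cached(),
 * start_async_read() and insert_on_page() are invented helpers.
 */
static AsyncInsertStatus
btinsert_step(BtInsertFuture *future)
{
    for (;;)
    {
        BlockNumber blkno = choose_next_block(future);  /* tree descent */
        Buffer      buf;

        if (!try_read_cached(future->indexrel, blkno, &buf))
        {
            /* not in shared buffers: start the read and yield */
            start_async_read(future->indexrel, blkno);
            return INSERT_WOULDBLOCK;
        }

        /* reached the target leaf page: do the actual insertion */
        if (insert_on_page(future, buf))
            return INSERT_DONE;
    }
}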

We already support async execution of FDWs, with the
ForeignAsyncRequest() and ForeignAsyncConfigureWait() callbacks. Can we
generalize that?
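
Just to illustrate the shape that could take for index AMs, something
along these lines, with every name below invented rather than proposed:

typedef struct AmAsyncRequest AmAsyncRequest;  /* counterpart of AsyncRequest */

/* start (or continue) an asynchronous index insertion */
typedef void (*amasyncrequest_function) (AmAsyncRequest *areq);

/* add whatever the request is waiting on to the caller's wait event set */
typedef void (*amasyncconfigurewait_function) (AmAsyncRequest *areq);

/* called when the awaited event has fired, to resume the insertion */
typedef void (*amasyncnotify_function) (AmAsyncRequest *areq);
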
--
Heikki Linnakangas
Neon (https://neon.tech)