Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)
Msg-id CAH2-Wz=TKjMa9GGNXJg_sz-meYiNA2giMjmwNw5jw5LryO3Qsg@mail.gmail.com
In response to Re: PoC: prefetching data between executor nodes (e.g. nestloop + indexscan)  (Tomas Vondra <tomas@vondra.me>)
List pgsql-hackers
On Tue, Aug 27, 2024 at 6:44 PM Tomas Vondra <tomas@vondra.me> wrote:
> > One reason to do it this way is because it cuts down on index descent
> > costs, and other executor overheads. But it is likely that it will
> > also make prefetching itself more effective, too -- just because
> > prefetching will naturally end up with fewer, larger batches of
> > logically related work.
> >
>
> Perhaps.

I expect this to be particularly effective whenever there is naturally
occurring locality. I think that's fairly common. We'll sort the SAOP
array on the nbtree side, as we always do.

> So nestloop would pass down multiple values, the inner subplan
> would do whatever it wants (including prefetching), and then return the
> matching rows, somehow?

Right.

> It's not very clear to me how would we return
> the tuples for many matches, but it seems to shift the prefetching
> closer to the "normal" index prefetching discussed elsewhere.

It'll be necessary to keep track of which outer-side rows relate to
which inner-side array values (within a given batch/block). Some new
data structure will be needed to manage that bookkeeping.
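
To make that concrete, here is a rough sketch of one possible shape for
such a structure. It is plain Python rather than executor code, and
build_batch_map and its arguments are hypothetical names invented for
illustration, not anything that exists in PostgreSQL. It groups the
outer rows of one batch by join key and derives the sorted,
duplicate-free array that would be handed to the inner scan.

```python
from collections import defaultdict


def build_batch_map(outer_rows, key_of):
    """Group one batch of outer rows by join key (hypothetical helper).

    Returns (array_keys, key_to_outer), where key_to_outer maps each
    distinct key to the positions of the outer rows that carry it.
    """
    key_to_outer = defaultdict(list)
    for pos, row in enumerate(outer_rows):
        key_to_outer[key_of(row)].append(pos)
    # Sorted, duplicate-free keys stand in for the array given to the inner scan.
    array_keys = sorted(key_to_outer)
    return array_keys, dict(key_to_outer)


# Outer rows are (name, join_key) pairs; two of them share key 7.
batch = [("a", 7), ("b", 3), ("c", 7), ("d", 5)]
array_keys, key_to_outer = build_batch_map(batch, key_of=lambda r: r[1])
```

Here array_keys has three entries for four outer rows, and key_to_outer
remembers that outer positions 0 and 2 both map to key 7.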

Currently, we deduplicate arrays for SAOP scans. I suppose that it
works that way because it's not really clear what it would mean for
the scan to have duplicate array keys. I don't see any need to change
that for block nested loop join/whatever this is. We would have to use
the new data structure to "pair up" outer side tuples with their
associated inner side result sets, at the end of processing each
batch/block. That way we avoid repeating the same inner index scan
within a given block/batch -- a little like with a memoize node.
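
A toy version of that flow might look like the following. This is only a
sketch under loud assumptions: a Python dict stands in for the inner
index, one dict lookup stands in for one inner index scan, and
batched_nestloop is a made-up name. It deduplicates and sorts the keys
of a batch, probes once per distinct key, and pairs results back up with
the outer rows at the end of the batch.

```python
def batched_nestloop(outer_batch, inner_index, key_of):
    """Join one batch of outer rows against a dict standing in for an index."""
    # Deduplicate and sort the join keys, as an SAOP array would be.
    array_keys = sorted({key_of(row) for row in outer_batch})

    # One inner lookup per distinct key, however many outer rows share it.
    results = {}
    probes = 0
    for key in array_keys:
        results[key] = inner_index.get(key, [])
        probes += 1

    # End of batch: pair each outer row with its inner result set,
    # preserving the original outer order.
    joined = [
        (row, inner)
        for row in outer_batch
        for inner in results[key_of(row)]
    ]
    return joined, probes


inner_index = {7: ["x", "y"], 3: ["z"]}
outer_batch = [("a", 7), ("b", 3), ("c", 7), ("d", 5)]
joined, probes = batched_nestloop(outer_batch, inner_index, key_of=lambda r: r[1])
```

With four outer rows and three distinct keys, the sketch does three
probes instead of four; the two outer rows with key 7 share a single
inner result set.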

Obviously, that's the case where we can exploit naturally occurring
locality most effectively -- the case where multiple duplicate inner
index scans are literally combined into only one. But, as I already
touched on, locality will be important in a variety of cases, not just
this one.

--
Peter Geoghegan


