Re: index prefetching - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: index prefetching
Date
Msg-id 1ec17146-6d46-4997-af53-41c959f18496@vondra.me
In response to Re: index prefetching  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers

On 7/16/25 17:29, Peter Geoghegan wrote:
> On Wed, Jul 16, 2025 at 4:40 AM Tomas Vondra <tomas@vondra.me> wrote:
>> For "uniform" data set, both prefetch patches do much better than master
>> (for low selectivities it's clearer in the log-scale chart). The
>> "complex" prefetch patch appears to have a bit of an edge for >1%
>> selectivities. I find this a bit surprising, the leaf pages have ~360
>> index items, so I wouldn't expect such impact due to not being able to
>> prefetch beyond the end of the current leaf page. But could be on
>> storage with higher latencies (this is the cloud SSD on azure).
> 
> How can you say that the "complex" patch has "a bit of an edge for >1%
> selectivities"?
> 
> It looks like a *massive* advantage on all "linear" test results.
> Those are only about 1/3 of all tests -- but if I'm not mistaken
> they're the *only* tests where prefetching could be expected to help a
> lot. The "cyclic" tests are adversarial/designed to make the patch
> look bad. The "uniform" tests have uniformly random heap accesses (I
> think), which can only be helped so much by prefetching.
> 
> For example, with "linear_10 / eic=16 / sync", it looks like "complex"
> has about half the latency of "simple" in tests where selectivity is
> 10. The advantage for "complex" is even greater at higher
> "selectivity" values. All of the other "linear" test results look
> about the same.
> 
> Have I missed something?
> 

That paragraph starts with "for uniform data set", and the statement
about 1% selectivities was only about that particular data set.

You're right there's a massive difference on all the "correlated" data
sets. I believe (assume) that's caused by the same issue, discussed in
this thread (where the simple patch seems to do fewer fadvise calls). I
only picked the "cyclic" data set as an example, representing this.

FWIW I suspect the difference on the "uniform" data set might be caused by
this too, because at ~5% selectivity the queries start to hit pages
multiple times (there are ~20 rows/page, hence ~5% means ~1 matching row
per page on average). But the effect is much weaker than on the correlated
data sets, of course.
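
To put rough numbers on the rows/page arithmetic, here is a minimal
sketch. It is not taken from the benchmark scripts; the function names are
made up for illustration, and it assumes each row on a page matches
independently with probability equal to the selectivity (a simple binomial
model, which only roughly fits the "uniform" data set).

```python
def expected_matches_per_page(rows_per_page: int, selectivity: float) -> float:
    """Average number of matching rows on one heap page."""
    return rows_per_page * selectivity


def prob_page_hit_repeatedly(rows_per_page: int, selectivity: float) -> float:
    """Probability that a page holds two or more matching rows.

    Assumption: rows match independently (binomial model), so
    P(X >= 2) = 1 - P(X = 0) - P(X = 1).
    """
    p_none = (1.0 - selectivity) ** rows_per_page
    p_one = rows_per_page * selectivity * (1.0 - selectivity) ** (rows_per_page - 1)
    return 1.0 - p_none - p_one


if __name__ == "__main__":
    rows_per_page = 20  # "~20 rows/page" from the message
    for selectivity in (0.01, 0.05, 0.10):
        avg = expected_matches_per_page(rows_per_page, selectivity)
        repeat = prob_page_hit_repeatedly(rows_per_page, selectivity)
        print(f"selectivity={selectivity:.2f}  avg rows/page={avg:.2f}  "
              f"P(page hit 2+ times)={repeat:.3f}")
```

Under that assumption, ~5% selectivity gives ~1 matching row per page on
average, and roughly a quarter of the pages would be hit more than once.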

regards

-- 
Tomas Vondra



