Re: index prefetching - Mailing list pgsql-hackers
From | Andres Freund
Subject | Re: index prefetching
Date |
Msg-id | y32ow7vfdrrcchxrwgfr5tbwg3ncm4f3rlp65svcbqorbm3tmn@cwvipbplnqyt
In response to | Re: index prefetching (Tomas Vondra <tomas@vondra.me>)
List | pgsql-hackers
Hi,

On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote:
> Thanks. Based on the testing so far, the patch seems to be a substantial
> improvement. What's needed to make this prototype committable?

Mainly some testing infrastructure that can trigger this kind of stream. The
logic is too finicky for me to commit it without that.

> I assume this is PG19+ improvement, right? It probably affects PG18 too,
> but it's harder to hit / the impact is not as bad as on PG19.

Yea. It does apply to 18 too, but I can't come up with realistic scenarios
where it's a real issue. I can repro a slowdown when using many parallel
seqscans with debug_io_direct=data - but that's even slower in 17...

> On a related note, my test that generates random datasets / queries, and
> compares index prefetching with different io_method values found a
> pretty massive difference between worker and io_uring. I wonder if this
> might be some issue in io_method=worker.

> while with index prefetching (with the aio prototype patch), it looks
> like this:
>
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Index Scan using idx on t (actual rows=9048576.00 loops=1)
>    Index Cond: ((a >= 16150) AND (a <= 4540437))
>    Index Searches: 1
>    Prefetch Distance: 2.032
>    Prefetch Count: 868165
>    Prefetch Stalls: 2140228
>    Prefetch Skips: 6039906
>    Prefetch Resets: 0
>    Stream Ungets: 0
>    Stream Forwarded: 4
>    Prefetch Histogram: [2,4) => 855753, [4,8) => 12412
>    Buffers: shared hit=2577599 read=455610
>  Planning:
>    Buffers: shared hit=78 read=26 dirtied=1
>  Planning Time: 1.032 ms
>  Execution Time: 3150.578 ms
> (16 rows)
>
> So it's about 2x slower. The prefetch distance collapses, because
> there's a lot of cache hits (about 50% of requests seem to be hits of
> already visited blocks). I think that's a problem with how we adjust the
> distance, but I'll post about that separately.
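[Editor's note: the distance collapse described above can be illustrated with a toy model. This is NOT PostgreSQL's actual read_stream.c heuristic — just a hypothetical +1-on-miss / -1-on-hit adjustment scheme, sketched to show how a stream where cache hits outnumber misses keeps the lookahead distance pinned near its floor instead of ramping up.]

```python
# Toy model of a prefetch-distance heuristic (an illustrative
# assumption, not PostgreSQL's real logic): the distance grows by one
# on each cache miss (an I/O was needed, so read further ahead) and
# shrinks by one on each cache hit (the data was already resident).
import random

def simulate(hit_ratio, n=100_000, max_distance=256, seed=1):
    rng = random.Random(seed)
    distance = 1
    total = 0
    for _ in range(n):
        if rng.random() < hit_ratio:
            distance = max(1, distance - 1)             # hit: back off
        else:
            distance = min(max_distance, distance + 1)  # miss: ramp up
        total += distance
    return total / n  # average lookahead distance over the scan

# When misses dominate, the distance climbs toward the cap; once hits
# outnumber misses, it hovers near its minimum and never ramps up.
print(f"miss-heavy (30% hits): avg distance {simulate(0.3):.1f}")
print(f"hit-heavy  (60% hits): avg distance {simulate(0.6):.1f}")
```

Under this toy scheme, a scan where roughly half the blocks are already cached behaves like the EXPLAIN output above: the average distance stays in the low single digits, so almost no I/O is actually issued ahead of the consumer.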
> Let's try to simply set io_method=io_uring:
>
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Index Scan using idx on t (actual rows=9048576.00 loops=1)
>    Index Cond: ((a >= 16150) AND (a <= 4540437))
>    Index Searches: 1
>    Prefetch Distance: 2.032
>    Prefetch Count: 868165
>    Prefetch Stalls: 2140228
>    Prefetch Skips: 6039906
>    Prefetch Resets: 0
>    Stream Ungets: 0
>    Stream Forwarded: 4
>    Prefetch Histogram: [2,4) => 855753, [4,8) => 12412
>    Buffers: shared hit=2577599 read=455610
>  Planning:
>    Buffers: shared hit=78 read=26
>  Planning Time: 2.212 ms
>  Execution Time: 1837.615 ms
> (16 rows)
>
> That's much closer to master (and the difference could be mostly noise).
>
> I'm not sure what's causing this, but almost all regressions my script
> is finding look like this - always io_method=worker, with distance close
> to 2.0. Is this some inherent io_method=worker overhead?

I think what you are observing might be the inherent IPC / latency overhead
of the worker-based approach. This is particularly pronounced if the workers
are idle (and the CPU they get scheduled on is clocked down). The latency
impact of that is small, but if you never actually get to do much readahead
it can be visible.

Greetings,

Andres Freund
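[Editor's note: the IPC cost described above can be demonstrated with a hedged sketch — a Python toy, not PostgreSQL's I/O worker code. It compares a direct call against a round trip through a queue serviced by a worker thread; the names `direct` and `via_worker` are illustrative inventions. With a prefetch distance near 2 there is almost no overlap between requests, so each read pays something like this per-handoff round-trip cost.]

```python
# Toy comparison of doing work inline vs. handing each request to a
# worker thread and waiting for its reply, one request in flight at a
# time (mimicking a collapsed prefetch distance with no overlap).
import queue
import threading
import time

def direct(n):
    """Issue n no-op 'requests' inline; measures only dispatch cost."""
    t0 = time.perf_counter()
    for _ in range(n):
        pass  # the "I/O" itself is free; we measure overhead only
    return time.perf_counter() - t0

def via_worker(n):
    """Issue n requests through a worker thread, waiting for each."""
    requests, replies = queue.Queue(), queue.Queue()

    def worker():
        while True:
            item = requests.get()
            if item is None:  # sentinel: shut down
                return
            replies.put(item)  # pretend the I/O completed

    t = threading.Thread(target=worker)
    t.start()
    t0 = time.perf_counter()
    for i in range(n):
        requests.put(i)
        replies.get()  # block until the worker answers: no readahead
    elapsed = time.perf_counter() - t0
    requests.put(None)
    t.join()
    return elapsed

n = 10_000
print(f"inline:  {direct(n):.4f}s")
print(f"worker:  {via_worker(n):.4f}s")
```

The worker path is orders of magnitude slower per request purely from the queue handoff and wakeup, even though the "I/O" does nothing — the same shape of fixed cost that io_uring avoids by submitting from the issuing backend.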