Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Msg-id: ed18f73d-216d-42f4-b08a-04a50a802d62@enterprisedb.com
In response to: Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>)
List: pgsql-hackers
On 3/1/24 17:51, Melanie Plageman wrote:
> On Fri, Mar 1, 2024 at 9:05 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 3/1/24 02:18, Melanie Plageman wrote:
>>> On Thu, Feb 29, 2024 at 6:44 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>> On 2/29/24 23:44, Tomas Vondra wrote:
>>>> 1) On master there's a clear difference between the eic=0 and eic=1
>>>> cases, but on the patched build there's literally no difference - for
>>>> example, the "uniform" distribution is clearly not great for
>>>> prefetching, but eic=0 regresses to the poor eic=1 behavior.
>>>
>>> Yes, so eic=0 and eic=1 are identical with the streaming read API.
>>> That is, eic 0 does not disable prefetching. Thomas is going to update
>>> the streaming read API to avoid issuing an fadvise for the last block
>>> in a range before issuing a read -- which would mean no prefetching
>>> with eic 0 and eic 1. Not doing prefetching with eic 1 actually seems
>>> like the right behavior -- which would be different than what master
>>> is doing, right?
>>
>> I don't think we should stop doing prefetching for eic=1, or at least
>> not based just on these charts. I suspect these "uniform" charts are
>> not a great example for prefetching, because it's about the
>> distribution of individual rows, and even a small fraction of rows may
>> match most of the pages. It's great for finding strange behaviors /
>> corner cases, but probably not a sufficient reason to change the
>> default.
>
> Yes, I would like to see results from a data set where selectivity is
> more correlated to pages/heap fetches. But I'm not sure I see how that
> is related to prefetching when eic = 1.
>
>> I think it makes sense to issue a prefetch one page ahead, before
>> reading/processing the preceding one. It's a fairly conservative
>> setting, and I assume the default was chosen for a reason / after
>> discussion.
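To make the "prefetch one page ahead" point concrete, here is a minimal illustrative model (not PostgreSQL source; the function and its shape are made up for this sketch). It interleaves fadvise-style prefetches with synchronous reads, keeping the prefetch position eic blocks ahead of the read position, so eic=0 issues no prefetches at all:

```python
# Illustrative sketch only, NOT the actual BitmapHeapScan/streaming read code.
# Models a scan that keeps the prefetch position `eic` blocks ahead of the
# block currently being read; eic=0 disables prefetching entirely.
def simulate_scan(blocks, eic):
    """Return the interleaved ('prefetch', blk) / ('read', blk) operations."""
    ops = []
    for i, blk in enumerate(blocks):
        # Before synchronously reading block i, issue a prefetch (fadvise)
        # for the block eic positions ahead, if one exists.
        if eic > 0 and i + eic < len(blocks):
            ops.append(("prefetch", blocks[i + eic]))
        ops.append(("read", blk))
    return ops
```

Under this model eic=1 still prefetches (each block except the first gets an fadvise one step before its read), which is the master behavior being argued for above, while eic=0 degenerates to purely synchronous reads.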
>
> Yes, I suppose the overhead of an fadvise does not compare to the IO
> latency of synchronously reading that block. Actually, I bet the
> regression I saw by accidentally moving BitmapAdjustPrefetchIterator()
> after table_scan_bitmap_next_block() would be similar to the
> regression introduced by making eic = 1 not prefetch.
>
> When you think about IO concurrency = 1, it doesn't imply prefetching
> to me. But I think we want to do the right thing and have parity with
> master.
>
>> My suggestion would be to keep the master behavior unless not
>> practical, and then maybe discuss changing the details later. The
>> patch is already complicated enough; better to leave that discussion
>> for later.
>
> Agreed. Speaking of which, we need to add back use of tablespace IO
> concurrency for the streaming read API (which is used by
> BitmapHeapScan in master).
>
>>> With very low selectivity, you are less likely to get readahead
>>> (right?) and similarly less likely to be able to build up > 8kB IOs
>>> -- which is one of the main value propositions of the streaming read
>>> code. I imagine that this larger read benefit is part of why the
>>> performance is better at higher selectivities with the patch. This
>>> might be a silly experiment, but we could try decreasing
>>> MAX_BUFFERS_PER_TRANSFER on the patched version and see if the
>>> performance gains go away.
>>
>> Sure, I can do that. Do you have any particular suggestion what value
>> to use for MAX_BUFFERS_PER_TRANSFER?
>
> I think setting it to 1 would be the same as master -- always doing
> only 8kB reads. The only thing about that is that I imagine the other
> streaming read code has some overhead which might end up being a
> regression on balance, even with the prefetching, if we aren't
> actually using the ranges/vectored capabilities of the streaming read
> interface. Maybe if you just run it for one of the very obvious
> performance improvement cases? I can also try this locally.
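The "setting it to 1 means only 8kB reads" point can be sketched as follows (illustrative only, not the actual streaming read code; the function name and read representation are invented for this example). Runs of consecutive block numbers are coalesced into one vectored read, capped at max_buffers blocks per transfer:

```python
# Illustrative sketch only, NOT PostgreSQL's streaming read implementation.
# Coalesce runs of consecutive block numbers into vectored reads of at most
# `max_buffers` blocks each; max_buffers=1 degenerates to one block (8kB,
# with the default block size) per read, max_buffers=16 allows up to 128kB.
def coalesce_reads(blocks, max_buffers):
    reads = []  # each entry is (start_block, nblocks)
    for blk in blocks:
        if (reads
                and blk == reads[-1][0] + reads[-1][1]  # consecutive run
                and reads[-1][1] < max_buffers):        # room in this transfer
            reads[-1] = (reads[-1][0], reads[-1][1] + 1)
        else:
            reads.append((blk, 1))
    return reads
```

With low selectivity the matched blocks are rarely consecutive, so little coalescing happens regardless of the cap, which is why the larger-read benefit is expected mainly at higher selectivities.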
Here are some results from a build with #define MAX_BUFFERS_PER_TRANSFER 1.

There are three columns:

- master
- patched (original patches, with MAX_BUFFERS_PER_TRANSFER=128kB)
- patched-single (MAX_BUFFERS_PER_TRANSFER=8kB)

The color scales always compare the branch to master.

I think the expectation was that setting the transfer size to 1 would make
it closer to master, reducing some of the regressions. But in practice the
effect is the opposite.

- In "cached" runs, this eliminates the small improvements (light green),
  but leaves the regressions behind.

- In "uncached" runs, this exacerbates the regressions, particularly for
  low selectivities (small values of matches).

I don't have a good intuition on why this would be happening :-(

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company