Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: BitmapHeapScan streaming read user and prelim refactoring
Date
Msg-id ed18f73d-216d-42f4-b08a-04a50a802d62@enterprisedb.com
Whole thread Raw
In response to Re: BitmapHeapScan streaming read user and prelim refactoring  (Melanie Plageman <melanieplageman@gmail.com>)
Responses Re: BitmapHeapScan streaming read user and prelim refactoring
List pgsql-hackers

On 3/1/24 17:51, Melanie Plageman wrote:
> On Fri, Mar 1, 2024 at 9:05 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> On 3/1/24 02:18, Melanie Plageman wrote:
>>> On Thu, Feb 29, 2024 at 6:44 PM Tomas Vondra
>>> <tomas.vondra@enterprisedb.com> wrote:
>>>>
>>>> On 2/29/24 23:44, Tomas Vondra wrote:
>>>> 1) On master there's clear difference between eic=0 and eic=1 cases, but
>>>> on the patched build there's literally no difference - for example the
>>>> "uniform" distribution is clearly not great for prefetching, but eic=0
>>>> regresses to eic=1 poor behavior).
>>>
>>> Yes, so eic=0 and eic=1 are identical with the streaming read API.
>>> That is, eic 0 does not disable prefetching. Thomas is going to update
>>> the streaming read API to avoid issuing an fadvise for the last block
>>> in a range before issuing a read -- which would mean no prefetching
>>> with eic 0 and eic 1. Not doing prefetching with eic 1 actually seems
>>> like the right behavior -- which would be different than what master
>>> is doing, right?
>>
>> I don't think we should stop doing prefetching for eic=1, or at least
>> not based just on these charts. I suspect these "uniform" charts are not
>> a great example for the prefetching, because it's about distribution of
>> individual rows, and even a small fraction of rows may match most of the
>> pages. It's great for finding strange behaviors / corner cases, but
>> probably not a sufficient reason to change the default.
> 
> Yes, I would like to see results from a data set where selectivity is
> more correlated to pages/heap fetches. But, I'm not sure I see how
> that is related to prefetching when eic = 1.
> 
>> I think it makes sense to issue a prefetch one page ahead, before
>> reading/processing the preceding one, and it's fairly conservative
>> setting, and I assume the default was chosen for a reason / after
>> discussion.
> 
> Yes, I suppose the overhead of an fadvise does not compare to the IO
> latency of synchronously reading that block. Actually, I bet the
> regression I saw by accidentally moving BitmapAdjustPrefetchIterator()
> after table_scan_bitmap_next_block() would be similar to the
> regression introduced by making eic = 1 not prefetch.
> 
> When you think about IO concurrency = 1, it doesn't imply prefetching
> to me. But, I think we want to do the right thing and have parity with
> master.
> 
>> My suggestion would be to keep the master behavior unless not practical,
>> and then maybe discuss changing the details later. The patch is already
>> complicated enough, better to leave that discussion for later.
> 
> Agreed. Speaking of which, we need to add back use of tablespace IO
> concurrency for the streaming read API (which is used by
> BitmapHeapScan in master).
> 
>>> With very low selectivity, you are less likely to get readahead
>>> (right?) and similarly less likely to be able to build up > 8kB IOs --
>>> which is one of the main value propositions of the streaming read
>>> code. I imagine that this larger read benefit is part of why the
>>> performance is better at higher selectivities with the patch. This
>>> might be a silly experiment, but we could try decreasing
>>> MAX_BUFFERS_PER_TRANSFER on the patched version and see if the
>>> performance gains go away.
>>
>> Sure, I can do that. Do you have any particular suggestion what value to
>> use for MAX_BUFFERS_PER_TRANSFER?
> 
> I think setting it to 1 would be the same as always master -- doing
> only 8kB reads. The only thing about that is that I imagine the other
> streaming read code has some overhead which might end up being a
> regression on balance even with the prefetching if we aren't actually
> using the ranges/vectored capabilities of the streaming read
> interface. Maybe if you just run it for one of the very obvious
> performance improvement cases? I can also try this locally.
> 

Here's some results from a build with

  #define MAX_BUFFERS_PER_TRANSFER 1

There are three columns:

- master
- patched (original patches, with MAX_BUFFERS_PER_TRANSFER=128kB)
- patched-single (MAX_BUFFERS_PER_TRANSFER=8kB)

The color scales are always branch compared to master.

I think the expectation was that setting the transfer to 1 would make it
closer to master, reducing some of the regressions. But in practice the
effect is the opposite.

- In "cached" runs, this eliminates the small improvements (light
green), but leaves the regressions behind.

- In "uncached" runs, this exacerbates the regressions, particularly for
low selectivities (small values of matches).


I don't have a good intuition on why this would be happening :-(

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment

pgsql-hackers by date:

Previous
From: Peter Smith
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Peter Smith
Date:
Subject: Re: pub/sub - specifying optional parameters without values.