Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | 8de6f6d7-9b9d-47a4-98cc-84bf31a314db@enterprisedb.com Whole thread Raw |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Melanie Plageman <melanieplageman@gmail.com>) |
Responses |
Re: BitmapHeapScan streaming read user and prelim refactoring
Re: BitmapHeapScan streaming read user and prelim refactoring |
List | pgsql-hackers |
On 2/29/24 22:19, Melanie Plageman wrote: > On Thu, Feb 29, 2024 at 7:54 AM Tomas Vondra > <tomas.vondra@enterprisedb.com> wrote: >> >> >> >> On 2/29/24 00:40, Melanie Plageman wrote: >>> On Wed, Feb 28, 2024 at 6:17 PM Tomas Vondra >>> <tomas.vondra@enterprisedb.com> wrote: >>>> >>>> >>>> >>>> On 2/28/24 21:06, Melanie Plageman wrote: >>>>> On Wed, Feb 28, 2024 at 2:23 PM Tomas Vondra >>>>> <tomas.vondra@enterprisedb.com> wrote: >>>>>> >>>>>> On 2/28/24 15:56, Tomas Vondra wrote: >>>>>>>> ... >>>>>>> >>>>>>> Sure, I can do that. It'll take a couple hours to get the results, I'll >>>>>>> share them when I have them. >>>>>>> >>>>>> >>>>>> Here are the results with only patches 0001 - 0012 applied (i.e. without >>>>>> the patch introducing the streaming read API, and the patch switching >>>>>> the bitmap heap scan to use it). >>>>>> >>>>>> The changes in performance don't disappear entirely, but the scale is >>>>>> certainly much smaller - both in the complete results for all runs, and >>>>>> for the "optimal" runs that would actually pick bitmapscan. >>>>> >>>>> Hmm. I'm trying to think how my refactor could have had this impact. >>>>> It seems like all the most notable regressions are with 4 parallel >>>>> workers. What do the numeric column labels mean across the top >>>>> (2,4,8,16...) -- are they related to "matches"? And if so, what does >>>>> that mean? >>>>> >>>> >>>> That's the number of distinct values matched by the query, which should >>>> be an approximation of the number of matching rows. The number of >>>> distinct values in the data set differs by data set, but for 1M rows >>>> it's roughly like this: >>>> >>>> uniform: 10k >>>> linear: 10k >>>> cyclic: 100 >>>> >>>> So for example matches=128 means ~1% of rows for uniform/linear, and >>>> 100% for cyclic data sets. >>> >>> Ah, thank you for the explanation. I also looked at your script after >>> having sent this email and saw that it is clear in your script what >>> "matches" is. >>> >>>> As for the possible cause, I think it's clear most of the difference >>>> comes from the last patch that actually switches bitmap heap scan to the >>>> streaming read API. That's mostly expected/understandable, although we >>>> probably need to look into the regressions or cases with e_i_c=0. >>> >>> Right, I'm mostly surprised about the regressions for patches 0001-0012. >>> >>> Re eic 0: Thomas Munro and I chatted off-list, and you bring up a >>> great point about eic 0. In old bitmapheapscan code eic 0 basically >>> disabled prefetching but with the streaming read API, it will still >>> issue fadvises when eic is 0. That is an easy one line fix. Thomas >>> prefers to fix it by always avoiding an fadvise for the last buffer in >>> a range before issuing a read (since we are about to read it anyway, >>> best not fadvise it too). This will fix eic 0 and also cut one system >>> call from each invocation of the streaming read machinery. >>> >>>> To analyze the 0001-0012 patches, maybe it'd be helpful to run tests for >>>> individual patches. I can try doing that tomorrow. It'll have to be a >>>> limited set of tests, to reduce the time, but might tell us whether it's >>>> due to a single patch or multiple patches. >>> >>> Yes, tomorrow I planned to start trying to repro some of the "red" >>> cases myself. Any one of the commits could cause a slight regression >>> but a 3.5x regression is quite surprising, so I might focus on trying >>> to repro that locally and then narrow down which patch causes it. >>> >>> For the non-cached regressions, perhaps the commit to use the correct >>> recheck flag (0004) when prefetching could be the culprit. And for the >>> cached regressions, my money is on the commit which changes the whole >>> control flow of BitmapHeapNext() and the next_block() and next_tuple() >>> functions (0010). >>> >> >> I do have some partial results, comparing the patches. I only ran one of >> the more affected workloads (cyclic) on the xeon, attached is a PDF >> comparing master and the 0001-0014 patches. The percentages are timing >> vs. the preceding patch (green - faster, red - slower). > > Just confirming: the results are for uncached? > Yes, cyclic data set, uncached case. I picked this because it seemed like one of the most affected cases. Do you want me to test some other cases too? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: