Re: BitmapHeapScan streaming read user and prelim refactoring - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: BitmapHeapScan streaming read user and prelim refactoring |
Date | |
Msg-id | 6367e86a-7269-4eae-9a35-5a3864854911@enterprisedb.com Whole thread Raw |
In response to | Re: BitmapHeapScan streaming read user and prelim refactoring (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
List | pgsql-hackers |
On 4/6/24 02:51, Tomas Vondra wrote: > > * The one question I'm somewhat unsure about is why Tom chose to use the > "wrong" recheck flag in the 2017 commit, when the correct recheck flag > is readily available. Surely that had a reason, right? But I can't think > of one ... > I've been wondering about this a bit more, so I decided to experiment and try to construct a case for which the current code prefetches the wrong blocks, and the patch fixes that. But I haven't been very successful so far :-( My understanding was that the current code should do the wrong thing if I alternate all-visible and not-all-visible pages. This understanding is not correct, as I learned, because the thing that needs to change is the recheck flag, not visibility :-( I'm still posting what I tried, perhaps you will have an idea how to alter it to demonstrate the incorrect behavior with current master. The test was very simple: create table t (a int, b int) with (fillfactor=10); insert into t select mod((i/22),2), (i/22) from generate_series(0,1000) s(i); create index on t (a); which creates a table with 46 pages, 22 rows per page, column "a" alternates between 0/1 on pages, column "b" increments on each page (so "b" identifies page). and then delete from t where mod(b,8) = 0; which deletes tuples on pages 0, 8, 16, 24, 32, 40, so these pages will need to be prefetched as not-all-visible by this query explain analyze select count(1) from t where a = 0 when forced to do bitmap heap scan. The other even-numbered pages remain all-visible. I added a bit of logging into BitmapPrefetch(), but even with master I get this: LOG: prefetching block 8 0 current block 6 0 LOG: prefetching block 16 0 current block 14 0 LOG: prefetching block 24 0 current block 22 0 LOG: prefetching block 32 0 current block 30 0 LOG: prefetching block 40 0 current block 38 0 So it prefetches the correct pages (the other value is the recheck flag for that block from the iterator result). Turns out (and I realize the comment about the assumption actually states that, I just failed to understand it) the thing that would have to differ for the blocks is the recheck flag. But that can't actually happen because that's set by the AM/opclass and the built-in ones do essentially this: .../hash.c: scan->xs_recheck = true; .../nbtree.c: scan->xs_recheck = false; gist opclasses (e.g. btree_gist): /* All cases served by this function are exact */ *recheck = false; spgist opclasses (e.g. geo_spgist): /* All tests are exact. */ out->recheck = false; If there's an opclass that alters the recheck flag, it's well hidden and I missed it. Anyway, after this exercise and learning more about the recheck flag, I think I agree the assumption is unnecessary. It's pretty harmless because none of the built-in opclasses alters the recheck flag, but the correct recheck flag is readily available. I'm still a bit puzzled why the 2017 commit even relied on this assumption, though. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: