Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: Use of additional index columns in rows filtering |
Date | |
Msg-id | CAAaqYe_=eEPNArD-Vi9C3ugEkpnPvfWO5LV0NWEhW159d1WVUg@mail.gmail.com Whole thread Raw |
In response to | Re: Use of additional index columns in rows filtering (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: Use of additional index columns in rows filtering
|
List | pgsql-hackers |
Hello, I've cc'd Jeff Davis on this due to a conversation we had at PGCon about applying filters on index tuples during index scans. I've also cc'd Andres Freund because I think this relates to his musing in [1] that: > One thing I have been wondering around this is whether we should not have > split the code for IOS and plain indexscans... I think I remember Peter Geoghegan also wondering (I can't remember if this was in conversation at PGCon about index skip scans or in a hackers thread) about how we compose these various index scan optimizations. To be certain this is probably a thing to tackle as a follow-on to this patch, but it does seem to me that what we are implicitly realizing is that (unlike with bitmap scans, I think) it doesn't really make a lot of conceptual sense to have index only scans be a separate node from index scans. Instead it's likely better to consider it an optimization to index scans that can dynamically kick in when it's able to be of use. That would allow it to compose with e.g. prefetching in the aforelinked thread. At the very least we would need pragmatic (e.g., cost of dynamically applying optimizations) rather than conceptual reasons to argue they should continue to be separate. Apologies for that lengthy preamble; on to the patch under discussion: On Thu, Jun 8, 2023 at 1:34 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > Hi, > > I took a stab at this and implemented the trick with the VM - during > index scan, we also extract the filters that only need the indexed > attributes (just like in IOS). And then, during the execution we: > > 1) scan the index using the scan keys (as before) > > 2) if the heap page is all-visible, we check the new filters that can > be evaluated on the index tuple > > 3) fetch the heap tuple and evaluate the filters Thanks for working on this; I'm excited about this class of work (along with index prefetching and other ideas I think there's a lot of potential for improving index scans). > This is pretty much exactly the same thing we do for IOS, so I don't see > why this would be incorrect while IOS is correct. > > This also adds "Index Filter" to explain output, to show which filters > are executed on the index tuple (at the moment the filters are a subset > of "Filter"), so if the index tuple matches we'll execute them again on > the heap tuple. I guess that could be fixed by having two "filter" > lists, depending on whether we were able to evaluate the index filters. Given that we show index filters and heap filters separately it seems like we might want to maintain separate instrumentation counts of how many tuple were filtered by each set of filters. > Most of the patch is pretty mechanical - particularly the planning part > is about identifying filters that can be evaluated on the index tuple, > and that code was mostly shamelessly copied from index-only scan. > > The matching of filters to index is done in check_index_filter(), and > it's simpler than match_clause_to_indexcol() as it does not need to > consider operators etc. (I think). But maybe it should be careful about > other things, not sure. This would end up requiring some refactoring of the existing index matching code (or alternative caching on IndexOptInfo), but match_filter_to_index() calling check_index_filter() results in constructs a bitmapset of index columns for every possible filter which seems wasteful (I recognize this is a bit of a proof-of-concept level v1). > The actual magic happens in IndexNext (nodeIndexscan.c). As mentioned > earlier, the idea is to check VM and evaluate the filters on the index > tuple if possible, similar to index-only scans. Except that we then have > to fetch the heap tuple. Unfortunately, this means the code can't use > index_getnext_slot() anymore. Perhaps we should invent a new variant > that'd allow evaluating the index filters in between. It does seem there are some refactoring opportunities there. > With the patch applied, the query plan changes from: > > ... > > to > > QUERY PLAN > ------------------------------------------------------------------- > Limit (cost=0.42..3662.15 rows=1 width=12) > (actual time=13.663..13.667 rows=0 loops=1) > Buffers: shared hit=544 > -> Index Scan using t_a_include_b on t > (cost=0.42..3662.15 rows=1 width=12) > (actual time=13.659..13.660 rows=0 loops=1) > Index Cond: (a > 1000000) > Index Filter: (b = 4) > Rows Removed by Index Recheck: 197780 > Filter: (b = 4) > Buffers: shared hit=544 > Planning Time: 0.105 ms > Execution Time: 13.690 ms > (10 rows) > > ... I did also confirm that this properly identifies cases Jeff had mentioned to me like "Index Filter: (((a * 2) > 500000) AND ((b % 10) = 4))". I noticed also you still had questions/TODOs about handling index scans for join clauses. Regards, James Coleman 1: https://www.postgresql.org/message-id/20230609000600.syqy447e6metnvyj%40awork3.anarazel.de
pgsql-hackers by date: