Re: Use of additional index columns in rows filtering - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Use of additional index columns in rows filtering
Msg-id c6b5da88-3ce1-14f2-f969-d819bd3acc5d@enterprisedb.com
In response to Re: Use of additional index columns in rows filtering  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
Hi,

I took a stab at this and implemented the trick with the VM (visibility map):
during the index scan we also extract the filters that need only the indexed
attributes, just like in an index-only scan (IOS). Then, during execution, we:

  1) scan the index using the scan keys (as before)

  2) if the heap page is all-visible, we check the new filters that can
     be evaluated on the index tuple

  3) fetch the heap tuple and evaluate the filters

This is pretty much exactly the same thing we do for IOS, so I don't see
why this would be incorrect while IOS is correct.

This also adds "Index Filter" to the EXPLAIN output, showing which filters are
evaluated on the index tuple. At the moment those filters remain a subset of
"Filter", so if the index tuple matches we execute them again on the heap
tuple. I guess that redundancy could be fixed by keeping two separate "filter"
lists, depending on whether we were able to evaluate the index filters.

Most of the patch is pretty mechanical - particularly the planning part
is about identifying filters that can be evaluated on the index tuple,
and that code was mostly shamelessly copied from index-only scan.

The matching of filters to the index is done in check_index_filter(), which is
simpler than match_clause_to_indexcol() because it does not need to consider
operators etc. (I think). But maybe it should be careful about other things -
I'm not sure.

The actual magic happens in IndexNext() in nodeIndexscan.c. As mentioned
earlier, the idea is to check the VM and evaluate the filters on the index
tuple if possible, similar to index-only scans - except that we then still
have to fetch the heap tuple. Unfortunately, this means the code can't use
index_getnext_slot() anymore. Perhaps we should invent a new variant that'd
allow evaluating the index filters in between.


With the patch applied, the query plan changes from:

                                 QUERY PLAN
    -------------------------------------------------------------------
     Limit  (cost=0.42..10929.89 rows=1 width=12)
            (actual time=94.649..94.653 rows=0 loops=1)
       Buffers: shared hit=197575 read=661
       ->  Index Scan using t_a_include_b on t
          (cost=0.42..10929.89 rows=1 width=12)
          (actual time=94.646..94.647 rows=0 loops=1)
             Index Cond: (a > 1000000)
             Filter: (b = 4)
             Rows Removed by Filter: 197780
             Buffers: shared hit=197575 read=661
     Planning Time: 0.091 ms
     Execution Time: 94.674 ms
    (9 rows)

to

                                 QUERY PLAN
    -------------------------------------------------------------------
     Limit  (cost=0.42..3662.15 rows=1 width=12)
            (actual time=13.663..13.667 rows=0 loops=1)
       Buffers: shared hit=544
       ->  Index Scan using t_a_include_b on t
           (cost=0.42..3662.15 rows=1 width=12)
           (actual time=13.659..13.660 rows=0 loops=1)
             Index Cond: (a > 1000000)
             Index Filter: (b = 4)
             Rows Removed by Index Recheck: 197780
             Filter: (b = 4)
             Buffers: shared hit=544
     Planning Time: 0.105 ms
     Execution Time: 13.690 ms
    (10 rows)

which is much closer to the "best" case:

                                 QUERY PLAN
    -------------------------------------------------------------------
     Limit  (cost=0.42..4155.90 rows=1 width=12)
            (actual time=10.152..10.156 rows=0 loops=1)
       Buffers: shared read=543
       ->  Index Scan using t_a_b_idx on t
           (cost=0.42..4155.90 rows=1 width=12)
           (actual time=10.148..10.150 rows=0 loops=1)
             Index Cond: ((a > 1000000) AND (b = 4))
             Buffers: shared read=543
     Planning Time: 0.089 ms
     Execution Time: 10.176 ms
    (7 rows)


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
