On 2/15/23 09:57, Maxim Ivanov wrote:
> Hi All,
>
> I'd like to report what seems to be a missing optimization
> opportunity or understand why it is not possible to achieve.
>
> TLDR; additional index column B specified in CREATE INDEX .. (A)
> INCLUDE(B) is not used to filter rows in queries like WHERE B = $1
> ORDER BY A during IndexScan. https://dbfiddle.uk/iehtq44L
>
> ...
>
> Here is the plan (notice high "shared hit"):
>
> Limit (cost=0.42..10955.01 rows=1 width=12) (actual time=84.283..84.284 rows=0 loops=1)
> Output: a, b, d
> Buffers: shared hit=198307
> -> Index Scan using t_a_include_b on public.t (cost=0.42..10955.01 rows=1 width=12) (actual time=84.280..84.281
rows=0loops=1)
> Output: a, b, d
> Index Cond: (t.a > 1000000)
> Filter: (t.b = 4)
> Rows Removed by Filter: 197805
> Buffers: shared hit=198307
> Planning:
> Buffers: shared hit=30
> Planning Time: 0.201 ms
> Execution Time: 84.303 ms
>
Yeah. The reason for this behavior is pretty simple:
1) When matching clauses to indexes in match_clause_to_index(), we only
look at key columns (nkeycolumns). We'd need to check all columns
(ncolumns) and remember if the clause matched a key or included one.
2) index_getnext_slot would need to get "candidate" TIDs using
conditions on keys, and then check the clauses on included
columns.
Seems doable, unless I'm missing some fatal issue.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company