Re: Index Onlys Scan for expressions - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: Index Onlys Scan for expressions
Date
Msg-id CAPpHfduhVbyGDmLzxDt6sm3rAbVQQ=xXYB2DOSNm=eA9deJ1BA@mail.gmail.com
Whole thread Raw
In response to Re: Index Onlys Scan for expressions  (Oleg Bartunov <obartunov@gmail.com>)
Responses Re: Index Onlys Scan for expressions  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Tue, Aug 16, 2016 at 9:09 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
On Tue, Aug 16, 2016 at 1:03 AM, Ildar Musin <i.musin@postgrespro.ru> wrote:
> Hi, hackers!
>
> There is a known issue that index only scan (IOS) can only work with simple
> index keys based on single attributes and doesn't work with index
> expressions. In this patch I propose a solution that adds support of IOS for
> index expressions. Here's an example:
>
> create table abc(a int, b int, c int);
> create index on abc ((a * 1000 + b), c);
>
> with t1 as (select generate_series(1, 1000) as x),
>      t2 as (select generate_series(0, 999) as x)
> insert into abc(a, b, c)
>     select t1.x, t2.x, t2.x from t1, t2;
> vacuum analyze;
>
> Explain results with the patch:
>
> explain (analyze, buffers) select a * 1000 + b + c from abc where a * 1000 +
> b = 1001;
>                                                        QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------
>  Index Only Scan using abc_expr_c_idx on abc  (cost=0.42..4.45 rows=1
> width=4) (actual time=0.032..0.033 rows=1 loops=1)
>    Index Cond: ((((a * 1000) + b)) = 1001)
>    Heap Fetches: 0
>    Buffers: shared hit=4
>  Planning time: 0.184 ms
>  Execution time: 0.077 ms
> (6 rows)
>
> Before the patch it was:
>
> explain (analyze, buffers) select a * 1000 + b + c from abc where a * 1000 +
> b = 1001;
>                                                      QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------
>  Index Scan using abc_expr_c_idx on abc  (cost=0.42..8.45 rows=1 width=4)
> (actual time=0.039..0.041 rows=1 loops=1)
>    Index Cond: (((a * 1000) + b) = 1001)
>    Buffers: shared hit=4
>  Planning time: 0.177 ms
>  Execution time: 0.088 ms
> (5 rows)
>
> This solution has limitations though: the restriction or the target
> expression tree (or its part) must match exactly the index. E.g. this
> expression will pass the check:
>
> select a * 1000 + b + 100 from ...
>
> but this will fail:
>
> select 100 + a * 1000 + b from ...
>
> because the parser groups it as:
>
> (100 + a * 1000) + b
>
> In this form it won't match any index key. Another case is when we create
> index on (a+b) and then make query like 'select b+a ...' or '... where b+a =
> smth' -- it won't match. This applies to regular index scan too. Probably it
> worth to discuss the way to normalize index expressions and clauses and work
> out more convenient way to match them.

pg_operator.oprcommutative ?

Do you mean pg_operator.oprcom?
In these examples we're also lacking of pg_operator.oprassociative...
Another problem is computational complexity of such transformations.  AFAIR, few patches for making optimizer smarter with expressions were already rejected because of this reason.
Also, even if we would have such transformation of expressions, floating points types would be still problematic, because (a + b) + c = a + (b + c) is not exactly true for them because of computational error.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: Declarative partitioning - another take
Next
From: Michael Meskes
Date:
Subject: Re: Let's get rid of the separate minor version numbers for shlibs