Re: [HACKERS] Secondary index access optimizations - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Re: [HACKERS] Secondary index access optimizations
Date
Msg-id bb56e4b0-0bf2-cfda-1dbc-83380843b6fc@postgrespro.ru
Whole thread Raw
In response to Re: [HACKERS] Secondary index access optimizations  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers


On 08.10.2018 00:16, David Rowley wrote:
On 5 October 2018 at 04:45, Konstantin Knizhnik
<k.knizhnik@postgrespro.ru> wrote:
Will the following test be enough:

-- check that columns for parent table are correctly mapped to child
partition of their order doesn't match
create table paren (a int, b text) partition by range(a);
create table child_1 partition of paren for values from (0) to (10);
create table child_2 (b text, a int);
alter table paren attach partition child_2 for values from (10) to (20);
insert into paren values (generate_series(0,19), generate_series(100,119));

explain (costs off) select * from paren where a between 0 and 9;
explain (costs off) select * from paren where a between 10 and 20;
explain (costs off) select * from paren where a >= 5;
explain (costs off) select * from paren where a <= 15;

select count(*) from paren where a >= 5;
select count(*) from paren where a < 15;

drop table paren cascade;
I started looking at this to see if this test would cause a crash with
the original code, but it does not. The closest I can get is:

drop table parent;
create table parent (a bytea, b int) partition by range(a);
create table child_1 (b int, a bytea);
alter table parent attach partition child_1 for values from ('a') to ('z');
explain (costs off) select * from parent where b = 1;

But despite the varattnos of the bytea and int accidentally matching,
there's no crash due to the way operator_predicate_proof() requires
more than just the varno and varattno to match. It requires the Vars
to be equal(), which includes vartype, and those are not the same. So
the proof just fails.

In short, probably this test is doing nothing in addition to what
various other tests are doing. So given the test is unable to crash
the unfixed code, then I think it's likely not a worthwhile test to
add.

I wrote:
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
create index listp_a_b_idx on listp (a,b);

and a query:

select * from listp where a = 1 order by b;

if we remove the "a = 1" qual, then listp_a_b_idx can't be used.
I had a look at what allows this query still to use the index and it's
down to pathkey_is_redundant() returning true because there's still an
equivalence class containing {a,1}. I don't quite see any reason why
it would not be okay to rely on that working, but that only works for
pathkeys. If you have a case like:

set max_parallel_workers_per_gather=0;
create table listp (a int, b int) partition by list(a);
create table listp1 partition of listp for values in(1);
insert into listp select 1,x from generate_Series(1,1000000) x;
create index listp_a_b_idx on listp (a,b);
vacuum analyze listp;
explain analyze select * from listp where a = 1 and b = 1;

the "a = 1" will be dropped and the index on (a,b) does not get used.

Patched results in:

postgres=# explain analyze select * from listp where a = 1 and b = 1;                                                QUERY PLAN
------------------------------------------------------------------------------------------------------------Append  (cost=0.00..16925.01 rows=1 width=8) (actual
time=0.019..169.231 rows=1 loops=1)  ->  Seq Scan on listp1  (cost=0.00..16925.00 rows=1 width=8)
(actual time=0.017..169.228 rows=1 loops=1)        Filter: (b = 1)        Rows Removed by Filter: 999999Planning Time: 0.351 msExecution Time: 169.257 ms
(6 rows)

Whereas unpatched gets:

postgres=# explain analyze select * from listp where a = 1 and b = 1;                                                           QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660
rows=1 loops=1)  ->  Index Only Scan using listp1_a_b_idx on listp1
(cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1
loops=1)        Index Cond: ((a = 1) AND (b = 1))        Heap Fetches: 0Planning Time: 32.303 msExecution Time: 0.826 ms
(6 rows)

so I was probably wrong about suggesting set_append_rel_size() as a
good place to remove these quals. It should perhaps be done later, or
maybe we can add some sort of marker to the qual to say it does not
need to be enforced during execution. Probably the former would be
best as we don't want to show these in EXPLAIN.

Well, I made a different conclusion from this problem (inability use of compound index because of redundant qual elimination).
Is it really good idea to define compound index with first key equal to partitioning key?
Restriction on this key in any case will lead to partition pruning. We do no need index for it...
In your case if we create index 
listp_b_idx:

 create index listp_b_idx on listp (b);

then right plan we be generated:

 Append  (cost=0.42..8.45 rows=1 width=8) (actual time=0.046..0.047 rows=1 loops=1)
   ->  Index Scan using listp1_b_idx on listp1  (cost=0.42..8.44 rows=1 width=8) (actual time=0.046..0.046 rows=1 loops=1)
         Index Cond: (b = 1)

and it is definitely more efficient than original plan with unpacked Postgres.
Append  (cost=0.42..4.45 rows=1 width=8) (actual time=0.657..0.660
rows=1 loops=1)  ->  Index Only Scan using listp1_a_b_idx on listp1
(cost=0.42..4.44 rows=1 width=8) (actual time=0.653..0.655 rows=1
loops=1)        Index Cond: ((a = 1) AND (b = 1))        Heap Fetches: 0


In any case, I have attached yet another version of the patch: it is my original patch with removed predicate test proof logic.
Unlike your patch, it works also for CHECK constraints, not only for standard partitions.

-- 
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 
Attachment

pgsql-hackers by date:

Previous
From: Amit Langote
Date:
Subject: Re: Calculate total_table_pages after set_base_rel_sizes()
Next
From: Amit Kapila
Date:
Subject: Re: WIP: Avoid creation of the free space map for small tables