Using b-tree index for >= condition when joining - Mailing list pgsql-general
From | Łukasz Dąbek |
---|---|
Subject | Using b-tree index for >= condition when joining |
Date | |
Msg-id | CAE2Xe9tLfLxE_pthPLFerywyA4RxtGTwMCqKpGk6VqMYOj6Lfw@mail.gmail.com Whole thread Raw |
Responses |
Re: Using b-tree index for >= condition when joining
|
List | pgsql-general |
Hello All! I am having a problem with nudging postgres to choose a good plan for a query involving a left join and an inequality constraint on a column with b-tree index. Let's say both tbl1 and tbl2 tables have date column with an index on it. Queries like "SELECT * FROM tbl1 WHERE date >= CONSTANT" are using index scan, as expected. Now let's define a view: =# CREATE VIEW vw1 AS SELECT t1.date as date, t1.x as x, t2.y as y FROM tbl1 t1 LEFT JOIN tbl2 t2 USING (date); Query of the form "SELECT * FROM vw1 WHERE date = '2020-04-21'" is using index scan on both tables: =# EXPLAIN SELECT * FROM vw1 WHERE date = '2020-04-21'; QUERY PLAN ------------------------------------------------------------------------------------------------ Hash Right Join (cost=91208.02..112781024.50 rows=10000000000 width=12) Hash Cond: (t2.date = t1.date) -> Index Scan using tbl2_date_idx on tbl2 t2 (cost=0.43..188393.92 rows=100000 width=8) Index Cond: (date = '2019-04-21'::date) -> Hash (cost=89566.58..89566.58 rows=100000 width=8) -> Bitmap Heap Scan on tbl1 t1 (cost=1875.43..89566.58 rows=100000 width=8) Recheck Cond: (date = '2019-04-21'::date) -> Bitmap Index Scan on tbl1_date_idx (cost=0.00..1850.43 rows=100000 width=0) Index Cond: (date = '2019-04-21'::date) (I know the total number of rows estimated for this and next queries is enormous, in reality there are more conditions on the join but I want to keep the example small) However when an inequality is used the query plan seems inefficient: =# EXPLAIN SELECT * FROM vw1 WHERE date >= '2020-04-21'; QUERY PLAN ---------------------------------------------------------------------------------------------- Hash Left Join (cost=483538.43..4617954384.38 rows=410400000000 width=12) Hash Cond: (t1.date = t2.date) -> Index Scan using tbl1_date_idx on tbl1 t1 (cost=0.43..369147.38 rows=4104000 width=8) Index Cond: (date >= '2019-04-21'::date) -> Hash (cost=234163.00..234163.00 rows=15200000 width=8) -> Seq Scan on tbl2 t2 (cost=0.00..234163.00 rows=15200000 width=8) It looks like the inequality on date isn't pushed down below the left join? I can get the plan I'd like to have by putting the same constraint on the date column on the second table: =# EXPLAIN SELECT * FROM tbl1 t1 LEFT JOIN tbl2 t2 USING (date) WHERE t1.date >= '2019-04-21' AND t2.date >= '2019-04-21'; QUERY PLAN -------------------------------------------------------------------------------------------------- Hash Join (cost=281625.87..1651822721.88 rows=112860000000 width=26) Hash Cond: (t2.date = t1.date) -> Index Scan using tbl2_date_idx on tbl2 t2 (cost=0.43..369784.44 rows=4180000 width=15) Index Cond: (date >= '2019-04-21'::date) -> Hash (cost=210285.43..210285.43 rows=4104000 width=15) -> Bitmap Heap Scan on tbl1 t1 (cost=76822.43..210285.43 rows=4104000 width=15) Recheck Cond: (date >= '2019-04-21'::date) -> Bitmap Index Scan on tbl1_date_idx (cost=0.00..75796.43 rows=4104000 width=0) Index Cond: (date >= '2019-04-21'::date) Is it possible to define a view vw2 such that queries of the form "SELECT * FROM vw2 WHERE date >= CONSTANT" use the plan I pasted above? Thanks in advance for help, Lukasz
pgsql-general by date: