Re: Optimization of range queries - Mailing list pgsql-hackers

From Teodor Sigaev
Subject Re: Optimization of range queries
Date
Msg-id 28d7508c-bd3d-4597-a4c6-d0b37355a137@sigaev.ru
Whole thread Raw
In response to Optimization of range queries  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Optimization of range queries  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List pgsql-hackers
Hi!

12 years ago I proposed patch to which could "union" OR clauses into one 
range clause if it's possible. In that time pgsql could not use IS NULL 
as index clause, so patch doesn't support that

https://www.postgresql.org/message-id/flat/45742C51.9020602%40sigaev.ru

option number 4), all other are already committed.


Konstantin Knizhnik wrote:
> Hi hackers,
> 
> Postgres optimizer is not able to build efficient execution plan for the 
> following query:
> 
> explain select * from  people_raw where not ("ID"<2068113880 AND "INN" 
> is not null) and "ID"<=2068629726 AND "INN" is not null;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------- 
> 
>   Bitmap Heap Scan on people_raw  (cost=74937803.72..210449640.49 
> rows=121521030 width=336)
>     Recheck Cond: ("ID" <= 2068629726)
>     Filter: (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS 
> NULL)))
>     ->  Bitmap Index Scan on "People_pkey" (cost=0.00..74907423.47 
> rows=2077021718 width=0)
>           Index Cond: ("ID" <= 2068629726)
> (5 rows)
> 
> 
> Here the table is very large, but query effects only relatively small 
> number of rows located in the range: [2068113880,2068629726]
> But unfortunately optimizer doesn't take it into the account.
> Moreover, using "is not null" and "not null" is both operands of AND is 
> not smart:
>       (("INN" IS NOT NULL) AND (("ID" >= 2068113880) OR ("INN" IS NULL)))
> 
> If I remove "is not null" condition, then plan is perfect:
> 
> explain select * from  people_raw where not ("ID"<2068113880) and 
> "ID"<=2068629726;
>                                           QUERY PLAN
> -------------------------------------------------------------------------------------------- 
> 
>   Index Scan using "People_pkey" on people_raw  (cost=0.58..196745.57 
> rows=586160 width=336)
>     Index Cond: (("ID" >= 2068113880) AND ("ID" <= 2068629726))
> (2 rows)
> 
> Before starting  investigation of the problem, I will like to know 
> opinion and may be some advise of people familiar with optimizer:
> how difficult will be to handle this case and where to look.
> 
> Thanks in advance,
> 

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                       WWW: http://www.sigaev.ru/


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS
Next
From: Tom Lane
Date:
Subject: Re: pgsql: Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.