Re: PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: PATCH: index-only scans with partial indexes |
Date | |
Msg-id | 55F6AF33.8000206@2ndquadrant.com Whole thread Raw |
In response to | Re: PATCH: index-only scans with partial indexes (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: PATCH: index-only scans with partial indexes
|
List | pgsql-hackers |
On 09/14/2015 12:51 PM, Kyotaro HORIGUCHI wrote: > I rethinked on this from the first. > >> Sorry. >> >>> Hi, this looks to be a bug of cost_index(). The attached patch >>> would fix that. >> >> No, that's wrong. please forget the patch. The qual in qpquals >> should be indexquals which is excluded because it is not >> necessary to be applied. The right way would be remove the cost >> for qpqual in cost_index(). > > Your patch allows index only scan even if a qual contains > non-index column when the qual can be removed by implication from > index predicates. > > So the 'implied' clauses is not needed ever after. It should be > excluded from cost estimation and it is not needed on execution > even if index only scan is found not to be doable finally. > > So the implicit quals may be removed on building index paths but > I think check_index_only is not the place. > > Removing implied quals from index quals is not only for index > *only* scan so the place for removing such quals is in > build_index_paths, in the loop of step 1. After removing the > quals there, check_index_only will naturally give disired result. > > # I remember that I have tried the same or similar thing. I don't > # recall what made me give up then. I don't think this is particularly related to the patch, because some of the anomalies can be observed even on master. For example, let's force the index scans to be non-IOS by requesting another column from the heap: create table t (a int, b int, c int); insert into t select i,i,i from generate_series(1,1000000) s(i); create index idx1 on t (a) where b between 300000 and 600000; create index idx2 on t (a,b) where b between 300000 and600000; analyze t; vacuum t; The indexes have exactly the same size (thanks to alignment of IndexTuples), and should have exactly the same statistics: test=# \di+ List of relations Schema | Name | Type | Owner | Table | Size | Description --------+------+-------+-------+-------+---------+------------- public | idx1 | index | user | t | 6600 kB | public| idx2 | index | user | t | 6600 kB | (2 rows) Now, let's see the query reading column 'c' (forcing heap fetches) explain select c from t where b between 300000 and 600000; QUERY PLAN ----------------------------------------------------------------------- Index Scan using idx1 on t (cost=0.42..10945.99rows=300971 width=4) (1 row) drop index idx1; set enable_bitmapscan = off; explain select c from t where b between 300000 and 600000; QUERY PLAN ----------------------------------------------------------------------- Index Scan using idx2 on t (cost=0.42..19688.08rows=300971 width=4) Index Cond: ((b >= 300000) AND (b <= 600000)) (2 rows) I claim that either both or none of the indexes should use "Index Cond". This is exactly the same reason that lead to the strange behavior after applying the patch, but in this case the heap access actually introduces some additional cost so the issue is not that obvious. But in reality the costs should be pretty much exactly the same - the indexes have exactly the same size, statistics, selectivity etc. Also, the plan difference suggests this is not merely a costing issue, because while with idx1 (first plan) it was correctly detected we don't need to evaluate the condition on the partial index, on idx2 that's not true and we'll waste time doing that. So we probably can't just tweak the costing a bit - this probably needs to be addressed when actually building the index path. regards -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: