Re: PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: PATCH: index-only scans with partial indexes |
Date | |
Msg-id | 20150917.194553.172396487.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: PATCH: index-only scans with partial indexes (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: PATCH: index-only scans with partial indexes
|
List | pgsql-hackers |
Hi, sorry in advance for hardly readable long descriptions.. At Mon, 14 Sep 2015 13:27:47 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in <55F6AF33.8000206@2ndquadrant.com> > I don't think this is particularly related to the patch, because some > of the anomalies can be observed even on master. For example, let's > force the index scans to be non-IOS by requesting another column from > the heap: ... > I claim that either both or none of the indexes should use "Index > Cond". Perhaps I understood your point and I think I understood this. Planner dicides whether to use the partial index far before creating specific paths, when measuring baserel sizes. If some partial index is implied by the restriction list, check_partial_indexes() marks the index as predOk, which means that the index is usable for the baserel scan even if the restriction clauses doesn't match the index columns. Then create_index_paths allows to generating paths for partial indexes with predOK. Index clauses of the created paths for the partial indexes don't contain clauses doesn't match the index columns. It is empty for your first example and it is the same with restrictinfo for the second. Finally, create_indexscan_plan strips baserestriction caluses implied by index predicate in addtion to index conditions. So both of your example has no filter conditions. The following example would show you the result of the above steps. explain select c from t where b between 300000 + 1 and 600000 and c = 3; QUERY PLAN ------------------------------------------------------------------Index Scan using idx1 on t (cost=0.42..11665.77 rows=1width=4) Filter: ((b >= 300001) AND (c = 3)) The index predicate (b >= 300000 AND b <= 600000) implies the restrction (b >= 300001 AND b <= 600000) so idx1 is allowed to be used, then conversely the restriction b >= 300001 is implied by the index predicate b >= 300000 so it is not shown as Index Cond and the other two are not, so they are shown and executed as Filter. But regardless of whether stripped as implied conditions or not at planning phase, the cost for all clauses that don't match the index columns are added when creating index paths. That will be one of the cause of cost error. > This is exactly the same reason that lead to the strange behavior > after applying the patch, but in this case the heap access actually > introduces some additional cost so the issue is not that obvious. So you're right on the point that check_index_only is doing wrong. It should properly ignore restrictions implied by index predicates as your patch is doing. But cost_index doesn't know that some nonindex-conditions of baserestrictinfo is finally useless, and it is assuming that nonindex conditions are always applied on heap tuples. After all, what should be done to properly ignore useless conditions would be, 1. Make create_index_paths() to make the list of restrict clauses which are implied by the index predicate of the index in focus. The list would be stored as a member in IndexOptInfo. Then create index clauses excluding the listed clauses and call get_index_paths using them. Modify check_index_only() to ignore the listed clauses when pulling varattnos.This is similar but different a bit to what I said in the previous mail. 2. Pass the listed clauses to generated IndexPath. 3. Modify extract_nonindex_conditions to be called with the exclusion list and the cluases are exluded from the resultof the function. 4. Make create_indexscan_plan to extract qpqual excluding the exclusion list. The same result could be obtained by more smarter way but the steps above will archive right decision on whether to do index only scan on partial index and correct cost estimate propery ignoring unused restrictions. Does this make sense for you? > But in reality the costs should be pretty much exactly the same - the > indexes have exactly the same size, statistics, selectivity etc. > > Also, the plan difference suggests this is not merely a costing issue, > because while with idx1 (first plan) it was correctly detected we > don't need to evaluate the condition on the partial index, on idx2 > that's not true and we'll waste time doing that. So we probably can't > just tweak the costing a bit - this probably needs to be addressed > when actually building the index path. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: