Re: PATCH: index-only scans with partial indexes - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: PATCH: index-only scans with partial indexes
Date
Msg-id 20150914.190901.175666570.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: PATCH: index-only scans with partial indexes  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: PATCH: index-only scans with partial indexes
List pgsql-hackers
Hi, this looks to be a bug of cost_index(). The attached patch
would fix that.

=====
The following part in cost_index,

> cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
> 
> run_cost += cpu_per_tuple * tuples_fetched;

Adds, *cpu_tuple_cost* (which is 0.01) + qpqual_cost.per_tuple
(0.0025) per tuple even they are index tuples. On the other hand
getnericcostestimate adds the following value for the same deed.

> indexTotalCost += numIndexTuples * num_sa_scans * (cpu_index_tuple_cost + qual_op_cost);

cpu_index_tuple_cost is 0.005, just a half of cpu_tuple cost as
default. I think this should be the culprit of the difference.

For confirmation, setting cpu_tuple_cost to 0.05 to equate with
cpu_index_tuple_cost and the oppisit makes the estimate for both
indexes the same value.


set cpu_tuple_cost to 0.005;
explain select a from t where b < 300000;                               QUERY PLAN                                 
---------------------------------------------------------------------------Index Only Scan using idx2 on t
(cost=0.42..7022.06rows=297876 width=4)  Index Cond: (b < 300000)
 
(2 rows)

explain select a from t where b < 300000;                               QUERY PLAN                                 
---------------------------------------------------------------------------Index Only Scan using idx1 on t
(cost=0.42..7022.66rows=297876 width=4)
 
(1 row)

This should be a bug.  The attached patch would fix this and
perhaps costs for all of your examples should match except for
errors of double precision. I think it is enough since
IndexOnlyScan may not have quals on columns out of the index in
focus so qpquals should be index quals.

regards,


At Mon, 14 Sep 2015 10:00:24 +0200, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote in
<55F67E98.5050904@2ndquadrant.com>
> On 09/14/2015 09:35 AM, Kyotaro HORIGUCHI wrote:
> > Hi,
> ,,,
> >> Which is exactly the difference between costs from amcostestimate
> >>
> >> idx1: 4769.115000 + 0.015 * 297823 = 9236.460000
> >> idx2: 6258.230000 + 0.010 * 297823 = 9236.460000
> >
> > These calculations are exactly right, but you overlooked the
> > breakedown of indexTotalCost for idx2.
> >
> >> Sppoky! Although it seems like a mere coincidence, thanks to the nice
> >> round numbers of tuples in the table, and lucky choice of two
> >> conditions.
> >
> > As said above, it is not a conincidence. The exactly same
> > calculation about baserestrictinfo is simply calculated in
> > different places, cost_index for the former and
> > btcostestiamte(genericcostestimate) for the latter.
> 
> By "coincidence" I meant that we happened to choose such a number of
> conditions in the index predicate & query that this perfect match is
> possible. Apparently there are two places that manipulate the costs
> and in this particular case happen to perfectly compensate the
> effects.

Ok, I understood.

> As demonstrated by the example with a single condition, the costs may
> actually differ for different numbers of clauses (e.g. using a single
> clause makes the wider index - unexpectedly - cheaper).
> 
> >
> > We should properly ignore or remove the implicitly-applied quals
> > for partial indexes on cost estimation.
> 
> Probably. So far I've traced the difference to build_index_paths()
> where we build index_clauses by iterating over index columns - the
> smaller index does not have the column from the predicate, so we don't
> add the clause. I'm not particularly familiar with this part of the
> code, so I wonder where's the best place to fix this, though.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index d107d76..d354dc2 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -516,7 +516,12 @@ cost_index(IndexPath *path, PlannerInfo *root, double loop_count)    cost_qual_eval(&qpqual_cost,
qpquals,root);    startup_cost += qpqual_cost.startup;
 
-    cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;
+
+    /* Indexonly scan applies this qual on index tuples */
+    if (indexonly)
+        cpu_per_tuple = cpu_index_tuple_cost + qpqual_cost.per_tuple;
+    else
+        cpu_per_tuple = cpu_tuple_cost + qpqual_cost.per_tuple;    run_cost += cpu_per_tuple * tuples_fetched;

pgsql-hackers by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: Review: check existency of table for -t option (pg_dump) when pattern...
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: PATCH: index-only scans with partial indexes