Re: Bad estimate with partial index - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Bad estimate with partial index
Date
Msg-id 2627347.1650402516@sss.pgh.pa.us
Whole thread Raw
In response to Re: Bad estimate with partial index  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bad estimate with partial index  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
List pgsql-hackers
I wrote:
> it looks like the problem is that the extended stats haven't been used
> while forming the estimate of the number of index entries retrieved, so
> we overestimate the cost of using this index.
> That seems like a bug.  Tomas?

I dug into this enough to locate the source of the problem.
btcostestimate includes the partial index clauses in what it
sends to clauselist_selectivity, but per the comments for
add_predicate_to_index_quals:

 * Note that indexQuals contains RestrictInfo nodes while the indpred
 * does not, so the output list will be mixed.  This is OK for both
 * predicate_implied_by() and clauselist_selectivity(), but might be
 * problematic if the result were passed to other things.

That comment was true when it was written, but it's been falsified
by the extended-stats patches, which have added a whole lot of logic
in and under clauselist_selectivity that ignores clauses that are not
RestrictInfos.

While we could perhaps fix this by having add_predicate_to_index_quals
add RestrictInfos, I'm inclined to feel that the extended-stats code
is in the wrong.  The contract for clauselist_selectivity has always
been that it could optimize if given RestrictInfos rather than bare
clauses, not that it would fail to work entirely without them.
There are probably more places besides add_predicate_to_index_quals
that are relying on that.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: using an end-of-recovery record in all cases
Next
From: Michael Paquier
Date:
Subject: Re: Estimating HugePages Requirements?