Re: Make the qual cost on index Filter slightly higher than qual coston index Cond. - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Make the qual cost on index Filter slightly higher than qual coston index Cond.
Date
Msg-id CAKU4AWre9=+iFdfhsJy-6wJn8OTdNACfbuWntj6J2vsG6VcHcQ@mail.gmail.com
Whole thread Raw
In response to Re: Make the qual cost on index Filter slightly higher than qual coston index Cond.  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List pgsql-hackers


On Fri, May 29, 2020 at 9:37 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Fri, May 29, 2020 at 6:40 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
>
>>
>> >so we need to optimize the cost model for such case, the method is the
>> >patch I mentioned above.
>>
>> Making the planner more robust w.r.t. to estimation errors is nice, but
>> I wouldn't go as far saying we should optimize for such cases. The stats
>> can be arbitrarily off, so should we expect the error to be 10%, 100% or
>> 1000000%?
>
>
> I don't think my patch relay on anything like that.   My patch doesn't fix the
> statistics issue,  just adding the extra cost on qual cost on Index Filter part.
> Assume the query pattern are where col1= X and col2 = Y. The impacts are :
> 1).  Make the cost of (col1, other_column) is higher than (col1, col2)
> 2). The relationship between seqscan and index scan on index (col1, other_column)
> is changed, (this is something I don't want).  However my cost difference between
> index scan & seq scan usually very huge, so the change above should has
> nearly no impact on that choice.   3). Make the cost higher index scan for
> Index (col1) only.  Overall I think nothing will make thing worse.

When the statistics is almost correct (or better than what you have in
your example), the index which does not cover all the columns in all
the conditions will be expensive anyways because of extra cost to
access heap for the extra rows not filtered by that index. An index
covering all the conditions would have its scan cost cheaper since
there will be fewer rows and hence fewer heap page accesses because of
more filtering. So I don't think we need any change in the current
costing model.
 
Thank you for your reply.  Looks you comments is based on the statistics
is almost correct (or better than what I have in my example),  That is true. 
However my goal is to figure out a way which can generate better plan even
the statistics is not correct (the statistics with such issue is not very uncommon,
I just run into one such case and spend 1 week to handle some non-technology 
stuff after that).   I think the current issue is even my patch can make the worst case
better, we need to make sure the average performance not worse. 

--
Best Regards
Andy Fan

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: pg_dump fail to not dump public schema orders
Next
From: Andy Fan
Date:
Subject: Re: Make the qual cost on index Filter slightly higher than qual coston index Cond.