Re: a wrong index choose when statistics is out of date - Mailing list pgsql-hackers
From | Andy Fan |
---|---|
Subject | Re: a wrong index choose when statistics is out of date |
Date | |
Msg-id | 87le6xb5j9.fsf@163.com Whole thread Raw |
In response to | Re: a wrong index choose when statistics is out of date (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: a wrong index choose when statistics is out of date
|
List | pgsql-hackers |
David Rowley <dgrowleyml@gmail.com> writes: > On Tue, 5 Mar 2024 at 00:37, Andy Fan <zhihuifan1213@163.com> wrote: >> >> David Rowley <dgrowleyml@gmail.com> writes: >> > I don't think it would be right to fudge the costs in any way, but I >> > think the risk factor for IndexPaths could take into account the >> > number of unmatched index clauses and increment the risk factor, or >> > "certainty_factor" as it is currently in my brain-based design. That >> > way add_path() would be more likely to prefer the index that matches >> > the most conditions. >> >> This is somehow similar with my proposal at [1]? What do you think >> about the treat 'col op const' as 'col op $1' for the marked column? >> This could just resolve a subset of questions in your mind, but the >> method looks have a solid reason. > > Do you mean this? Yes, it is not cautious enough to say "similar" too quick. After reading your opinion again, I think what you are trying to do is adding one more dimension to Path compared with the existing cost and pathkey information and it would take effects on add_path stage. That is impressive, and I'm pretty willing to do more testing once the v1 is done. I just noted you have expressed your idea about my proposal 1, > We should do anything like add column options in the meantime. Those > are hard to remove once added. I will try it very soon. and I'm a bit of upset no one care about my proposal 2 which is the AI method, I see many companies want to introduce AI to planner even I don't seen any impressive success, but this user case looks like a candidate. >> + /* >> + * To make the planner more robust to handle some inaccurate statistics >> + * issue, we will add a extra cost to qpquals so that the less qpquals >> + * the lower cost it has. >> + */ >> + cpu_run_cost += 0.01 * list_length(qpquals); > > I don't think it's a good idea to add cost penalties like you proposed > there. This is what I meant by "I don't think it would be right to > fudge the costs in any way". > > If you modify the costs to add some small penalty so that the planner > is more likely to favour some other plan, what happens if we then > decide the other plan has some issue and we want to penalise that for > some other reason? Adding the 2nd penalty might result in the original > plan choice again. Which one should be penalised more? I think the > uncertainty needs to be tracked separately. > > Fudging the costs like this is also unlikely to play nicely with > add_path's use of STD_FUZZ_FACTOR. There'd be an incentive to do > things like total_cost *= STD_FUZZ_FACTOR; to ensure we get a large > enough penalty. I agree and I just misunderstood your proposal yesterday. -- Best Regards Andy Fan
pgsql-hackers by date: