Re: Plan stability versus near-exact ties in cost estimates - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: Plan stability versus near-exact ties in cost estimates
Msg-id 4F90A591.8060608@nasby.net
In response to Plan stability versus near-exact ties in cost estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Plan stability versus near-exact ties in cost estimates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 4/19/12 5:39 PM, Tom Lane wrote:
> Now, neither of these fixes is perfect: what they would do is remove
> platform-specific instability from where the costs are basically equal
> and add some more in the range where the costs differ by almost exactly
> the fuzz factor.  But the behavior near that point is platform-specific
> already, just not quite as much, and it's surely something we're
> unlikely to trip over in the regression tests.

I can't help but think of complaints we get from users regarding plan stability, even though this is a case of taking
that to an extreme. Because this case is extreme (changing plans due to 1e-16 of difference) it's fairly easy to fix
this specific case. In order to get 9.2 out the door maybe fixing just this case is the right thing to do. But ISTM this
is just an example of a bigger problem.
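
To make the extreme case concrete, here is a minimal Python sketch (not planner code). It assumes the 1e-16 difference comes from floating-point roundoff, such as summing the same cost terms in a different order, and it uses a hypothetical 1% relative fuzz factor; the function name and the numbers are made up for illustration.

```python
def fuzzy_cheaper(cost_a, cost_b, fuzz=0.01):
    """Return True only if cost_a beats cost_b by more than the relative fuzz.

    The 1% default is an illustrative assumption, not a value taken from the planner.
    """
    return cost_a * (1.0 + fuzz) < cost_b


# The same three cost terms, summed in a different order.
cost_a = (0.1 + 0.2) + 0.3
cost_b = 0.1 + (0.2 + 0.3)

# An exact comparison sees two different costs...
exact_differs = cost_a != cost_b

# ...while the fuzzy comparison treats them as a tie in both directions.
fuzzy_tie = not fuzzy_cheaper(cost_a, cost_b) and not fuzzy_cheaper(cost_b, cost_a)

print(exact_differs, fuzzy_tie, abs(cost_a - cost_b))
```

An exact comparison would pick a winner based on roundoff alone, while the fuzzy comparison calls it a tie, which then needs some other tie-breaker.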
 

One of the complaints we've seen is that the planner will sometimes choose a plan that has a marginally lower cost
(where marginally in this case is significantly more than 1e-16 ;) even though that plan will perform really poorly if
the stats are off. I have wondered if that could be addressed by introducing the concept of an error range to each plan.
My idea is that each node would predict how much the cost estimate would change if the stats were off by some amount. If
two plans are close to the same cost, you would want to choose the plan that had the lower error range, trading off a
small amount of theoretical performance for less risk of getting a horrible plan if the stats assumptions proved to be
wrong.
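
A rough sketch of that selection rule in Python, purely to illustrate the idea. Everything here is hypothetical: the Plan fields, the 5% closeness threshold, and the example costs and error ranges are invented, and how a node would actually compute its error range is left open.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    cost: float         # estimated cost, taking the stats at face value
    error_range: float  # how much the cost could change if the stats are off


def choose_plan(plans, closeness=0.05):
    """Among plans within `closeness` of the cheapest cost, pick the lowest error range.

    The 5% default threshold is an illustrative assumption.
    """
    cheapest = min(plans, key=lambda p: p.cost)
    near = [p for p in plans if p.cost <= cheapest.cost * (1.0 + closeness)]
    # Prefer the lower error range; fall back to cost if error ranges tie.
    return min(near, key=lambda p: (p.error_range, p.cost))


# Invented numbers: the nested loop is marginally cheaper but far riskier.
plans = [
    Plan("nested loop", 100.0, 5000.0),
    Plan("hash join", 103.0, 50.0),
    Plan("merge join", 180.0, 10.0),
]
chosen = choose_plan(plans)
print(chosen.name)
```

With these made-up numbers the hash join wins: it is within 5% of the cheapest cost and has a much smaller error range, while the merge join is too expensive to be considered at all.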

I believe that would fix this specific case because even though two plans might come out with a nearly identical cost, it
is unlikely that they would also have a nearly identical error range.
 
-- 
Jim C. Nasby, Database Architect                   jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net

