On 30/06/18 03:03, Tomas Vondra wrote:
> On 06/29/2018 04:51 PM, Teodor Sigaev wrote:
>>
>>>> I tried to attack the cost_sort() issues and hope on that basis we
>>>> can solve problems with 0002 patch and improve incremental sort patch.
>>>>
>>>
>>> OK, will do. Thanks for working on this!
>>
>> I hope we now have a better cost_sort(). The obvious approach is to
>> try all combinations of pathkeys in get_cheapest_group_keys_order()
>> and choose the cheapest one by cost_sort().
>
>> But that requires N! operations, which could be very expensive for a
>> large number of pathkeys, and it doesn't solve the issue with
>> user-knows-what-he-does pathkeys.
>
> Not sure. There are N! combinations, but this seems like a good
> candidate for backtracking [1]. You don't have to enumerate and
> evaluate all N! combinations, just construct one and then abandon
> whole classes of combinations as soon as they get more expensive than
> the currently best one. That's possible thanks to the additive nature
> of the comparison costing: appending a column to the sort key can only
> make it more expensive. My guess is this will make this a non-issue.
>
> [1] https://en.wikipedia.org/wiki/Backtracking
>
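To illustrate, here is a minimal Python sketch of that pruned search. The
cost function is a stand-in for cost_sort() (it just takes the running
cost of a prefix plus the increment for one more column); the names are
made up for the example, not taken from the patch:

```python
def best_order(columns, col_cost):
    """Branch-and-bound over column orderings.

    col_cost(prefix, col) returns the incremental cost of appending
    `col` to the already-chosen `prefix`.  Because the total is a sum
    of such increments, the cost of a partial ordering only grows as
    columns are appended, so any prefix already costlier than the best
    complete ordering can be abandoned without enumerating its N!
    completions.
    """
    best = {"cost": float("inf"), "order": None}

    def extend(prefix, remaining, cost):
        # Prune: appending more columns can only increase the cost.
        if cost >= best["cost"]:
            return
        if not remaining:
            best["cost"], best["order"] = cost, prefix
            return
        for col in remaining:
            extend(prefix + [col], remaining - {col},
                   cost + col_cost(prefix, col))

    extend([], frozenset(columns), 0.0)
    return best["order"], best["cost"]
```

With a toy cost where expensive columns should sort first, the search
finds the optimum while skipping most of the N! orderings.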
>>
>> We could suggest an order of pathkeys as the patch does now, and if
>> the cost_sort() estimate is less than 80% (arbitrarily chosen) of the
>> cost of the user-suggested pathkeys, use ours, else the user's.
>>
>
> I really despise such arbitrary thresholds. I'd much rather use a more
> reliable heuristic by default, even if it gets it wrong in some cases
> (which it will, but that's natural).
>
> regards
>
Additionally, how about putting an upper limit on the number of
combinations to check, fairly large by default?
If that limit is exceeded, we could consider checking a few more
orderings selected at random from those not yet checked, to avoid any
bias caused by cutting off a systematic search partway through. This
might prove important when N! is fairly large.
Cheers,
Gavin