On Mon, Jun 16, 2014 at 8:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Not having looked at the patch, but: I think the probability of
> useless-noise HINTs could be substantially reduced if the code prints a
> HINT only when there is a single available alternative that is clearly
> better than the others in Levenshtein distance. I'm not sure how much
> better is "clearly better", but I exclude "zero" from that. I see that
> the original description of the patch says that it will arbitrarily
> choose one alternative when there are several with equal Levenshtein
> distance, and I'd say that's a bad idea.
I disagree. I think that making some guess is better than making no
guess at all here, given that there aren't too many possibilities to
choose from. It would be particularly annoying to show no suggestion
for a would-be ambiguous column reference where the column name is
itself wrong, since both mistakes are common. For example, "order_id"
was specified instead of either "o.orderid" or "ol.orderid", as in my
original examples. If a correct alias was specified, the new code
would prefer the appropriate Var. But an alias might not be specified
at all, in which case the choice between the two is arbitrary, and
that should be okay in my view.
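
To make the two policies concrete, here is a rough Python sketch. It is
not the patch's C code: the function names, the "margin" parameter, and
the way an alias is used purely as a tie-breaker are all assumptions
made for illustration.

```python
def levenshtein(a, b):
    """Plain edit distance: insert, delete and substitute each cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute / match
        prev = curr
    return prev[-1]


def rank(misspelled, candidates, alias=None):
    """Sort (alias, column) candidates by distance to the misspelled name.

    Assumption: a matching alias only breaks ties between equally
    distant columns. The sort is stable, so remaining ties keep the
    order in which the candidates were supplied.
    """
    scored = []
    for cand_alias, column in candidates:
        dist = levenshtein(misspelled, column)
        alias_miss = 0 if alias is not None and alias == cand_alias else 1
        scored.append((dist, alias_miss, f"{cand_alias}.{column}"))
    return sorted(scored, key=lambda s: (s[0], s[1]))


def hint_always(misspelled, candidates, alias=None):
    """Always guess: return the best-ranked candidate, even on a tie."""
    ranked = rank(misspelled, candidates, alias)
    return ranked[0][2] if ranked else None


def hint_unique_best(misspelled, candidates, margin=1):
    """Only hint when the best candidate beats the runner-up by at
    least `margin` edits; `margin` is hypothetical and must be >= 1,
    since "zero" is excluded from "clearly better"."""
    ranked = rank(misspelled, candidates)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[1][0] - ranked[0][0] < margin:
        return None
    return ranked[0][2]


# The example from this thread, plus one unrelated column.
candidates = [("o", "orderid"), ("ol", "orderid"), ("o", "customerid")]
guess = hint_always("order_id", candidates)
guess_with_alias = hint_always("order_id", candidates, alias="ol")
strict = hint_unique_best("order_id", candidates)
```

With these candidates, "o.orderid" and "ol.orderid" are both one edit
away from "order_id". The always-guess policy picks one of them (the
"ol" one if that alias was given), while the unique-best policy shows
no HINT at all, which is the case I am objecting to.
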
I'm not trying to remove the need for human judgement here. We've all
heard stories about people who did things like input "Portland" into
their GPS only to end up in Maine rather than Oregon, but I think in
general you can only go so far in worrying about those cases.
--
Peter Geoghegan