Peter Geoghegan <pg@heroku.com> writes:
> On Thu, Nov 20, 2014 at 11:03 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> That does seem to give better results, but it still seems awfully
>> complicated. If we just used Levenshtein with all-default cost
>> factors and a distance cap equal to Max(strlen(what_user_typed),
>> strlen(candidate_match), 3), what cases that you think are important
>> would be harmed?
> Well, just by plugging in default Levenshtein cost factors, I can see
> the following regression:
> *** /home/pg/postgresql/src/test/regress/expected/join.out 2014-11-20 10:17:55.042291912 -0800
> --- /home/pg/postgresql/src/test/regress/results/join.out 2014-11-20 11:42:15.670108745 -0800
> ***************
> *** 3452,3458 ****
> ERROR: column atts.relid does not exist
> LINE 1: select atts.relid::regclass, s.* from pg_stats s join
> ^
> - HINT: Perhaps you meant to reference the column "atts"."indexrelid".

I do not have a problem with deciding that that is not a "regression";
in fact, not giving that hint seems like a good conservative behavior
here. By your logic, we should also be prepared to suggest
"supercalifragilisticexpialidocious" when the user enters "ocious".
It's simply a bridge too far for what is supposed to be a hint for
minor typos. You sound like you want to turn it into something that
will look up column names for people who are too lazy to even try to
type the right thing. While I can see the value of such a tool within
certain contexts, firing completed queries at a live SQL engine
is not one of them.

regards, tom lane
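
For reference, the two pairs discussed above can be checked with plain
Levenshtein distance using all-default (unit) costs. The following is only a
minimal sketch in Python, not PostgreSQL's C implementation; the function name
levenshtein() is hypothetical, and the sketch applies no distance cap.

```python
def levenshtein(source: str, target: str) -> int:
    """Edit distance with unit cost for insert, delete, and substitute."""
    # previous[j] holds the distance between the first i-1 characters of
    # source and the first j characters of target.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]  # turning i characters into "" takes i deletions
        for j, t_char in enumerate(target, start=1):
            current.append(min(
                previous[j] + 1,                       # delete s_char
                current[j - 1] + 1,                    # insert t_char
                previous[j - 1] + (s_char != t_char),  # substitute or match
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    pairs = [
        ("relid", "indexrelid"),
        ("ocious", "supercalifragilisticexpialidocious"),
    ]
    for typed, candidate in pairs:
        print(typed, candidate, levenshtein(typed, candidate))
```

In both pairs the typed string is a suffix of the candidate, so the distance
is simply the difference in length, and it is at least as large as the length
of what the user typed.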