On 03.09.22 06:30, Nathan Bossart wrote:
> On Fri, Sep 02, 2022 at 10:06:54PM -0400, Tom Lane wrote:
>> I think the distance limit of 5 is too loose though. I see that
>> it accommodates examples like "passfile" for "password", which
>> seems great at first glance; but it also allows fundamentally
>> silly suggestions like "user" for "server" or "host" for "foo".
>> We'd need something smarter than Levenshtein if we want to offer
>> "passfile" for "password" without looking stupid on a whole lot
>> of other cases --- those words seem close, but they are close
>> semantically not textually.
>
> Yeah, it's really only useful for simple misspellings, but IMO even that is
> rather handy.
>
> I noticed that the parse_relation.c stuff excludes matches where more than
> half the characters are different, so I added that here and lowered the
> distance limit to 4. This seems to prevent the silly suggestions (e.g.,
> "host" for "foo") while retaining the more believable ones (e.g.,
> "passfile" for "password"), at least for the small set of examples covered
> in the tests.

I think this code is compact enough and the hints it produces are
reasonable, so we could go with it.

I notice that for column misspellings, the hint is phrased "Perhaps you
meant X." whereas here we have "Did you mean X?". Let's make that uniform.
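
For anyone following along, here is a minimal standalone sketch of the
filter discussed above: take the closest valid option by Levenshtein
distance, cap the distance at 4, and drop the suggestion if more than half
the characters differ. This is not the code from the patch. The names
edit_distance, closest_option, MAX_DISTANCE and MAX_LEN are made up for
illustration, and measuring "half the characters" against the length of the
user's input is my assumption about how the rule is applied.

```c
#include <stddef.h>
#include <string.h>

#define MAX_DISTANCE 4		/* assumed cap, per the discussion above */
#define MAX_LEN 63			/* hypothetical bound to keep the sketch simple */

static int
min3(int a, int b, int c)
{
	int			m = (a < b) ? a : b;

	return (m < c) ? m : c;
}

/*
 * Plain Levenshtein distance with unit costs, computed with two rows.
 * Returns -1 if either string is longer than MAX_LEN.
 */
static int
edit_distance(const char *s, const char *t)
{
	size_t		m = strlen(s);
	size_t		n = strlen(t);
	int			prev[MAX_LEN + 1];
	int			cur[MAX_LEN + 1];

	if (m > MAX_LEN || n > MAX_LEN)
		return -1;

	for (size_t j = 0; j <= n; j++)
		prev[j] = (int) j;

	for (size_t i = 1; i <= m; i++)
	{
		cur[0] = (int) i;
		for (size_t j = 1; j <= n; j++)
		{
			int			cost = (s[i - 1] == t[j - 1]) ? 0 : 1;

			cur[j] = min3(prev[j] + 1,			/* deletion */
						  cur[j - 1] + 1,		/* insertion */
						  prev[j - 1] + cost);	/* substitution */
		}
		memcpy(prev, cur, (n + 1) * sizeof(int));
	}
	return prev[n];
}

/*
 * Return the closest valid option, or NULL if nothing is close enough.
 * A candidate is rejected if its distance exceeds MAX_DISTANCE or if more
 * than half of the input's characters would have to change.
 */
static const char *
closest_option(const char *input, const char *const *valid, size_t nvalid)
{
	const char *best = NULL;
	int			best_dist = MAX_DISTANCE + 1;
	int			half = (int) (strlen(input) / 2);

	for (size_t i = 0; i < nvalid; i++)
	{
		int			d = edit_distance(input, valid[i]);

		if (d >= 0 && d < best_dist)
		{
			best_dist = d;
			best = valid[i];
		}
	}

	if (best == NULL || best_dist > half)
		return NULL;
	return best;
}
```

With a valid-option list of "host", "user" and "passfile", this sketch
still offers "passfile" for "password" (distance 4, exactly half of 8
characters), but gives no hint for "foo" (closest is "host" at distance 3,
more than half of 3 characters) or for "server" (closest is "user" at
distance 4, more than half of 6 characters).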