On Fri, Sep 02, 2022 at 10:06:54PM -0400, Tom Lane wrote:
> I think the distance limit of 5 is too loose though. I see that
> it accommodates examples like "passfile" for "password", which
> seems great at first glance; but it also allows fundamentally
> silly suggestions like "user" for "server" or "host" for "foo".
> We'd need something smarter than Levenshtein if we want to offer
> "passfile" for "password" without looking stupid on a whole lot
> of other cases --- those words seem close, but they are close
> semantically not textually.

Yeah, it's really only useful for simple misspellings, but IMO even that is
rather handy.

I noticed that the parse_relation.c stuff excludes matches where more than
half the characters are different, so I added that here and lowered the
distance limit to 4. This seems to prevent the silly suggestions (e.g.,
"host" for "foo") while retaining the more believable ones (e.g.,
"passfile" for "password"), at least for the small set of examples covered
in the tests.
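
For illustration, here is a rough Python sketch of the combined check. It is
not the patch itself (which is C), and the names levenshtein, is_plausible,
and MAX_DISTANCE are made up for this example. It also assumes unit edit
costs and that "half the characters" is measured against the string the user
typed; the real code may differ on both points.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance with unit cost for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,                # delete ca
                curr[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),   # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


MAX_DISTANCE = 4  # the lowered distance limit (hypothetical constant name)


def is_plausible(typed: str, candidate: str) -> bool:
    """Accept a suggestion only if it passes both filters."""
    d = levenshtein(typed, candidate)
    # Assumption: the "more than half the characters differ" rule is
    # measured against the length of the typed string.
    return d <= MAX_DISTANCE and d <= len(typed) // 2
```

Under those assumptions, is_plausible("password", "passfile") is true
(distance 4 of 8 characters), while is_plausible("foo", "host") and
is_plausible("server", "user") are both false (distance 3 of 3 and 4 of 6,
respectively).
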
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com