Re: Doing better at HINTing an appropriate column within errorMissingColumn() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date
Msg-id 11288.1418663072@sss.pgh.pa.us
In response to Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Stephen Frost <sfrost@snowman.net>)
List pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Sun, Dec 14, 2014 at 8:24 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> Moving this patch to CF 2014-12 as work is still going on, note that
>> it is currently marked with Robert as reviewer and that its current
>> status is "Needs review".

> The status here is more like "waiting around to see if anyone else has
> an opinion".  The issue is what should happen when you enter a qualified
> name like alvaro.herrera and there is no column named anything like
> herrera in the RTE named alvaro, but there is some OTHER RTE that
> contains a column with a name that is only a small Levenshtein
> distance away from herrera, like roberto.correra.  The questions are:

> 1. Should we EVER give a you-might-have-meant hint in a case like this?
> 2. If so, does it matter whether the RTE name is just a bit different
> from the actual RTE or whether it's completely different?  In other
> words, might we skip the hint in the above case but give one for
> alvara.correra?

It would be astonishingly silly to not care about the RTE name's distance,
if you ask me.  This is supposed to detect typos, not thinkos.

I think there might be some value in a separate heuristic that, when
you typed foo.bar and that doesn't match but there is a baz.bar, suggests
that maybe you meant baz.bar, even if baz is not close typo-wise.  This
would be addressing the thinko case not the typo case, so the rules ought
to be quite different --- in particular I doubt that it'd be a good idea
to hint this way if the column names don't match exactly.  But in any
case the key point is that this is a different heuristic addressing a
different failure mode.  We should not try to make the
Levenshtein-distance heuristic address that case.
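
To make the "thinko" heuristic concrete, here is a minimal sketch in Python
(not the patch's C code).  The function name exact_column_hints and the
dict-of-lists representation of the RTEs are assumptions made purely for
illustration.  It suggests another RTE only when the column name matches
exactly, and it ignores how close the RTE names are to each other.

```python
def exact_column_hints(qualifier, column, rtes):
    """Return "alias.column" suggestions for a failed lookup of qualifier.column.

    rtes is a hypothetical mapping of RTE alias -> list of column names.
    Only exact column-name matches in *other* RTEs are suggested; typo
    distance plays no part in this heuristic.
    """
    return [
        f"{alias}.{column}"
        for alias, columns in rtes.items()
        if alias != qualifier and column in columns
    ]


if __name__ == "__main__":
    rtes = {"foo": ["id"], "baz": ["bar", "id"]}
    # foo.bar does not exist, but baz.bar does, so baz.bar is suggested.
    print(exact_column_hints("foo", "bar", rtes))
```

Requiring an exact column match keeps this separate from the typo
heuristic: a lookup that is wrong in both the RTE name and the column name
gets no suggestion from this path.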

So my two cents is that when considering a qualified name, this patch
should weight Levenshtein distance equally across the two components.
There's no good reason to suppose that typos will attack one name
component more (or less) than the other.
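
As an illustration of treating both components equally, the sketch below
simply sums the Levenshtein distances of the RTE name and the column name.
It is written in Python rather than the patch's C, and the helper names
(levenshtein, qualified_distance, best_hint), the dict-of-lists RTE
representation, and the max_distance cutoff of 3 are all assumptions, not
values taken from the patch.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def qualified_distance(typed, candidate):
    """Sum the distances of both name components, weighting them equally."""
    (typed_rte, typed_col), (cand_rte, cand_col) = typed, candidate
    return levenshtein(typed_rte, cand_rte) + levenshtein(typed_col, cand_col)


def best_hint(typed, rtes, max_distance=3):
    """Return the closest "alias.column", or None if nothing is close enough.

    max_distance is an arbitrary illustrative cutoff.
    """
    best = None
    for alias, columns in rtes.items():
        for col in columns:
            d = qualified_distance(typed, (alias, col))
            if d <= max_distance and (best is None or d < best[0]):
                best = (d, f"{alias}.{col}")
    return best[1] if best else None


if __name__ == "__main__":
    rtes = {"alvaro": ["herrera"], "roberto": ["correra"]}
    print(best_hint(("alvaro", "herrara"), rtes))
    print(best_hint(("alvaro", "herrera"), {"roberto": ["correra"]}))
```

Under this scoring, alvaro.herrera against roberto.correra costs 5 for the
RTE name plus 2 for the column name, so a completely different RTE name
rules the hint out even though the column names are close.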
        regards, tom lane


