Re: Doing better at HINTing an appropriate column within errorMissingColumn() - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date
Msg-id 53A0B315.3020205@agliodbs.com
In response to Doing better at HINTing an appropriate column within errorMissingColumn()  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers

On 06/17/2014 01:59 PM, Tom Lane wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Emitting a suggestion with a large distance seems like it could be
>> rather irritating.  If the user types in SELECT prodct_id FROM orders,
>> and that column does not exist, suggesting "product_id", if such a
>> column exists, will likely be well-received.  Suggesting a column
>> named, say, "price", however, will likely make at least some users say
>> "no I didn't mean that you stupid @%!#" - because probably the issue
>> there is that the user selected from the completely wrong table,
>> rather than getting 6 of the 9 characters they typed incorrect.
> 
> Yeah, that's my point exactly.  There's no very good reason to assume that
> the intended answer is in fact among the set of column names we can see;
> and if it *is* there, the Levenshtein distance to it isn't going to be
> all that large.  I think that suggesting "foobar" when the user typed
> "glorp" is not only not helpful, but makes us look like idiots.

Well, there are two different issues:

(1) Offering a suggestion which is too different from what the user
typed.  This is easily limited by having a max distance (most likely a
distance/length ratio, with a max of, say, 0.5).  The only drawbacks of
this would be the extra CPU cycles to calculate it, and some arguments
about what the max distance should be.  But for the sake of the
children, let's not have a GUC for it.

(2) If there are multiple columns with the same Levenshtein distance,
which one do you suggest?  The current code picks a random one, which
I'm OK with.  The other option would be to list all of the columns.
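
To make the two points concrete, here is a rough sketch in Python (not
the C code from the patch).  The names levenshtein() and
suggest_columns() are made up for illustration, and dividing the
distance by the length of the typed name is an assumption about what
"distance/length ratio" would mean.  It applies the 0.5 cutoff from
(1) and, for (2), returns every column tied at the smallest distance
rather than picking one:

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1,          # delete from a
                           cur[j - 1] + 1,       # insert into a
                           prev[j - 1] + cost))  # substitute or match
        prev = cur
    return prev[-1]


def suggest_columns(typed, candidates, max_ratio=0.5):
    """Return all candidates tied at the smallest distance, or [] if none
    is close enough.

    Assumption: the ratio is distance / len(typed); the 0.5 default is
    the "say, 0.5" figure from point (1), not a tuned value.
    """
    if not typed:
        return []
    scored = [(levenshtein(typed, c), c) for c in candidates]
    # Point (1): drop anything too far from what the user typed.
    close = [(d, c) for d, c in scored if d / len(typed) <= max_ratio]
    if not close:
        return []
    # Point (2): keep every column tied for the best distance.
    best = min(d for d, _ in close)
    return [c for d, c in close if d == best]
```

With this, suggest_columns("prodct_id", ["product_id", "price"]) keeps
only "product_id", while a lookup of "glorp" against ["foobar"] yields
no suggestion at all.  A caller that wants a single hint could take
the first element of the returned list.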

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com