Re: Doing better at HINTing an appropriate column within errorMissingColumn() - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Doing better at HINTing an appropriate column within errorMissingColumn()
Date
Msg-id CA+TgmoYsSC0Wa1b3tLKPJ-3Ew=98JFsQgheGYc49p2TU36sMFQ@mail.gmail.com
Whole thread Raw
In response to Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Peter Geoghegan <pg@heroku.com>)
Responses Re: Doing better at HINTing an appropriate column within errorMissingColumn()  (Peter Geoghegan <pg@heroku.com>)
List pgsql-hackers
On Sat, Dec 20, 2014 at 7:30 PM, Peter Geoghegan <pg@heroku.com> wrote:
> On Sat, Dec 20, 2014 at 3:17 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> Attached patch implements this scheme.
>
> I had another thought: "NAMEDATALEN + 1" is a better representation of
> "infinity" for matching purposes than INT_MAX. I probably should have
> made that change, too. It would then not have been necessary to
> "#include <limits.h>". I think that this is a useful
> belt-and-suspenders precaution against integer overflow. It almost
> certainly won't matter, since it's very unlikely that the best match
> within an RTE will end up being a dropped column, but we might as well
> do it that way (Levenshtein distance is costed in multiples of code
> point changes, but the maximum density is 1 byte per codepoint).

Good idea.

Looking over the latest patch, I think we could simplify the code so
that you don't need multiple FuzzyAttrMatchState objects.  Instead of
creating a separate one for each RTE and then merging them, just have
one.  When you find an inexact-RTE name match, set a field inside the
FuzzyAttrMatchState -- maybe with a name like rte_penalty -- to the
Levenshtein distance between the RTEs.  Then call scanRTEForColumn()
and pass down the same state object.  Now let
updateFuzzyAttrMatchState() work out what it needs to do based on the
observed inter-column distance and the currently-in-force RTE penalty.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Table-level log_autovacuum_min_duration
Next
From: Michael Paquier
Date:
Subject: Re: moving from contrib to bin