Re: [PATCH] RFC: Add length parameterised dmetaphone functions - Mailing list pgsql-hackers

From Albe Laurenz
Subject Re: [PATCH] RFC: Add length parameterised dmetaphone functions
Date
Msg-id A737B7A37273E048B164557ADEF4A58B50FD0041@ntex2010a.host.magwien.gv.at
In response to [PATCH] RFC: Add length parameterised dmetaphone functions  (Christian Marie <christian@ponies.io>)
List pgsql-hackers
Christian Marie wrote:
> A developer I work with was trying to use dmetaphone to group people's names into
> equivalence classes. He found that many long names were grouped together
> when they shouldn't be. This turned out to be because dmetaphone has an
> undocumented upper bound of four on its output length. This is obviously
> impractical for many use cases.
> 
> This patch addresses this by adding and documenting an optional argument to
> dmetaphone and dmetaphone_alt that specifies the maximum output length. This
> makes it possible to use dmetaphone on much longer inputs.
> 
> Backwards compatibility is catered for by making the new argument optional,
> defaulting to the old, hard-coded value of four. We now have:
> 
>     dmetaphone(text source) returns text
>     dmetaphone(text source, int max_output_length) returns text
>     dmetaphone_alt(text source) returns text
>     dmetaphone_alt(text source, int max_output_length) returns text

I like the idea.

How about:

    dmetaphone(text source, int max_output_length DEFAULT 4) returns text
    dmetaphone_alt(text source, int max_output_length DEFAULT 4) returns text

That saves two functions and is self-documenting.
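
To illustrate how a DEFAULT argument lets one definition serve both call forms, here is a minimal sketch. Note that demo_code is a hypothetical stand-in written in SQL that merely truncates its upper-cased input; it is not dmetaphone and not the patch's C implementation, which would presumably be declared with LANGUAGE C instead. Actual CREATE FUNCTION syntax puts the parameter name before the type.

```sql
-- Hypothetical stand-in for illustration only: NOT the real dmetaphone.
-- One definition covers both the one-argument and two-argument calls.
CREATE FUNCTION demo_code(source text, max_output_length int DEFAULT 4)
RETURNS text
LANGUAGE sql
IMMUTABLE STRICT
AS $$
    -- Truncate the upper-cased input to the requested length.
    SELECT left(upper(source), max_output_length)
$$;

-- Uses the default of 4.
SELECT demo_code('unicorns');     -- UNIC

-- Overrides the default.
SELECT demo_code('unicorns', 8);  -- UNICORNS
```

Existing callers that pass only the source string keep the old behaviour, and the default is visible in the function signature itself.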

> +postgres=# select dmetaphone('unicorns');
> + dmetaphone
> +------------
> + ANKR
> +(1 row)
> +
> +postgres=# select dmetaphone('unicorns', 8);
> + dmetaphone
>  ------------
> - KMP
> + ANKRNS
>  (1 row)
>  </screen>
>   </sect2>

Yeah, "ponies" would have been too short...

Yours,
Laurenz Albe
