Christian Marie wrote:
> A developer I work with was trying to use dmetaphone to group people names into
> equivalence classes. He found that many long names would be grouped together
> when they shouldn't be. This turned out to be because dmetaphone has an
> undocumented upper bound of four on its output length. This is obviously
> impractical for many use cases.
>
> This patch addresses this by adding and documenting an optional argument to
> dmetaphone and dmetaphone_alt that specifies the maximum output length. This
> makes it possible to use dmetaphone on much longer inputs.
>
> Backwards compatibility is catered for by making the new argument optional,
> defaulting to the old, hard-coded value of four. We now have:
>
> dmetaphone(text source) returns text
> dmetaphone(text source, int max_output_length) returns text
> dmetaphone_alt(text source) returns text
> dmetaphone_alt(text source, int max_output_length) returns text
I like the idea.
How about:

dmetaphone(text source, int max_output_length DEFAULT 4) returns text
dmetaphone_alt(text source, int max_output_length DEFAULT 4) returns text
Saves two functions and is self-documenting.
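
To illustrate how a single declaration with a DEFAULT covers both call forms, here is a minimal sketch. It uses a hypothetical stand-in function, demo_code, which is not part of fuzzystrmatch and does no phonetic encoding; it only upper-cases and truncates its input so that the effect of the default is visible.

```sql
-- Hypothetical stand-in, NOT the real dmetaphone: it only upper-cases and
-- truncates, to show how one signature with DEFAULT serves both call forms.
CREATE FUNCTION demo_code(source text, max_output_length int DEFAULT 4)
RETURNS text
LANGUAGE sql IMMUTABLE
AS $$ SELECT left(upper(source), max_output_length) $$;

-- One-argument call: the default of 4 applies.
SELECT demo_code('unicorns');     -- UNIC

-- Two-argument call: the caller overrides the limit.
SELECT demo_code('unicorns', 8);  -- UNICORNS
```

Existing one-argument callers would keep working unchanged, and \df would show the default right in the argument list.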
> +postgres=# select dmetaphone('unicorns');
> + dmetaphone
> +------------
> + ANKR
> +(1 row)
> +
> +postgres=# select dmetaphone('unicorns', 8);
> + dmetaphone
> ------------
> - KMP
> + ANKRNS
> (1 row)
> </screen>
> </sect2>
Yeah, "ponies" would have been too short...
Yours,
Laurenz Albe