Re: PATCH: Allow empty targets in unaccent dictionary - Mailing list pgsql-hackers

From Abhijit Menon-Sen
Subject Re: PATCH: Allow empty targets in unaccent dictionary
Msg-id 20140629114328.GA31670@toroid.org
In response to Re: PATCH: Allow empty targets in unaccent dictionary  (Abhijit Menon-Sen <ams@2ndQuadrant.com>)
Responses Re: PATCH: Allow empty targets in unaccent dictionary  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi.

I've attached a patch to contrib/unaccent as outlined in my review the
other day. I'm familiar with multiple languages in which modifiers are
separate characters (but not Arabic), so I decided to try a quick test
because I was curious.

I added a line containing only U+0940 (DEVANAGARI VOWEL SIGN II) to
unaccent.rules, and tried the following (the argument to unaccent is
U+0915 U+0940, and the result is U+0915):

    ams=# select unaccent('unaccent','की ');
     unaccent
    ----------
     क
    (1 row)

So the patch works fine: it correctly removes the modifier.
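For anyone who wants to see the intended behaviour without building the module, here is a rough Python model of what "empty target" means. It is only a sketch under stated assumptions, not the code in contrib/unaccent (which is C and organised differently): `parse_rules` and `apply_rules` are hypothetical names, and I assume a rules line is whitespace-separated, with a line holding only a source character meaning "replace with nothing".

```python
def parse_rules(text):
    """Parse rules text into a dict of source character -> replacement.

    Assumption: 'src trg' maps src to trg, and a line containing only
    'src' maps src to the empty string (the behaviour the patch allows).
    """
    rules = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        src = fields[0]
        trg = fields[1] if len(fields) > 1 else ""
        rules[src] = trg
    return rules


def apply_rules(rules, s):
    """Replace each character that has a rule; leave the rest untouched."""
    return "".join(rules.get(ch, ch) for ch in s)


# One ordinary rule (U+00E9 -> e) and one empty-target rule (U+0940 -> nothing).
RULES_TEXT = "\u00e9\te\n\u0940\n"
rules = parse_rules(RULES_TEXT)

# U+0915 U+0940 should come back as just U+0915.
result = apply_rules(rules, "\u0915\u0940")
print(result == "\u0915")
```

The model is per-character on purpose, since that is all the test above exercises.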

To add a test, however, it would be necessary to add this modifier to
unaccent.rules. But if we're adding one modifier to unaccent.rules, we
really should add them all. I have nowhere near the motivation needed to
add all the Devanagari modifiers, let alone any of the other languages I
know, and even if I did, it still wouldn't address Mohammad's use case.
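To give a feel for what "add them all" would involve for Devanagari alone, the sketch below lists the combining marks in the Devanagari block using Python's `unicodedata`. It assumes that "modifier" can be approximated by Unicode general categories Mn and Mc within U+0900..U+097F, which may not match what a given user wants stripped; `devanagari_marks` is a hypothetical helper name.

```python
import unicodedata


def devanagari_marks():
    """Return the combining marks (categories Mn, Mc) in U+0900..U+097F.

    Assumption: these categories are a reasonable stand-in for the
    'modifiers' discussed here; that is a judgment call, not a rule.
    """
    marks = []
    for cp in range(0x0900, 0x0980):
        ch = chr(cp)
        if unicodedata.category(ch) in ("Mn", "Mc"):
            marks.append(ch)
    return marks


# Print one candidate per line, as code point plus Unicode name.
for ch in devanagari_marks():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}")
```

The exact list depends on the Unicode version bundled with the Python in use, and each entry would still need a decision about whether stripping it is desirable.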

(As a separate matter, it's not clear to me if stripping these modifiers
using unaccent is something everyone will want to do.)

So, though I'm not fond of saying it, perhaps the right thing to do is
to forget my earlier objection (that the patch didn't have tests), and
just commit as-is. It's a pretty straightforward patch, and it works.

I'm setting this as ready for committer.

-- अभजत "unaccented in three languages" മനന-সন
