Re: [HACKERS] Extra Vietnamese unaccent rules - Mailing list pgsql-hackers

From Dang Minh Huong
Subject Re: [HACKERS] Extra Vietnamese unaccent rules
Date
Msg-id 69AE3AFD-BA0B-4A41-B32C-BA62CF7C70DB@gmail.com
In response to Re: [HACKERS] Extra Vietnamese unaccent rules  (Thomas Munro <thomas.munro@enterprisedb.com>)
Responses Re: [HACKERS] Extra Vietnamese unaccent rules  (Dang Minh Huong <kakalot49@gmail.com>)
List pgsql-hackers

On May 29, 2017, at 10:47, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

On Sun, May 28, 2017 at 7:55 PM, Dang Minh Huong <kakalot49@gmail.com> wrote:
Thanks for the report and the lecture about Unicode.
I have attached a patch following Thomas's instructions. Could you confirm it?

-           is_plain_letter(table[codepoint.combining_ids[0]]) and \
+           (is_plain_letter(table[codepoint.combining_ids[0]]) or\
+            len(table[codepoint.combining_ids[0]].combining_ids) > 1) and \

Shouldn't you use "or is_letter_with_marks()", instead of
"or len(...) > 1"?  Your test might catch something that isn't based on a 'letter'
(according to is_plain_letter).  Otherwise this looks pretty good to
me.  Please add it to the next commitfest.
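
To illustrate the suggestion above, here is a minimal, self-contained sketch of a recursive "letter with marks" test. It is not the code from the patch or from the rules generator script: it uses Python's standard unicodedata module in place of the script's parsed table, and the helpers decomposition_ids() and plain_letter(), as well as this definition of is_plain_letter(), are assumptions made for the example.

```python
import unicodedata


def decomposition_ids(ch):
    """Return the canonical decomposition of ch as code points, or []."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility decompositions start with a tag such as "<compat>";
    # this sketch ignores them.
    if not fields or fields[0].startswith("<"):
        return []
    return [int(f, 16) for f in fields]


def is_plain_letter(ch):
    # Assumption for this sketch: a cased letter with no decomposition.
    return unicodedata.category(ch) in ("Lu", "Ll") and not decomposition_ids(ch)


def is_mark(ch):
    return unicodedata.category(ch).startswith("M")


def is_letter_with_marks(ch):
    """True if ch decomposes to a letter-based character plus marks."""
    ids = decomposition_ids(ch)
    if len(ids) < 2:
        return False
    base = chr(ids[0])
    if not all(is_mark(chr(i)) for i in ids[1:]):
        return False
    # Recurse so that a base which itself carries a mark is accepted,
    # while a base that is not built on a letter is rejected.
    return is_plain_letter(base) or is_letter_with_marks(base)


def plain_letter(ch):
    """Strip marks level by level until a plain letter remains."""
    while is_letter_with_marks(ch):
        ch = chr(decomposition_ids(ch)[0])
    return ch
```

With this sketch, U+1EBF (LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE) reduces first to U+00EA and then to "e", whereas U+2260 (NOT EQUAL TO), which decomposes to "=" plus a combining mark, is rejected because its base is not a letter.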

Thanks for confirming, sir.
I will add it to the next CF soon.

I expect that some users in Vietnam will consider this to be a bugfix,
which raises the question of whether to backpatch it.  Perhaps we
could consider fixing it for 10.  Then users of older versions could
grab the rules file from 10 to use with 9.whatever if they want to do
that and reindex their data as appropriate.

I am also inclined toward fixing it for 10, because it will not affect current users.
But do you want it back-patched to all supported versions, Kha Nguyen?
# I would also like to note that Vietnamese characters are not the only ones missing from the rule list.


---
Thanks and best regards,
Dang Minh Huong
