Re: Mac OS: invalid byte sequence for encoding "UTF8" - Mailing list pgsql-hackers

From Artur Zakirov
Subject Re: Mac OS: invalid byte sequence for encoding "UTF8"
Msg-id 56AA28BD.7080108@postgrespro.ru
In response to Re: Mac OS: invalid byte sequence for encoding "UTF8"  (Artur Zakirov <a.zakirov@postgrespro.ru>)
Responses Re: Mac OS: invalid byte sequence for encoding "UTF8"  (Artur Zakirov <a.zakirov@postgrespro.ru>)
Re: Mac OS: invalid byte sequence for encoding "UTF8"  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 27.01.2016 15:28, Artur Zakirov wrote:
> On 27.01.2016 14:14, Stas Kelvich wrote:
>> Hi.
>>
>> I tried that and can confirm the strange behaviour. The problem seems
>> to be with the small Cyrillic letter ‘х’. (simplest obscene language filter? =)
>>
>> That can be reproduced with a simpler test
>>
>> Stas
>>
>>
>
> The test program was corrected. It now uses the wchar_t type, works
> correctly, and gives the right output.
>
> I think NIImportOOAffixes() in spell.c should be corrected to avoid
> this bug.
>

I have attached a patch. It adds two new functions, parse_ooaffentry() and
get_nextentry(), and fixes a couple of comments.

With the patch, Russian and other supported dictionaries can be used for
text search on Mac OS.

parse_ooaffentry() parses an affix file entry, replacing the sscanf()
call. Its algorithm is similar to that of the parse_affentry() function.
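
To illustrate the idea (this is not the code from the attached patch), here
is a minimal sketch of splitting an affix entry into fields without sscanf().
It assumes the failure comes from a locale-aware scanner treating a byte
inside a multibyte character as whitespace; for example, ‘х’ is encoded in
UTF-8 as the bytes 0xD1 0x85. The names split_affix_fields(),
is_ascii_space(), MAX_FIELDS and FIELD_LEN are hypothetical and exist only
for this example.

```c
#include <stddef.h>

#define MAX_FIELDS 5            /* hypothetical limit for this sketch */
#define FIELD_LEN  64           /* hypothetical field buffer size */

/*
 * ASCII-only whitespace test. Unlike isspace(), the result does not depend
 * on the current locale, so bytes >= 0x80 are never treated as separators.
 */
static int
is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\r';
}

/*
 * Split "line" into at most MAX_FIELDS whitespace-separated fields.
 * Bytes of each field are copied verbatim, so UTF-8 sequences stay intact.
 * Returns the number of fields stored, or -1 if a field does not fit
 * into FIELD_LEN bytes (including the terminating NUL).
 */
static int
split_affix_fields(const char *line, char fields[MAX_FIELDS][FIELD_LEN])
{
    const char *p = line;
    int         nfields = 0;

    while (nfields < MAX_FIELDS)
    {
        size_t      len = 0;

        /* Skip separators before the next field. */
        while (*p != '\0' && is_ascii_space((unsigned char) *p))
            p++;
        if (*p == '\0')
            break;

        /* Copy bytes up to the next ASCII separator or end of line. */
        while (*p != '\0' && !is_ascii_space((unsigned char) *p))
        {
            if (len >= FIELD_LEN - 1)
                return -1;      /* field too long for the buffer */
            fields[nfields][len++] = *p++;
        }
        fields[nfields][len] = '\0';
        nfields++;
    }

    return nfields;
}
```

Because all ASCII whitespace bytes are below 0x80 and every byte of a UTF-8
multibyte sequence is 0x80 or above, this scan cannot split a character in
the middle, whatever the locale says about isspace().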

Should I submit this fix as a separate patch (as I did), or should it be
merged into the patch
http://www.postgresql.org/message-id/56AA02EE.6090004@postgrespro.ru ?

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
