tsearch2: enable non ascii stop words with C locale - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject tsearch2: enable non ascii stop words with C locale
Date
Msg-id 20070211.172038.109995693.t-ishii@sraoss.co.jp
Responses Re: tsearch2: enable non ascii stop words with C locale  (Teodor Sigaev <teodor@sigaev.ru>)
List pgsql-hackers
Hi,

Currently, tsearch2 does not accept non-ASCII stop words when the
locale is C. The included patches should fix the problem. They are
against PostgreSQL 8.2.3.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
*** wordparser/parser.c~    2007-01-16 00:16:11.000000000 +0900
--- wordparser/parser.c    2007-02-10 18:04:59.000000000 +0900
***************
*** 246,251 ****
--- 246,266 ----
  static int
  p_islatin(TParser * prs)
  {
+     if (prs->usewide)
+     {
+         if (lc_ctype_is_c())
+         {
+             unsigned int c = *(unsigned int*)(prs->wstr + prs->state->poschar);
+ 
+             /*
+              * any non-ascii symbol with multibyte encoding
+              * with C-locale is a latin character
+              */
+             if ( c > 0x7f )
+                 return 1;
+         }
+     }
+ 
      return (p_isalpha(prs) && p_isascii(prs)) ? 1 : 0;
  }
