Re: [Fwd: Re: tsearch in core patch] - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [Fwd: Re: tsearch in core patch]
Msg-id 12580.1182745564@sss.pgh.pa.us
In response to Re: [Fwd: Re: tsearch in core patch]  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Responses Re: [Fwd: Re: tsearch in core patch]  (Tatsuo Ishii <ishii@sraoss.co.jp>)
Re: [Fwd: Re: tsearch in core patch]  ("Mike Rylander" <mrylander@gmail.com>)
List pgsql-hackers
Tatsuo Ishii <ishii@sraoss.co.jp> writes:
> Ok, probably we need to copy the English stemming rule to the one for
> Japanese.

Pardon my ignorance here, but is the concept of stemming even relevant
to Japanese/Chinese/Korean?  What little I know about ideographic
languages suggests it wouldn't work well.  And surely the specific rules
in the Snowball project's English stemmer wouldn't work.

> I think same thing (commonly used English with local
> language) can be applied to Chinese and Korean.

Well, it's not hard at all to find chunks of English text that have
embedded bits of French, Spanish, or what-have-you, but that's not an
argument for trying to intermix the stemmers.  I doubt that such simple
bits of program could tell the language difference well enough to
determine which stemming rules to apply.
        regards, tom lane

