Re: tsearch in core patch - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: tsearch in core patch
Msg-id 20070622161806.GP8949@alvh.no-ip.org
In response to Re: tsearch in core patch  (teodor@sigaev.ru)
List pgsql-hackers
teodor@sigaev.ru wrote:
> > Why not do it the other way around?
> > es_ES        spanish
> > Spanish_Spain    spanish
> > ru_RU        russian
> > pt_BR        portuguese_brazil
> >
> > That way you don't need any funny index.  Or do you need the list of
> > locales for each language? (but even if you do, you can easily obtain it
> > by indexing both columns separately using btrees anyway)
> 
> Yes, that's possible but that increases the number of identical configurations:
> russian_win     Russian_Russia
> russian_unix    ru_RU
> 
> They don't differ except for the locale name.

But why do you need them to be different at all?  Just make it

russian     Russian_Russia
russian     ru_RU

Does that not work for some reason?

What I was really suggesting was having a table mapping locale names
into "tsearch languages".  Then the configuration could be made based on
the language, not on the locale name.  So the stopword list is for
"russian", regardless of whether the locale is Russian_Russia or ru_RU.

Is this only for the stopword list, or does it also affect selecting a
stemmer?
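
To make the mapping idea concrete, here is a rough sketch in Python, purely
for illustration.  Every name in it (LOCALE_TO_LANGUAGE, STOPWORD_FILE, the
helper functions, the ".stop" file names) is made up for this example and is
not taken from the patch; a real implementation would presumably be a catalog
table rather than in-memory dictionaries.

```python
# Hypothetical illustration only: none of these names come from the tsearch patch.

# Many locale names map to a single "tsearch language".
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "Spanish_Spain": "spanish",
    "ru_RU": "russian",
    "Russian_Russia": "russian",
    "pt_BR": "portuguese_brazil",
}

# Configuration is keyed by language, so it is defined once per language
# no matter how many locale spellings exist. The file names are invented.
STOPWORD_FILE = {
    "spanish": "spanish.stop",
    "russian": "russian.stop",
    "portuguese_brazil": "portuguese_brazil.stop",
}


def language_for_locale(locale):
    """Return the tsearch language for a locale name, or None if unmapped."""
    return LOCALE_TO_LANGUAGE.get(locale)


def stopword_file_for_locale(locale):
    """Resolve locale -> language -> stopword file; None if either step fails."""
    language = language_for_locale(locale)
    if language is None:
        return None
    return STOPWORD_FILE.get(language)


def locales_for_language(language):
    """Reverse lookup: all locale names that map to the given language."""
    return sorted(
        loc for loc, lang in LOCALE_TO_LANGUAGE.items() if lang == language
    )
```

With this layout, stopword_file_for_locale("ru_RU") and
stopword_file_for_locale("Russian_Russia") resolve to the same entry, so
there is no need for separate russian_win and russian_unix configurations.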

Note: it's possible that the stopword list is different for Brazilian
Portuguese than for European Portuguese, which is why I was suggesting
using a language "portuguese_brazil" and not just "portuguese".  Whereas
you need a single stopword list for all the countries speaking Spanish,
which is why you need only one language called "spanish".

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Llegará una época en la que una investigación diligente y prolongada sacará
a la luz cosas que hoy están ocultas" (Séneca, siglo I)

