Re: tsearch in core patch - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: tsearch in core patch
Date	June 22, 2007 16:18:21
Msg-id	20070622161806.GP8949@alvh.no-ip.org
In response to	Re: tsearch in core patch (teodor@sigaev.ru)
Responses	Re: tsearch in core patch Re: tsearch in core patch
List	pgsql-hackers

teodor@sigaev.ru wrote:
> > Why not do it the other way around?
> > es_ES        spanish
> > Spanish_Spain    spanish
> > ru_RU        russian
> > pt_BR        portuguese_brazil
> >
> > That way you don't need any funny index.  Or do you need the list of
> > locales for each language? (but even if you do, you can easily obtain it
> > by indexing both columns separately using btrees anyway)
> 
> Yes, that's possible, but that increases the number of identical configurations:
> russian_win     Russian_Russia
> russian_unix    ru_RU
> 
> They don't differ except for the locale name.

But why do you need them to be different at all?  Just make it

russian     Russian_Russia
russian     ru_RU

Does that not work for some reason?

What I was really suggesting was having a table mapping locale names
into "tsearch languages".  Then the configuration could be made based on
the language, not on the locale name.  So the stopword list is for
"russian", regardless of whether the locale is Russian_Russia or ru_RU.
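
A minimal sketch of such a mapping, in Python purely for illustration.
The dictionaries, function names and stopword file names below are
hypothetical and are not taken from the patch; the locale and language
names are the ones used earlier in this thread.

```python
# Hypothetical table mapping OS locale names to "tsearch languages".
# Several locale spellings may point at the same language.
LOCALE_TO_LANGUAGE = {
    "es_ES": "spanish",
    "Spanish_Spain": "spanish",
    "ru_RU": "russian",
    "Russian_Russia": "russian",
    "pt_BR": "portuguese_brazil",
}

# Hypothetical per-language configuration, keyed by language rather than
# by locale name. The stopword file names are placeholders.
LANGUAGE_CONFIG = {
    "spanish": {"stopwords": "spanish.stop"},
    "russian": {"stopwords": "russian.stop"},
    "portuguese_brazil": {"stopwords": "portuguese_brazil.stop"},
}


def language_for_locale(locale_name):
    """Return the tsearch language for a locale name, or None if unmapped."""
    return LOCALE_TO_LANGUAGE.get(locale_name)


def config_for_locale(locale_name):
    """Return the configuration of the language the locale maps to."""
    language = language_for_locale(locale_name)
    if language is None:
        return None
    return LANGUAGE_CONFIG[language]


def locales_for_language(language):
    """Reverse lookup: every locale name that maps to the given language."""
    return sorted(
        loc for loc, lang in LOCALE_TO_LANGUAGE.items() if lang == language
    )
```

With this layout Russian_Russia and ru_RU resolve to one shared "russian"
entry, so no per-platform duplicate configuration is needed, and the list
of locales for a language is still available through the reverse lookup.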

Is this only for the stopword list, or does it also affect selecting a
stemmer?

Note: it's possible that the stopword list is different for Brazilian
Portuguese than for the Portuguese spoken in Portugal, which is why I was
suggesting a language "portuguese_brazil" and not just "portuguese".
Whereas you need a single stopword list for all the Spanish-speaking
countries, which is why you need only one language called "spanish".

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Llegará una época en la que una investigación diligente y prolongada sacará
a la luz cosas que hoy están ocultas" (Séneca, siglo I)
