Re: fts, compond words? - Mailing list pgsql-general

From Marcus Engene
Subject Re: fts, compond words?
Date
Msg-id 439D6F99.7070809@engene.se
In response to Re: fts, compond words?  (Marcus Engene <mengpg@engene.se>)
Responses Re: fts, compond words?
List pgsql-general
 > That's a simple case; what about languages such as Norwegian or German?
 > They have compound words, and the ispell dictionary can split them into
 > lexemes. But usually there is more than one variant of separation:
 >
 > forbruksvaremerkelov
 > forbruk vare merke lov
 > forbruk vare merkelov
 > forbruk varemerke lov
 > forbruk varemerkelov
 > forbruksvare merke lov
 > forbruksvare merkelov
 > (Note: I don't know the translation, it's just an example. When we were
 > working on compound word support we found a word which has 24 variants
 > of separation!!)
 >
 > So, query 'a + forbruksvaremerkelov' will be awful:
 >
 > a + ( (forbruk & vare & merke & lov) | (forbruk & vare & merkelov) | ... )
 >
 > Of course, that is just an example off the top of my head, but a solution
 > for phrase search should work reasonably with such corner cases.

(Sorry for replying in the wrong place in the thread; I was away on a
trip and had unsubscribed in the meantime.)

I'm a Swede, and Swedish is similar to Norwegian and German. Take this
example:

lång hårig kvinna
långhårig kvinna

Words are put together to make a new word with a different meaning. The
first example means "tall hairy woman" and the second means "woman with
long hair". If I were on, for example, a dating site, I'd want the
distinction. ;-) If not, I should enter both strings:
("lång hårig" | långhårig) & kvinna
...which is perfectly acceptable.

IMHO I don't see any point in splitting these words.
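
To make the blow-up from the quoted example concrete, here is a rough Python sketch that enumerates the ways a compound can be cut into known parts and builds the resulting OR-of-ANDs query. The part list `PARTS`, the "skip one linking s" rule, and the helper names `splits` and `to_query` are all assumptions made up for illustration; this is not how the ispell dictionary actually works.

```python
# Hypothetical mini-dictionary of word parts; a real ispell dictionary
# is far larger and carries affix rules that this sketch ignores.
PARTS = {
    "forbruk", "vare", "merke", "lov",
    "merkelov", "varemerke", "varemerkelov", "forbruksvare",
}


def splits(word, start=0):
    """Return every way to cut word[start:] into parts from PARTS."""
    if start == len(word):
        return [[]]
    found = []
    for end in range(start + 1, len(word) + 1):
        part = word[start:end]
        if part not in PARTS:
            continue
        # Either continue right after the part ...
        next_starts = [end]
        # ... or (simplifying assumption) skip a single linking "s",
        # as long as something still follows it.
        if end + 1 < len(word) and word[end] == "s":
            next_starts.append(end + 1)
        for nxt in next_starts:
            for rest in splits(word, nxt):
                found.append([part] + rest)
    return found


def to_query(variants):
    """Join the variants into one (a & b) | (c & d) | ... expression."""
    return " | ".join("(" + " & ".join(v) + ")" for v in variants)


variants = splits("forbruksvaremerkelov")
print(len(variants))
print(to_query(variants))
```

With this toy part list the sketch finds the same six variants as listed in the quoted text, and every extra part in the dictionary can multiply the number of alternatives in the query.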


Let's go back to the subject: what about a syntax like this?

idxfti @@ to_tsquery('default', 'pizza & (Chicago | [New York])')

I.e. the exact-match string is always atomic. Wouldn't that be doable
without any logical implications?
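
As a rough illustration of what "atomic" could mean at the parsing stage, here is a small Python sketch that tokenizes such a query string and keeps a bracketed phrase as a single token. The bracket syntax is only the proposal above, and `tokenize` is a hypothetical helper, not anything that exists in tsearch2.

```python
import re

# One alternative per token kind: [bracketed phrase], operator, plain word.
TOKEN_RE = re.compile(r"\[([^\]]+)\]|([&|()])|([^\s&|()\[\]]+)")


def tokenize(query):
    """Split a query into (kind, text) tokens.

    A [bracketed phrase] becomes one PHRASE token, so later stages never
    see its words separately. Stray characters such as an unmatched "["
    are silently skipped in this sketch.
    """
    tokens = []
    for phrase, op, word in TOKEN_RE.findall(query):
        if phrase:
            # Normalize inner whitespace but keep the phrase in one piece.
            tokens.append(("PHRASE", " ".join(phrase.split())))
        elif op:
            tokens.append(("OP", op))
        else:
            tokens.append(("WORD", word))
    return tokens


for token in tokenize("pizza & (Chicago | [New York])"):
    print(token)
```

The boolean operators only ever see the phrase token as one operand, which is the sense in which the bracketed string would carry no extra logical implications; how the phrase is then matched against the index is left open here.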

Best regards,
Marcus
