> That's a simple case; what about languages such as Norwegian or German? They
> have compound words, and the ispell dictionary can split them into lexemes.
> But usually there is more than one variant of separation:
>
> forbruksvaremerkelov
> forbruk vare merke lov
> forbruk vare merkelov
> forbruk varemerke lov
> forbruk varemerkelov
> forbruksvare merke lov
> forbruksvare merkelov
> (notice: I don't know the translation, it's just an example. When we were
> working on compound word support we found a word which has 24 variants of
> separation!!)
>
> So, query 'a + forbruksvaremerkelov' will be awful:
>
> a + ( (forbruk & vare & merke & lov) | (forbruk & vare & merkelov) | ... )
>
> Of course, that is an example just off the top of my head, but a solution
> for phrase search should work reasonably with such corner cases.
(Sorry for replying in the wrong place in the thread; I was away on a
trip and had unsubscribed in the meantime.)
I'm a Swede, and Swedish is similar to Norwegian and German. Take this
example:
lång hårig kvinna
långhårig kvinna
Words are put together to make a new word with a different meaning. The
first example means "tall hairy woman" and the second is "woman with
long hair". If I were on, e.g., a dating site, I'd want the distinction.
;-) If not, I would enter both strings:
("lång hårig" | långhårig) & kvinna
...which is perfectly acceptable.
IMHO I don't see any point in splitting these words.
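To make the size of the problem in the quoted example concrete, here is a rough Python sketch that enumerates the ways a compound can be cut and builds the OR-of-ANDs query from them. It is only an illustration, not how the ispell dictionary works: the stem list, the linking "s", and the function names segmentations and to_query are all assumptions made up for this sketch. It also produces every possible grouping (8 for four stems), whereas a real dictionary would only keep the splits it recognizes.

```python
from itertools import product

def segmentations(parts, links):
    """Yield every way to group consecutive stems into lexemes.

    parts: atomic stems, e.g. ["forbruk", "vare", "merke", "lov"]
    links: linking letters between neighbours, e.g. ["s", "", ""];
           a link is kept only when its two neighbours stay glued together.
    """
    assert len(links) == len(parts) - 1
    # One boolean per gap between stems: True means "cut here".
    for cuts in product([False, True], repeat=len(links)):
        lexemes, current = [], parts[0]
        for part, link, cut in zip(parts[1:], links, cuts):
            if cut:
                lexemes.append(current)
                current = part
            else:
                current += link + part
        lexemes.append(current)
        yield lexemes

def to_query(variants):
    """Join the variants as (a & b) | (c & d) | ... in tsquery-like notation."""
    terms = []
    for lexemes in variants:
        if len(lexemes) > 1:
            terms.append("(" + " & ".join(lexemes) + ")")
        else:
            terms.append(lexemes[0])
    return " | ".join(terms)

variants = list(segmentations(["forbruk", "vare", "merke", "lov"], ["s", "", ""]))
print(len(variants))
print(to_query(variants))
```

With n stems there are n - 1 gaps, so this naive enumeration yields 2^(n-1) alternatives, which is why the expanded query grows so quickly.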
Let's go back to the subject, what about a syntax like this:
idxfti @@ to_tsquery('default', 'pizza & (Chicago | [New York])')
I.e., the exact-match string is always atomic. Wouldn't that be doable
without any logical implications?
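As a rough illustration of what "atomic" could mean for a parser, the Python sketch below tokenizes a query string and turns each bracketed string into a single PHRASE token, so the boolean operators never see its inner words. The bracket syntax is only the proposal made above, not something to_tsquery accepts, and the token names and the tokenize function are hypothetical.

```python
import re

# A token is a [bracketed phrase], a single operator/paren, or a bare word.
TOKEN = re.compile(r"\s*(\[[^\]]*\]|[&|!()]|[^\s&|!()\[\]]+)")
OPERATORS = {"&", "|", "!", "(", ")"}

def tokenize(query):
    """Split a query into (kind, text) pairs; bracketed phrases stay whole."""
    tokens = []
    query = query.rstrip()
    pos = 0
    while pos < len(query):
        match = TOKEN.match(query, pos)
        if match is None:
            # For example an opening bracket that is never closed.
            raise ValueError("cannot parse query at position %d" % pos)
        text = match.group(1)
        if text.startswith("["):
            # Strip the brackets and normalize the inner whitespace.
            tokens.append(("PHRASE", " ".join(text[1:-1].split())))
        elif text in OPERATORS:
            tokens.append(("OP", text))
        else:
            tokens.append(("WORD", text))
        pos = match.end()
    return tokens

for token in tokenize("pizza & (Chicago | [New York])"):
    print(token)
```

Because the phrase is a single operand, the boolean structure of the query stays exactly as written; how the phrase itself is matched against the index is a separate question that this sketch does not address.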
Best regards,
Marcus