We're using tsearch2 with the German ispell dictionary
http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/dicts/ispell/ispell-german-compound.tar.gz
for full-text search.
The indexing is configured as follows:
CREATE TEXT SEARCH DICTIONARY public.german (
TEMPLATE = ispell,
DictFile = german,
AffFile = german,
StopWords = german
);
CREATE TEXT SEARCH CONFIGURATION public.default ( COPY = pg_catalog.german );
ALTER TEXT SEARCH CONFIGURATION public.default
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
word, hword, hword_part
WITH public.german;
-------------------------
select * from ts_debug('default', 'hundshütte');
works as expected: it creates the two lexemes "{hund,hütte}"
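To check the dictionary on its own, independent of the parser and the configuration mapping, ts_lexize can be called directly. This is only a minimal sketch using the dictionary defined above; the test words are arbitrary examples taken from this mail:

```sql
-- ts_lexize returns an array of lexemes if the dictionary knows the word,
-- an empty array if it treats the word as a stop word,
-- and NULL if the word is unknown to the dictionary.
SELECT ts_lexize('public.german', 'hundshütte');
SELECT ts_lexize('public.german', 'und');
SELECT ts_lexize('public.german', 'lovely');
```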
BUT
SELECT to_tsvector('default','lovely und bauarbeiter/in');
loses a lot of stuff:
"'bauarbeiter/in':2"
Some more debugging shows:
SELECT * from ts_debug('default','lovely und bauarbeiter/in');
"asciiword";"Word, all ASCII";"lovely";"{german}";"german";""
"blank";"Space symbols";" ";"{}";"";""
"asciiword";"Word, all ASCII";"und";"{german}";"german";"{}"
"blank";"Space symbols";" ";"{}";"";""
"file";"File or path name";"bauarbeiter/in";"{simple}";"simple";"{bauarbeiter/in}"
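The "file" classification comes from the parser, before any dictionary is consulted. As a quick check, the token types of the built-in parser and the way it splits a string can be inspected as sketched below. Note that 'default' here is the name of the built-in parser, not our configuration of the same name:

```sql
-- List the token types the built-in parser can emit (includes "file").
SELECT alias, description FROM ts_token_type('default');

-- Show how the parser splits the problematic input into tokens.
SELECT * FROM ts_parse('default', 'bauarbeiter/in');
```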
a) Unknown words are just being dropped.
b) Words with slashes are interpreted as file paths and the first path
is being dropped.
Any idea how we can fix this?
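One possible direction, as an untested sketch rather than a confirmed fix: for a), list a second dictionary after the ispell one, so that words ispell does not recognize are passed on instead of being discarded. The sketch assumes the built-in Snowball dictionary pg_catalog.german_stem is available. For b), the slash could be replaced with a space before the text reaches the parser, so that the two parts are no longer treated as one file-path token; whether that preprocessing is acceptable depends on the data.

```sql
-- a) Fall back to the Snowball stemmer for words the ispell dictionary
--    does not know (assumption: pg_catalog.german_stem is installed).
ALTER TEXT SEARCH CONFIGURATION public.default
    ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                      word, hword, hword_part
    WITH public.german, pg_catalog.german_stem;

-- b) Replace slashes before parsing so the parts are tokenized as words
--    (assumption: slashes in the indexed text never denote real paths).
SELECT to_tsvector('default',
                   replace('lovely und bauarbeiter/in', '/', ' '));
```

Dictionaries are consulted in the order listed, so the more specific ispell dictionary stays first and the stemmer, which accepts any word, comes last.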
jodok
--
Jodok Batlogg, Executive Board Member
Lovely Systems AG