
Multiple word synonyms (maybe?) - Mailing list pgsql-general

From	Tim van der Linden
Subject	Multiple word synonyms (maybe?)
Date	October 20, 2015 13:35:48
Msg-id	20151020193538.df8194ca307fb5f9cb0ab13d@shisaa.jp
Responses	Re: Multiple word synonyms (maybe?) (rob stone <floriparob@gmail.com>) Re: Multiple word synonyms (maybe?) (Geoff Winkless <pgsqladmin@geoff.dj>)
List	pgsql-general


Hi All

I have a question regarding PostgreSQL's full text capabilities and (presumably) the synonym dictionary.

I'm currently implementing FTS on a medical-themed setup which relies heavily on domain-specific jargon.
A specific requirement I wish to implement here is support for the jargon synonyms that are heavily used.

Of course, I can simply go ahead and create my own synonym dictionary with a jargon-specific synonym file to feed it.
However, most of the synonyms consist of more than a single word.

The term "heart attack" for example has the following "synonyms":

- Acute MI
- MI
- Myocardial infarction

As far as I understand it, the tokenizer within PostgreSQL's FTS engine splits text on spaces to generate tokens, which
are then proposed to each dictionary. I think it is therefore impossible to have "multi-word synonyms" in this sense, as
multiple words cannot reach the dictionary together. The term "heart attack" would be presented as the tokens "heart" and
"attack".

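To make that understanding concrete, here is a toy model in Python. It is only a sketch of the behaviour as I picture it, not PostgreSQL's actual parser or dictionary API; the SYNONYMS table and both function names are made up for illustration.

```python
# Toy model only: this does not use or reproduce PostgreSQL's FTS internals.
# SYNONYMS, tokenize and lookup_per_token are hypothetical names.
SYNONYMS = {
    "mi": "heart attack",
    "acute mi": "heart attack",
    "myocardial infarction": "heart attack",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace, one token per word."""
    return text.lower().split()


def lookup_per_token(text):
    """Offer each token to the synonym table on its own."""
    return [SYNONYMS.get(token, token) for token in tokenize(text)]


print(lookup_per_token("MI"))
print(lookup_per_token("Myocardial infarction"))
```

In this model the single-word entry "mi" is found, but the two-word entries can never match because no token ever contains a space.
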
From a technical standpoint I understand FTS is about looking at individual words and reducing them to lexemes ... yet from a
natural language lookup perspective you still wish to tie "Heart attack" to "Acute MI", so that when a client searches on one,
the other will turn up as well.

Should I write my own tokenizer to catch all these phrases and present each of them as a single token? Or is this completely
outside the realm of FTS (or FTS within PostgreSQL)?
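
As a rough sketch of what I mean by "my own tokenizer", here is the idea in Python: greedily merge known multi-word phrases into one canonical token before anything else sees them. This is an assumption about how such a pre-processing step could look, not an existing PostgreSQL feature; the PHRASES table and the function name are hypothetical, and punctuation handling is left out.

```python
# Hypothetical pre-processing step, not part of PostgreSQL.
# Each known phrase (as a tuple of lowercase words) maps to one canonical token.
PHRASES = {
    ("heart", "attack"): "heart attack",
    ("acute", "mi"): "heart attack",
    ("myocardial", "infarction"): "heart attack",
    ("mi",): "heart attack",
}
MAX_PHRASE_LEN = max(len(phrase) for phrase in PHRASES)


def tokenize_with_phrases(text):
    """Split on whitespace, replacing the longest known phrase at each position."""
    words = text.lower().split()
    tokens = []
    i = 0
    while i < len(words):
        # Try the longest candidate first so "acute mi" wins over "mi".
        for n in range(min(MAX_PHRASE_LEN, len(words) - i), 0, -1):
            candidate = tuple(words[i:i + n])
            if candidate in PHRASES:
                tokens.append(PHRASES[candidate])
                i += n
                break
        else:
            # No phrase starts here: keep the plain word.
            tokens.append(words[i])
            i += 1
    return tokens


print(tokenize_with_phrases("Patient had an acute MI yesterday"))
```

With this, a search for "Heart attack" and a document mentioning "Acute MI" would both end up with the same single token.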

Cheers,
Tim
