Re: using Tsearch2 for chemical text - Mailing list pgsql-general

From Tom Lane
Subject Re: using Tsearch2 for chemical text
Msg-id 6075.1185403875@sss.pgh.pa.us
In response to using Tsearch2 for chemical text  (Rajarshi Guha <rguha@indiana.edu>)
Responses Re: using Tsearch2 for chemical text  (Tatsuo Ishii <ishii@postgresql.org>)
Re: using Tsearch2 for chemical text  (Naz Gassiep <naz@mira.net>)
List pgsql-general
Rajarshi Guha <rguha@indiana.edu> writes:
> My problem is that the name column contains names of chemicals. Now
> for many cases this may simply be a number (1674-56-2) and in other
> cases it may be an alphanumeric string (such as (-)O-acetylcarnitine
> or 1,2-cis-dihydroxybenzoate). In some cases it is a well-known word
> (say viagra or calcium chloride or pentathol).

> My question is: will Tsearch2 be able to handle this type of text?

I think you might need to write a custom lexer to divide the strings
into meaningful units.  If there are subsections of these names that
make sense to search for, then tsearch2 can certainly handle the
mechanics of that, but I doubt that the standard rules will divide
these names into lexemes usefully.

            regards, tom lane
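
For illustration only, here is a rough sketch of what "dividing the strings into meaningful units" might look like for the example names quoted above. It is not tsearch2 code and not anything proposed in the thread: the function name chem_tokens, the CAS-number pattern, and the choice of separators are all assumptions made for the example. How such a splitter would be hooked into tsearch2 as a parser is not shown.

```python
import re

# Assumption: identifiers shaped like 1674-56-2 (digits-2 digits-1 digit)
# should stay whole rather than be split at the hyphens.
CAS_RE = re.compile(r"\d{2,7}-\d{2}-\d")

# Assumption: whitespace, hyphens, commas, parentheses and square brackets
# separate the searchable parts of a name.
SPLIT_RE = re.compile(r"[\s\-,()\[\]]+")


def chem_tokens(name):
    """Split a chemical name into lowercase search units (illustrative only)."""
    name = name.strip()
    if CAS_RE.fullmatch(name):
        # Keep registry-number-like strings as a single unit.
        return [name]
    # Drop empty pieces produced by leading or trailing separators.
    return [part.lower() for part in SPLIT_RE.split(name) if part]


if __name__ == "__main__":
    for example in ["1674-56-2", "(-)O-acetylcarnitine",
                    "1,2-cis-dihydroxybenzoate", "calcium  chloride"]:
        print(example, "->", chem_tokens(example))
```

Whether fragments such as "cis" or the locants "1" and "2" are worth indexing separately depends on what users actually search for, which is the design question the reply raises.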
