Thread: CREATE CUSTOM TEXT SEARCH PARSER
Hi,
I'd like to build a custom text search parser and then use it within a custom text search configuration.
It would be great if you could give us an example showing how to build a custom parser, including examples of start, gettoken and end functions.
It would be even better if one could add custom parsing rules to the default parser.
I would really appreciate any hints!
Katharina
Katharina kuhn <katykuhn@gmail.com> wrote:
> I'd like to build a custom text search parser and then use it
> within a custom text search configuration.
> It would be great if you could give us an example showing how to
> build a custom parser, including examples of start, gettoken and
> end functions.

You might want to look at the contrib/test_parser directory. Then again, you might not -- I needed some custom tsearch2 parsing behavior and struggled with a custom parser based on that for a couple of days before I decided that it was easier to use regular expression functions within pl/pgsql to pick out what I wanted and cast it to a tsvector. This was less code and seemed less fragile than developing something based on the contrib example. YMMV, of course.

This motivated me to put rewriting the current tsearch2 parser as something based on regular expressions onto my personal PostgreSQL TODO list. (No guarantees on when I might get to it, though.)

-Kevin
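For reference, the SQL side of the contrib/test_parser approach looks roughly like the sketch below. It assumes the module's C functions are already compiled and installed under the names testprs_start, testprs_getlexeme, testprs_end and testprs_lextype (the names as I recall them from the contrib module; check the module's own install script), that the parser emits a token type called word, and that the built-in english_stem dictionary is available. The names testparser and testcfg are just illustrative.

```sql
-- Register the parser: each clause points at an existing SQL-callable C function.
-- The testprs_* names are assumed to come from contrib/test_parser.
CREATE TEXT SEARCH PARSER testparser (
    START    = testprs_start,       -- initializes parser state for a document
    GETTOKEN = testprs_getlexeme,   -- returns the next token and its type
    END      = testprs_end,         -- releases parser state
    LEXTYPES = testprs_lextype,     -- describes the token types the parser emits
    HEADLINE = pg_catalog.prsd_headline
);

-- Build a configuration on top of the custom parser.
CREATE TEXT SEARCH CONFIGURATION testcfg (PARSER = testparser);

-- Map the parser's "word" token type (assumed name) to a dictionary.
ALTER TEXT SEARCH CONFIGURATION testcfg
    ADD MAPPING FOR word WITH english_stem;

-- Use the configuration like any other.
SELECT to_tsvector('testcfg', 'That''s my first own parser');
```

The start, gettoken and end functions themselves still have to be written in C and loaded as a shared library; the statements above only wire them into a configuration.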
Thank you Kevin!
I'll look at the contrib/test_parser directory.
Anyway, I agree with you. I actually wrote a pl/pgsql function for pre-parsing documents
based on my own needs, and I cast the results to a tsvector as usual. It works well enough!
Katharina
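The pre-parsing approach described in this thread might look something like the following. This is only a minimal sketch, not the function either poster actually wrote: the function name doc_to_tsvector and the part-number pattern (two letters, a hyphen, four digits) are hypothetical stand-ins for whatever tokens your documents need.

```sql
-- Hypothetical helper: pick tokens out of a document with a regular
-- expression and cast the space-separated result to a tsvector.
CREATE OR REPLACE FUNCTION doc_to_tsvector(doc text)
RETURNS tsvector
LANGUAGE plpgsql
IMMUTABLE
AS $$
DECLARE
    tokens text;
BEGIN
    -- regexp_matches with the 'g' flag returns one text[] row per match;
    -- m[1] is the first (and only) capture group. The pattern is an
    -- assumption chosen for illustration.
    SELECT array_to_string(
               ARRAY(
                   SELECT lower(m[1])
                   FROM regexp_matches(doc, '([A-Za-z]{2}-[0-9]{4})', 'g') AS m
               ),
               ' ')
    INTO tokens;

    -- A direct cast skips the text search parser and dictionaries entirely,
    -- so the tokens are stored exactly as extracted.
    RETURN tokens::tsvector;
END;
$$;

-- Example call
SELECT doc_to_tsvector('Order AB-1234 shipped together with cd-5678');
```

Because the cast bypasses to_tsvector, the resulting lexemes carry no position information and get no stemming or stop-word removal. Also note that the pattern here only produces letters, digits and hyphens; tokens containing quotes, colons or backslashes would need escaping before the cast.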
On Tue, Nov 2, 2010 at 2:58 PM, Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
Katharina kuhn <katykuhn@gmail.com> wrote:
> I'd like to build a custom text search parser and then use it
> within a custom text search configuration.
> It would be great if you could give us an example showing how to
> build a custom parser, including examples of start, gettoken and
> end functions.
You might want to look at the contrib/test_parser directory. Then
again, you might not -- I needed some custom tsearch2 parsing
behavior and struggled with a custom parser based on that for a
couple of days before I decided that it was easier to use regular
expression functions within pl/pgsql to pick out what I wanted and
cast it to a tsvector. This was less code and seemed less fragile
than developing something based on the contrib example. YMMV, of
course.
This motivated me to put rewriting the current tsearch2 parser as
something based on regular expressions onto my personal PostgreSQL
TODO list. (No guarantees on when I might get to it, though.)
-Kevin