tsearch parser overhaul - Mailing list pgsql-hackers

From Kevin Grittner
Subject tsearch parser overhaul
Msg-id 4B210D9E020000250002D344@gw.wicourts.gov
In response to Re: tsearch parser inefficiency if text includes urls or emails - new version  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
re:
http://archives.postgresql.org/pgsql-hackers/2009-11/msg00754.php

Alvaro Herrera <alvherre@commandprompt.com> wrote:
> Kevin Grittner wrote:
> 
>> (Note: I personally would much rather see the performance
>> penalty addressed this way, and a TODO added for the more
>> invasive work, than to leave this alone for the next release if
>> there's nobody willing to tackle the problem at a more
>> fundamental level.)
> 
> +1
I haven't added a TODO yet because I'm not sure how to frame it.
I'm inclined to think it would be no more work to replace the current
recursively called state engine with something easier to read and
understand than to try to fix the current oddities.  Perhaps
something along the lines of this?
http://vo.astronet.ru/arxiv/dict_regex.html
I suspect we'd need to get it to use the same regexp code used
elsewhere in PostgreSQL.
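
As a rough illustration of what a regex-driven tokenizer might look
like, here is a minimal sketch in Python rather than C.  The token
class names, the patterns, and the tokenize() helper are all
assumptions made up for this example; they are not taken from the
tsearch parser or from dict_regex, and the patterns are far looser
than real URL or email syntax.  The point is only that the token
classes live in one ordered table and the text is scanned in a single
pass, with no recursive re-parsing of a candidate URL or email.

```python
import re

# Hypothetical token classes, tried in order at each position.
# These patterns are illustrative only and are much simpler than the
# rules the real parser applies.
TOKEN_PATTERNS = [
    ("url",   r"[a-z][a-z0-9+.-]*://[^\s]+"),
    ("email", r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    ("word",  r"\w+"),
    ("blank", r"\W+"),
]

# One combined pattern: each class becomes a named alternative, so the
# name of the group that matched identifies the token type.
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_PATTERNS),
    re.IGNORECASE,
)


def tokenize(text):
    """Return (token_type, token_text) pairs in a single left-to-right scan."""
    return [(m.lastgroup, m.group()) for m in MASTER.finditer(text)]


if __name__ == "__main__":
    sample = "mail bob@example.com or see http://example.com/x"
    for kind, value in tokenize(sample):
        if kind != "blank":
            print(kind, value)
```

Because "word" and "blank" together match any character, every
position in the input is consumed by some alternative, and adding a
new token class means adding one row to the table rather than new
states.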
Thoughts?
-Kevin

