
Re: Notes about fixing regexes and UTF-8 (yet again) - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Notes about fixing regexes and UTF-8 (yet again)
Date	February 18, 2012 19:45:30
Msg-id	7392.1329608710@sss.pgh.pa.us
In response to	Re: Notes about fixing regexes and UTF-8 (yet again) (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List	pgsql-hackers


Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Yeah, it's conceivable that we could implement something whereby
>> characters with codes above some cutoff point are handled via runtime
>> calls to iswalpha() and friends, rather than being included in the
>> statically-constructed DFA maps.  The cutoff point could likely be a lot
>> less than U+FFFF, too, thereby saving storage and map build time all
>> round.

> It's been proposed to build a “regexp” type in PostgreSQL which would
> store the DFA directly and provide some way to run that DFA out of its
> “storage” without recompiling.

> Would such a mechanism be useful here?

No, this is about what goes into the DFA representation in the first
place, not about how we store it and reuse it.
        regards, tom lane
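
The cutoff idea quoted above can be illustrated with a minimal sketch. This is not PostgreSQL's regex code: the names CLASS_CUTOFF, build_alpha_map and is_alpha_code are hypothetical, the cutoff value 0x80 is an arbitrary assumption (the message only says it could be well below U+FFFF), and a real DFA color map is considerably more involved. The sketch only shows the split between a statically built table for low codes and a runtime iswalpha() call for everything else.

```c
#include <assert.h>
#include <stdbool.h>
#include <wctype.h>

/* Hypothetical cutoff: codes below it are answered from a prebuilt table. */
#define CLASS_CUTOFF 0x80u

static bool alpha_map[CLASS_CUTOFF];
static bool alpha_map_built = false;

/* Build the small static map once; here it covers only ASCII letters. */
static void
build_alpha_map(void)
{
    for (unsigned int c = 0; c < CLASS_CUTOFF; c++)
        alpha_map[c] = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    alpha_map_built = true;
}

/*
 * Classify a character code: table lookup below the cutoff, a runtime
 * call to iswalpha() at or above it. The runtime result depends on the
 * current locale, so nothing for high codes is stored in the map.
 */
static bool
is_alpha_code(unsigned int code)
{
    assert(alpha_map_built);    /* catch use before initialization */
    if (code < CLASS_CUTOFF)
        return alpha_map[code];
    return iswalpha((wint_t) code) != 0;
}
```

Call build_alpha_map() once at startup, then is_alpha_code() per character. Lowering CLASS_CUTOFF shrinks the table and its build time at the cost of more runtime library calls, which is the trade-off the quoted paragraph describes.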
