Re: Notes about fixing regexes and UTF-8 (yet again) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Notes about fixing regexes and UTF-8 (yet again)
Date
Msg-id 7392.1329608710@sss.pgh.pa.us
In response to Re: Notes about fixing regexes and UTF-8 (yet again)  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List pgsql-hackers
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Yeah, it's conceivable that we could implement something whereby
>> characters with codes above some cutoff point are handled via runtime
>> calls to iswalpha() and friends, rather than being included in the
>> statically-constructed DFA maps.  The cutoff point could likely be a lot
>> less than U+FFFF, too, thereby saving storage and map build time all
>> round.

> It's been proposed to build a “regexp” type in PostgreSQL which would
> store the DFA directly and provide some way to run that DFA out of its
> “storage” without recompiling.

> Would such a mechanism be useful here?

No, this is about what goes into the DFA representation in the first
place, not about how we store it and reuse it.
        regards, tom lane
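
For illustration only, the cutoff idea quoted above might look roughly like the sketch below. This is not the actual regex code: CLASS_CUTOFF, alpha_map, build_alpha_map() and is_alpha_char() are all hypothetical names, the cutoff value is arbitrary, and a real DFA color map is considerably more involved than a flat array. The only point is the split between a statically built table for low code points and a runtime iswalpha() call for everything else.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical cutoff: code points below this use the precomputed table.
 * The value is an arbitrary assumption, chosen only for the example. */
#define CLASS_CUTOFF 0x0800

/* Hypothetical static map: 1 if the code point is alphabetic, else 0. */
static unsigned char alpha_map[CLASS_CUTOFF];

/* Fill the table once, using the current locale's classification. */
static void
build_alpha_map(void)
{
    for (wint_t c = 0; c < CLASS_CUTOFF; c++)
        alpha_map[c] = iswalpha(c) ? 1 : 0;
}

/* Table lookup below the cutoff, runtime library call at or above it. */
static bool
is_alpha_char(wint_t c)
{
    if (c < CLASS_CUTOFF)
        return alpha_map[c] != 0;
    return iswalpha(c) != 0;
}
```

The table costs CLASS_CUTOFF bytes and one pass of iswalpha() calls to build, so lowering the cutoff trades storage and build time for more runtime calls on high code points.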

