Thread: precompiling regular expressions in plpgsql

precompiling regular expressions in plpgsql

From
Brian Hurt
Date:
Is there a way to precompile regular expressions in plpgsql?

I have a plpgsql function that's doing some fairly heavy duty regular
expression matching, including a lot of calls to substring, as I iterate
over a table.  Worse yet, the regular expressions when written out
completely are fairly complex- except that they are a small number of
common patterns done over and over again.  To make them readable and
writable, I've been splitting them up into variables, so I now have code
like (pardon the typos):

    opt_space = '[[:space:]]*';
    start_of_string = '^';
    end_of_string = opt_space || '$';
    num = opt_space ||
'(([[:digit:]]+(\\.[[:digit:]]+))|(\\.[[:digit:]]+))';


    FOR rec IN SELECT * FROM table
    LOOP

       IF (rec.column ~ (start_of_string || num || opt_space || '%' ||
opt_space || '&'
          || num || end_of_string ))
       THEN
          ...

And much much more in the general pattern.

Now, the strings that I'm using as the pattern as a regular expression
really are constant, even if they don't look like it to postgres.  Is it
possible to precompile these expressions (basically, build up the tables
for the DFA engine ahead of time) in plpgsql?  For various reasons,
switching to another language like perl or python is politically fraught.

Brian


Re: precompiling regular expressions in plpgsql

From
Tom Lane
Date:
Brian Hurt <bhurt@janestcapital.com> writes:
> Now, the strings that I'm using as the pattern as a regular expression
> really are constant, even if they don't look like it to postgres.  Is it
> possible to precompile these expressions (basically, build up the tables
> for the DFA engine ahead of time) in plpgsql?

This already happens internally --- you might need to increase the size
of the cache in src/backend/utils/adt/regexp.c ...

            regards, tom lane