Thread: regexp_split_to_array hangs backend

regexp_split_to_array hangs backend

From: "Pavel Stehule"
Hello,

I found a small bug:

regexp_split_to_array('123456','1');
regexp_split_to_array('123456','6');
regexp_split_to_array('123456','.');

Each of these calls hangs the backend.

The following patch corrects it.

Regards
Pavel Stehule

./regexp.c
*** ./regexp.c.orig     2007-08-10 14:17:15.000000000 +0200
--- ./regexp.c  2007-08-10 14:19:36.000000000 +0200
***************
*** 1048,1053 ****
--- 1048,1056 ----
                  {
                          int length = splitctx->match.rm_so - startpos + 1;

+                         /* set the offset to the end of this match for next time */
+                         splitctx->offset = pmatch->rm_eo;
+ 
                          /*
                           * If we are trying to match at the beginning of the string and
                           * we got a zero-length match, or if we just matched where we
***************
*** 1063,1070 ****
                                          Int32GetDatum(startpos),
                                          Int32GetDatum(length));

-                         /* set the offset to the end of this match for next time */
-                         splitctx->offset = pmatch->rm_eo;
                          return result;
                  }
--- 1066,1071 ----
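
The patch moves the offset update ahead of the check that follows it. A minimal sketch of why that ordering matters, assuming the check can skip a match and retry the search (that is a reading of the truncated comment in the diff, not something the thread confirms). Everything here is hypothetical: find_match() is a one-character stand-in for the regex engine, and count_pieces() drops empty pieces, which is not necessarily what PostgreSQL does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the regex engine: finds the first occurrence of
 * the single character `sep` at or after `offset`. Returns 1 and fills
 * *so / *eo (start and end of the match) on success, 0 when there is none. */
static int find_match(const char *s, size_t offset, char sep,
                      size_t *so, size_t *eo)
{
    const char *p = strchr(s + offset, sep);

    if (p == NULL)
        return 0;
    *so = (size_t) (p - s);
    *eo = *so + 1;
    return 1;
}

/* Counts the non-empty pieces of `s` separated by `sep`. */
static int count_pieces(const char *s, char sep)
{
    size_t len = strlen(s);
    size_t offset = 0;      /* where the next search starts */
    size_t startpos = 0;    /* where the current piece starts */
    size_t so, eo;
    int pieces = 0;

    assert(sep != '\0');    /* keeps every match end within the string */

    while (find_match(s, offset, sep, &so, &eo))
    {
        size_t length = so - startpos;

        /* Advance the state first, so every path through the loop makes
         * progress. If these two updates sat after the check below, the
         * `continue` would find the same match again and never finish. */
        offset = eo;
        startpos = eo;

        if (length == 0)
            continue;       /* match right at the piece start: skip it */
        pieces++;
    }
    if (startpos < len)
        pieces++;           /* trailing piece after the last match */
    return pieces;
}
```

With this ordering, count_pieces("123456", '1') and count_pieces("123456", '6') both terminate, even though the first call takes the skip path on its very first match.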


Re: regexp_split_to_array hangs backend

From: Tom Lane
"Pavel Stehule" <pavel.stehule@gmail.com> writes:
> I found a small bug:

> regexp_split_to_array('123456','1');
> regexp_split_to_array('123456','6');
> regexp_split_to_array('123456','.');

> Each of these calls hangs the backend.

This code's got more problems than that :-(

The one that's bothering me right now is that regexp_match() and
regexp_split() cache a compiled regex on first entry to the function,
and then blithely assume it will still be there on repeated calls.

I think probably the best thing to do is do all the matching on the
first call, and have the saved state include an array of character
positions of matches; then repeat calls to the SRF just iterate through
the array.

It seems a bit short of comments too.  Working on it now.
        regards, tom lane
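
The approach proposed above (do all the matching on the first call, save the match positions, and let later calls walk the saved array) might look roughly like the sketch below. It is not the PostgreSQL implementation. It uses POSIX <regex.h> instead of the backend's regex engine and SRF machinery, works on bytes rather than characters, and ignores zero-length matches. SplitState, split_init(), split_next(), and MAX_MATCHES are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_MATCHES 64          /* arbitrary cap for this sketch */

/* Saved state: only match positions, no compiled regex. */
typedef struct
{
    size_t so[MAX_MATCHES];     /* start offset of each match */
    size_t eo[MAX_MATCHES];     /* end offset of each match */
    int    nmatches;
    int    next;                /* index of the next piece to hand out */
    size_t len;                 /* length of the input string */
} SplitState;

/* "First call": compile, find every match, free the regex again.
 * Returns 0 on success, -1 if the pattern does not compile. */
static int split_init(SplitState *st, const char *s, const char *pattern)
{
    regex_t    re;
    regmatch_t m;
    size_t     offset = 0;

    st->nmatches = 0;
    st->next = 0;
    st->len = strlen(s);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (offset < st->len && st->nmatches < MAX_MATCHES &&
           regexec(&re, s + offset, 1, &m, offset ? REG_NOTBOL : 0) == 0)
    {
        if (m.rm_eo == m.rm_so)
        {
            /* Zero-length match: step past one byte so the scan advances. */
            offset += (size_t) m.rm_so + 1;
            continue;
        }
        st->so[st->nmatches] = offset + (size_t) m.rm_so;
        st->eo[st->nmatches] = offset + (size_t) m.rm_eo;
        st->nmatches++;
        offset += (size_t) m.rm_eo;
    }
    regfree(&re);
    return 0;
}

/* "Repeat calls": return the next piece as (start, length), or 0 when done.
 * N matches yield N + 1 pieces, some of which may be empty. */
static int split_next(SplitState *st, size_t *start, size_t *length)
{
    size_t end;

    if (st->next > st->nmatches)
        return 0;
    *start = (st->next == 0) ? 0 : st->eo[st->next - 1];
    end = (st->next < st->nmatches) ? st->so[st->next] : st->len;
    *length = end - *start;
    st->next++;
    return 1;
}
```

Because the regex is compiled and freed inside split_init(), nothing in later calls depends on a cached compiled regex still being there, and each call to split_next() either advances `next` or reports that it is done, so iteration cannot stall.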