On Sat, 10 Feb 2007, Jeremy Drake wrote:
> On Sat, 10 Feb 2007, Neil Conway wrote:
>
> > * I'm not clear about the control flow in regexp_matches() and
> > regexp_split(). Presumably it's not possible for the call_cntr to
> > actually exceed max_calls, so the error message in these cases should be
> > elog(ERROR), not ereport (the former is for "shouldn't happen" bug
> > scenarios, the latter is for user-facing errors). Can you describe the
> > logic here (e.g. via comments) a bit more?
>
> I added some comments, and changed to using elog instead of ereport.
I fixed a couple more things in this patch. I changed the max_calls limit
to the real limit, rather than the arbitrarily high limit that was
previously set (three times the length of the string in bytes). Also, I
changed the offset checks to compare against wide_len rather than
orig_len: in multibyte character sets, orig_len is the length of the
string in bytes in whatever encoding it is in, while wide_len is the
length in characters, which is what everything else in these functions
deals with.
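
To illustrate the bytes-versus-characters distinction, here is a minimal
standalone sketch. It is not code from the patch: it assumes UTF-8 input,
and utf8_char_count() and offset_in_range() are hypothetical helpers that
stand in for the wide_len bookkeeping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): count the characters in a
 * NUL-terminated UTF-8 string by counting every byte that is not a
 * continuation byte (10xxxxxx).
 */
static size_t
utf8_char_count(const char *s)
{
    size_t      n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
    {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/*
 * Offsets are character indexes, so they have to be checked against the
 * character count. Checking against strlen() (the byte count) would
 * accept offsets past the last character of a multibyte string.
 */
static int
offset_in_range(const char *s, size_t char_offset)
{
    return char_offset < utf8_char_count(s);
}
```

For a 5-character string that takes 6 bytes, offset 5 passes a check
against the byte length but is correctly rejected by the check against
the character length.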
The calls to text_substr also have me somewhat concerned now. I think
performance starts to look like O(n^2) in multibyte character sets, but
doing anything about it would require this code to know more about the
internals of text than it has any right to. I guess we should settle for
correctness now, and address performance later if it turns out to be a
problem. I would hate to make this code even uglier due to premature
optimization...
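
A rough sketch of where the quadratic behavior would come from, assuming
that finding a character offset in a variable-width encoding means
scanning from the start of the string. This is not the text_substr
implementation: it assumes UTF-8, and char_to_byte_offset() and
total_scan_steps() are hypothetical names used only to count the work.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return the byte offset of the char_idx-th
 * character (0-based) of a UTF-8 string, scanning from the start.
 * *steps is incremented once for every byte the scan moves past.
 */
static size_t
char_to_byte_offset(const char *s, size_t char_idx, size_t *steps)
{
    size_t      i = 0;
    size_t      seen = 0;       /* character starts already passed */

    assert(s != NULL && steps != NULL);
    while (s[i] != '\0')
    {
        if (((unsigned char) s[i] & 0xC0) != 0x80)
        {
            if (seen == char_idx)
                break;          /* found the start of the wanted character */
            seen++;
        }
        i++;
        (*steps)++;
    }
    return i;
}

/*
 * Locate each of the first nchars characters independently, the way a
 * loop of substring calls by character offset would, and return the
 * total number of bytes scanned.
 */
static size_t
total_scan_steps(const char *s, size_t nchars)
{
    size_t      steps = 0;

    for (size_t k = 0; k < nchars; k++)
        (void) char_to_byte_offset(s, k, &steps);
    return steps;
}
```

With two-byte characters, doubling the string from 4 to 8 characters
raises the total from 12 to 56 scanned bytes, so the work grows roughly
with the square of the length. Avoiding that would mean carrying a byte
position forward between calls, which is exactly the kind of knowledge
about text internals mentioned above.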
--
When does summertime come to Minnesota, you ask?
Well, last year, I think it was a Tuesday.