> On Jan 3, 2021, at 7:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Over the holiday break I've been fooling with some regex performance
> improvements. I don't have anything ready to show yet in that line,
> but I was feeling the need for more-thorough test coverage, so I set
> to work on something that's been in the back of my mind for a long
> time: we need to absorb the test cases that Henry Spencer wrote way-
> back-when for that code, which up to now existed only as a script
> in the Tcl test suite. That state of affairs seemed OK to me in
> the beginning, when we thought of Tcl as the upstream for that code,
> and figured they'd vet any nontrivial changes. They've pretty
> thoroughly dropped the ball on that though, and indeed I think that
> they now believe *we're* the upstream. So we need to have a test
> suite that reflects that status, at least to the extent of running
> all the test cases that exist for that code.
>
> Accordingly, here's a new src/test/modules package that creates a
> function modeled on regexp_matches(), but using a set of flags that
> matches Spencer's design for the Tcl test suite, allowing parts
> of src/backend/regex/ to be tested that can't be reached with our
> existing SQL-exposed functions. The test scripts in the module
> reproduce all the tests in Tcl's "tests/reg.test" script as of
> Tcl 8.6.10, plus a few others that I felt advisable, such as tests
> for the lookbehind constraints we added a few years ago. (Note:
> Tcl also has regexp.test and regexpComp.test, but those seem to be
> oriented towards testing their language-specific wrappers not the
> regex engine itself.)
>
> According to my testing, this increases our test code coverage for
> src/backend/regex/ from 71.1% to 86.7%, which is not too shabby,
> especially seeing that a lot of the remainder is not-deterministically-
> reachable code for malloc failure handling.
>
> Thoughts? Is anyone interested in reviewing this? Since it's only
> test code, I'd be okay with pushing it without review, but I'd be
> happy if someone else wants to look at it.

I've quickly read this over and generally like it. Thanks for working on this!

Have you thought about whether, if it weren't in test/modules, it might be nice to expose test_regex from SQL with a
slightly different interface that doesn't throw on regex compilation error? Maybe something in contrib? It might be
useful to some users to validate regular expressions. I'm just asking... I don't have any problem with how you have it
here.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company