Re: Future of our regular expression code - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Future of our regular expression code
Msg-id 1863.1329594761@sss.pgh.pa.us
In response to Re: Future of our regular expression code  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Future of our regular expression code  (Simon Riggs <simon@2ndQuadrant.com>)
Re: Future of our regular expression code  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Re: Future of our regular expression code  (Brendan Jurd <direvus@gmail.com>)
List pgsql-hackers
Stephen Frost <sfrost@snowman.net> writes:
> * Simon Riggs (simon@2ndQuadrant.com) wrote:
>> Do we have volunteers that might save Tom from taking on this task?
>> It's not something that requires too much knowledge and experience of
>> PostgreSQL, so is an easier task for a newcomer.

> Sure, it doesn't require knowledge of PG, but I dare say there aren't
> very many newcomers who are going to walk in knowing how to manage
> complex regex code..  I haven't seen too many who can update gram.y,
> much less make our regex code handle Unicode better.  I'm all for
> getting other people to help with the code, of course, but I wouldn't
> hold my breath and leave existing bugs open on the hopes that someone's
> gonna show up.

Yeah ... if you *don't* know the difference between a DFA and an NFA,
you're likely to find yourself in over your head.  Having said that,
this is eminently learnable stuff and pretty self-contained, so somebody
who had the time and interest could make themselves into an expert in
a reasonable amount of time.  I'm not really eager to become the
project's regex guru, but only because I have ninety-nine other things
to do, not because I don't find it interesting.  Right at the moment I'm
probably far enough up the learning curve that I can fix the backref
problem faster than anyone else, so I'm kind of inclined to go do that.
But I'd be entirely happy to let someone else become the lead hacker in
this area going forward.  What we can't do is just pretend that it
doesn't need attention.

In the long run I do wish that Spencer's code would become a standalone
package and have more users than just us and Tcl, but that is definitely
work I don't have time for now.  I think somebody would need to commit
significant amounts of time over multiple years to give it any real hope
of success.

One immediate consequence of deciding that we are lead maintainers and
not just consumers is that we should put in some regression tests,
instead of taking the attitude that the Tcl guys are in charge of that.
I have a head cold today and am not firing on enough cylinders to do
anything actually complicated, so I was thinking of spending the
afternoon transliterating the Tcl regex test cases into SQL as a
starting point.
        regards, tom lane

