Re: [HACKERS] writing new regexp functions - Mailing list pgsql-patches
From | Jeremy Drake |
---|---|
Subject | Re: [HACKERS] writing new regexp functions |
Date | |
Msg-id | Pine.BSO.4.64.0702041254011.28908@resin.csoft.net |
In response to | Re: [HACKERS] writing new regexp functions (David Fetter <david@fetter.org>) |
List | pgsql-patches |
On Sun, 4 Feb 2007, David Fetter wrote:

> On Fri, Feb 02, 2007 at 07:01:33PM -0800, Jeremy Drake wrote:
> >
> > Let me know if you see any bugs or issues with this code, and I am
> > open to suggestions for further regression tests ;)
> >
> > Things that I still want to look into:
> > * regexp flags (a la regexp_replace).
>
> One more text field at the end is how the regexp_replace() one does
> it.

That's how I did it.

> > * maybe make regexp_matches return setof whatever, if given a 'g' flag
> >   return all matches in string.
>
> This is doable with current machinery, albeit a little clumsily.

I have implemented this too.

> > * maybe a join function that works as an aggregate
> >     SELECT join(',', col) FROM tbl
> >   currently can be written as
> >     SELECT array_to_string(ARRAY(SELECT col FROM tbl), ',')
>
> The array_accum() aggregate in the docs works OK for this purpose.

I have not tackled this yet, I think it may be better to stick with the
ARRAY() construct for now.

So, here is the new version of the code, and also a new version of the
patch to core, which fixes some compile warnings that I did not see at
first because I was using ICC rather than GCC.

Here is the README.regexp_ext from the tar file:

This package contains regexp functions beyond those currently provided in
core PostgreSQL, utilizing the regexp engine built into core.  This is
still a work-in-progress.

The most recent version of this code can be found at
http://www.jdrake.com/postgresql/regexp/regexp_ext.tar.gz
and the prerequisite patch to PostgreSQL core, which has been submitted
for review, can be found at
http://www.jdrake.com/postgresql/regexp/regexp-export.patch

The .tar.gz file expects to be untarred in contrib/.  I have made some
regression tests that can be run using 'make installcheck' as normal for
contrib.  I think they exercise the corner cases in the code, but I may
very well have missed some.  It requires the above mentioned patch to
core to compile, as it takes advantage of new exported functions from
src/backend/utils/adt/regexp.c.

Let me know if you see any bugs or issues with this code, and I am open to
suggestions for further regression tests ;)

Functions implemented in this module:
* regexp_split(str text, pattern text) RETURNS SETOF text
  regexp_split(str text, pattern text, flags text) RETURNS SETOF text
    returns each section of the string delimited by the pattern.
* regexp_matches(str text, pattern text) RETURNS text[]
    returns all capture groups when matching pattern against string in an
    array
* regexp_matches(str text, pattern text, flags text) RETURNS SETOF
    (prematch text, fullmatch text, matches text[], postmatch text)
    returns all capture groups when matching pattern against string in an
    array.  also returns the entire match in fullmatch.  if the 'g' option
    is given, returns all matches in the string.  if the 'r' option is
    given, also return the text before and after the match in prematch and
    postmatch respectively.

See the regression tests for more details about usage and return values.

Recent changes:
* I have put the pattern after the string in all of the functions, as
  discussed on the pgsql-hackers mailing list.
* regexp flags (a la regexp_replace).
* make regexp_matches return setof whatever, if given a 'g' flag return
  all matches in string.

Things that I still want to look into:
* maybe a join function that works as an aggregate
    SELECT join(',', col) FROM tbl
  currently can be written as
    SELECT array_to_string(ARRAY(SELECT col FROM tbl), ',')

--
Philogeny recapitulates erogeny; erogeny recapitulates philogeny.
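
For readers who want a feel for the semantics the README describes, here is a rough Python sketch of regexp_split and the three-argument regexp_matches. It is not the module's C code and it uses Python's re engine rather than the PostgreSQL regexp engine, so pattern syntax differs. Several details are assumptions that the README does not spell out: prematch and postmatch are taken relative to the whole input string, they are None when the 'r' flag is absent, and patterns are assumed not to match the empty string. The Match tuple name is made up for this illustration.

```python
import re
from typing import Iterator, List, NamedTuple, Optional


class Match(NamedTuple):
    # Hypothetical stand-in for the composite row
    # (prematch text, fullmatch text, matches text[], postmatch text).
    prematch: Optional[str]
    fullmatch: str
    matches: List[Optional[str]]
    postmatch: Optional[str]


def regexp_split(s: str, pattern: str) -> Iterator[str]:
    """Yield each section of s delimited by pattern.

    Assumes the pattern never matches the empty string; the README does
    not say how zero-length matches are handled.
    """
    pos = 0
    for m in re.finditer(pattern, s):
        yield s[pos:m.start()]
        pos = m.end()
    yield s[pos:]


def regexp_matches(s: str, pattern: str, flags: str = "") -> Iterator[Match]:
    """Yield one Match per hit: only the first, or all of them with 'g'."""
    want_all = "g" in flags
    want_context = "r" in flags
    for m in re.finditer(pattern, s):
        yield Match(
            # Assumption: context is measured against the whole string.
            prematch=s[:m.start()] if want_context else None,
            fullmatch=m.group(0),
            matches=list(m.groups()),
            postmatch=s[m.end():] if want_context else None,
        )
        if not want_all:
            break


if __name__ == "__main__":
    print(list(regexp_split("a,b,,c", ",")))
    for row in regexp_matches("foobarbaz", "(b)(a.)", "gr"):
        print(row)
```

Returning an iterator mirrors the SETOF return type: without 'g' the caller gets at most one row, with 'g' one row per match.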