Re: writing new regexp functions - Mailing list pgsql-hackers

From Jeremy Drake
Subject Re: writing new regexp functions
Date
Msg-id Pine.BSO.4.64.0702020011160.28908@resin.csoft.net
Whole thread Raw
In response to Re: writing new regexp functions  (David Fetter <david@fetter.org>)
Responses Re: writing new regexp functions  (Jeremy Drake <pgsql@jdrake.com>)
Re: writing new regexp functions  (Jeremy Drake <pgsql@jdrake.com>)
List pgsql-hackers
On Thu, 1 Feb 2007, David Fetter wrote:

> Yes, although it might have the same name, as in regex_match(pattern
> TEXT, string TEXT, return_pre_and_post BOOL).
>
> The data structure could be something like
>
> TYPE matches (
>     prematch TEXT,
>      match    TEXT[],
>      postmatch TEXT
> )

I just coded up for this:

CREATE FUNCTION regexp_matches(IN str text, IN pattern text) RETURNS
text[]   AS 'MODULE_PATHNAME', 'regexp_matches'   LANGUAGE C IMMUTABLE STRICT;

CREATE FUNCTION regexp_matches(       IN str text, IN pattern text, IN return_pre_and_post bool,       OUT prematch
text,OUT fullmatch text, OUT matches text[], OUT
 
postmatch text) RETURNS record   AS 'MODULE_PATHNAME', 'regexp_matches'   LANGUAGE C IMMUTABLE STRICT;


Which works like this:

jeremyd=# \pset null '\\N'
Null display is "\N".
jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$);regexp_matches
----------------{bar,beque}
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$, false);prematch | fullmatch |   matches   | postmatch
----------+-----------+-------------+-----------\N       | \N        | {bar,beque} | \N
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(beque)$re$, true);prematch | fullmatch |   matches   | postmatch
----------+-----------+-------------+-----------foo      | barbeque  | {bar,beque} | baz
(1 row)


And then you also have this behavior in the matches array:

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.*)(beque)$re$);regexp_matches
----------------{bar,"",beque}
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.+)(beque)$re$);regexp_matches
----------------\N
(1 row)

jeremyd=# select * from regexp_matches('foobarbequebaz',
$re$(bar)(.+)?(beque)$re$); regexp_matches
------------------{bar,NULL,beque}
(1 row)

Reasonable?

-- 
A.A.A.A.A.:An organization for drunks who drive


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Function proposal to find the type of a datum
Next
From: NikhilS
Date:
Subject: --enable-debug does not work with gcc