Re: patch adding new regexp functions - Mailing list pgsql-patches

From Alvaro Herrera
Subject Re: patch adding new regexp functions
Date
Msg-id 20070215155625.GM4682@alvh.no-ip.org
Whole thread Raw
In response to Re: patch adding new regexp functions  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: patch adding new regexp functions
List pgsql-patches
Tom Lane wrote:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > so that you would have the position for each match, automatically.  Is
> > this information available in the regex code?
>
> Certainly, that's where we got the text snippets from to begin with.
> However, I'm not sure that this is important enough to justify a special
> type --- for one thing, since we don't have arrays of composites, that
> would foreclose responding to Peter's concern that SETOF is the wrong
> thing.

My point is that if you want to have the order in which the matches were
found, you can do that easily by looking at the positions; no need to
create an ordered array.  Which does respond to Peter's concern, since
the point was to keep the ordering of matches, which an array does; but
if we provide the positions, the SETOF way does as well.

On the other hand, I don't think it's impossible to have matches that
start earlier than others in the string, but are actually found later
(say, because they are a parentized expression that ends later).  So
giving the starting positions allows one to know where are they located,
rather than where were they reported.  (I don't really know if the
matches are sorted before reporting though.)

> If you look at the Perl and Tcl APIs for regexes, they return
> just the strings, not the numerical positions; and I've not heard anyone
> complaining about that.

I know, but that may be just because it would be too much extra
complexity for them (in terms of user API) to be returning the positions
along the text.  I know I'd be fairly annoyed if =~ in Perl returned an
array of hashes { text => 'foo', position => 42} instead of array of
text.  We don't have that problem.

In fact, I would claim that's much easier to deal with a SETOF function
than is to deal with text[].

Regarding the "nobody complains" argument, I don't find that
particularly compelling; witness how people gets used to working around
limitations in MySQL ... ;-)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

pgsql-patches by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Autovacuum launcher
Next
From: Tom Lane
Date:
Subject: Re: Autovacuum launcher