Re: regexp_match() returning text - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: regexp_match() returning text
Date
Msg-id CAKFQuwZ7buEj2YM4AUu_CJ9OZf9FuMCdHK_mfW-9TrV5Vp6Q2g@mail.gmail.com
Whole thread Raw
In response to regexp_match() returning text  (Emre Hasegeli <emre@hasegeli.com>)
Responses Re: regexp_match() returning text
List pgsql-hackers
On Sun, May 29, 2016 at 1:28 PM, Emre Hasegeli <emre@hasegeli.com> wrote:
Attached patch adds regexp_match() function which is a simple variant of
regexp_matches() that doesn't return a set.  It is based on Tom Lane's
comment to bug #10889 [1].

[1] https://www.postgresql.org/message-id/23769.1404747766@sss.pgh.pa.us

Not sure if we need two functions or what but I dislike that:

"It ignores the parenthesized subexpressions in the pattern."

​The main problem being solved is the use of a SETOF result.  I'm inclined to prefer that the final, single, result is still an array.

Even if we return a single text value instead of an array it would be nice to return the first (only...) parenthesized subexpression so that the input pattern isn't constrained.

​I've got a style issue with the information_schema - I like to call it useless-use-of-E'' -  but that was there long before this patch...

/* user mustn't specify 'g' for regexp_split */ - do we add "or regexp_match" or just removed the extraneous detail?

There seems to be scope creep regarding "regexp_split_to_table" that I'm surprised to find.  Related to that is the unexpected removal of the "force_glob" parameter to setup_regexp_matches.  You took what was a single block of code and now duplicated it without any explanation in the commit message (a code comment wouldn't work for this kind of change).  The change to flags from passing a pointer to text to passing in a pointer to a previously derived pg_re_flags makes more sense on its face, and it is apparently a non-public API, but again constitutes a refactoring that at least would ideally be a separate commit from the one the introduces the new behavior.

Also, I see an assert for the "no subexpressions" policy but I have seemed to have overlooked where the non-assert-enabled user is informed of the prohibition.

Tom's opinion on all this will be needed - I am not a C code writer nor do I follow the patch writing process that closely generally, but I have a keen interest in this topic - but there seems to be a bit more here than simply having a function identical to the existing regexp_matches function that prohibits the global flag and only ever returns a single array per the existing API.

I'm good with the name, regexp_match - even with the behavior being changed to returning an array instead of text.  If there was any active plans to redo the API so that we'd return a composite type instead of an array I'd be more inclined to reserve "match" for that change and make this one a bit more verbose.

David J.

pgsql-hackers by date:

Previous
From: Dean Rasheed
Date:
Subject: Re: Re: Request for Patch Feedback: Lag & Lead Window Functions Can Ignore Nulls
Next
From: Andrew Gierth
Date:
Subject: Re: regexp_match() returning text