Re: regexp_match() returning text - Mailing list pgsql-hackers

From David G. Johnston
Subject Re: regexp_match() returning text
Date
Msg-id CAKFQuwYw3XJ_cnuqQ6QrmrRzrA83Fw-GvCoz+PVrPZQa++JJvA@mail.gmail.com
Whole thread Raw
In response to Re: regexp_match() returning text  (Emre Hasegeli <emre@hasegeli.com>)
Responses Re: regexp_match() returning text
List pgsql-hackers
On Saturday, June 4, 2016, Emre Hasegeli <emre@hasegeli.com> wrote:
> The main problem being solved is the use of a SETOF result.  I'm inclined to
> prefer that the final, single, result is still an array.

I have changed it like that.  New patch attached.

Good
 

> I've got a style issue with the information_schema - I like to call it
> useless-use-of-E'' -  but that was there long before this patch...

We better not touch it.

Agreed
 

> /* user mustn't specify 'g' for regexp_split */ - do we add "or
> regexp_match" or just removed the extraneous detail?

I don't think it would be a nice error message.

Just meant the comment.  I think they look good now though the part about force global in the split area just says what the code does and not why.  Namely that when splitting we find all matches so the input is completely split.  We disallow a user specification of global for some arbitrary reason; though I don't have any reason to change that and the present behavior doesn't cause people to complain.
 

> There seems to be scope creep regarding "regexp_split_to_table" that I'm
> surprised to find.  Related to that is the unexpected removal of the
> "force_glob" parameter to setup_regexp_matches.  You took what was a single
> block of code and now duplicated it without any explanation in the commit
> message (a code comment wouldn't work for this kind of change).  The change
> to flags from passing a pointer to text to passing in a pointer to a
> previously derived pg_re_flags makes more sense on its face, and it is
> apparently a non-public API, but again constitutes a refactoring that at
> least would ideally be a separate commit from the one the introduces the new
> behavior.

That check doesn't belong to setup_regexp_matches() in the first place.
The arguments of the function are organised to be caller agnostic,
and then it gives an error on behalf of regexp_split().  The check
fits better to the regexp_split() functions even with duplication.

I can split it to another patch, but I think these kind of changes most
often go together.

After reading this again I understand better what is going on and agree with the changes as written.
 

Would you mind adding yourself to the reviewers on the Commitfest app?
I think you have already read though it.


Done.

I didn't compile either patch but given the scope and complexity I'd say it is ready for committer without that confirmed.  Tom usually touches the regexp code and I'm pretty sure he'll look at this with an eye no one else has.  Though I wouldn't expect anything until work on 10 begins in earnest.

David J.

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: pg9.6 segfault using simple query (related to use fk for join estimates)
Next
From: "David G. Johnston"
Date:
Subject: Re: Logical decoding slots can go backwards when used from SQL, docs are wrong