Thread: Regexp named capture groups

Regexp named capture groups

From: Joel Jacobson

Hi hackers,

Is anyone working on this feature[1] also for PostgreSQL's regex engine?

I'm thinking it could work something like this:

joel=# \df regexp_match
                              List of functions
   Schema   |     Name     | Result data type | Argument data types |  Type
------------+--------------+------------------+---------------------+--------
 pg_catalog | regexp_match | jsonb            | text, text          | normal
 pg_catalog | regexp_match | jsonb            | text, text, text    | normal
(2 rows)

joel=#* SELECT regexp_match_named(
joel(#*    '2018-12-31',
joel(#*    '(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
joel(#* );
              regexp_match_named
----------------------------------------------
 {"day": "31", "year": "2018", "month": "12"}
(1 row)

I think this feature would be awesome, for the reasons mentioned in [1], quote:

"Referring to capture groups via numbers has several disadvantages:
1. Finding the number of a capture group is a hassle: you have to
count parentheses.
2. You need to see the regular expression if you want to understand
what the groups are for.
3. If you change the order of the capture groups, you also have to
change the matching code."

[1] http://2ality.com/2017/05/regexp-named-capture-groups.html

Best regards,

Joel Jacobson


Re: Regexp named capture groups

From: Pavel Stehule

2018-02-03 11:19 GMT+01:00 Joel Jacobson <joel@trustly.com>:
> Hi hackers,
>
> Is anyone working on this feature[1] also for PostgreSQL's regex engine?
>
> I'm thinking it could work something like this:
>
> joel=# \df regexp_match
>                               List of functions
>    Schema   |     Name     | Result data type | Argument data types |  Type
> ------------+--------------+------------------+---------------------+--------
>  pg_catalog | regexp_match | jsonb            | text, text          | normal
>  pg_catalog | regexp_match | jsonb            | text, text, text    | normal
> (2 rows)
>
> joel=#* SELECT regexp_match_named(
> joel(#*    '2018-12-31',
> joel(#*    '(?<year>[0-9]{4})-(?<month>[0-9]{2})-(?<day>[0-9]{2})'
> joel(#* );
>               regexp_match_named
> ----------------------------------------------
>  {"day": "31", "year": "2018", "month": "12"}
> (1 row)
>
> I think this feature would be awesome, for the reasons mentioned in [1], quote:
>
> "Referring to capture groups via numbers has several disadvantages:
> 1. Finding the number of a capture group is a hassle: you have to
> count parentheses.
> 2. You need to see the regular expression if you want to understand
> what the groups are for.
> 3. If you change the order of the capture groups, you also have to
> change the matching code."
>
> [1] http://2ality.com/2017/05/regexp-named-capture-groups.html

Looks like a nice feature.

Pavel
 


> Best regards,
>
> Joel Jacobson


Re: Regexp named capture groups

From: Michael Paquier

On Sat, Feb 03, 2018 at 01:55:31PM +0100, Pavel Stehule wrote:
> 2018-02-03 11:19 GMT+01:00 Joel Jacobson <joel@trustly.com>:
>> Is anyone working on this feature[1] also for PostgreSQL's regex
>> engine?

Not that I know of.

>> I think this feature would be awesome, for the reasons mentioned in [1],
>> quote:
>>
>> "Referring to capture groups via numbers has several disadvantages:
>> 1. Finding the number of a capture group is a hassle: you have to
>> count parentheses.
>> 2. You need to see the regular expression if you want to understand
>> what the groups are for.
>> 3. If you change the order of the capture groups, you also have to
>> change the matching code."
>>
>> [1] http://2ality.com/2017/05/regexp-named-capture-groups.html
>
> looks like nice feature

Yes, it looks like this could simplify equivalent queries, which I
guess would currently use a CTE to achieve the same result.
--
Michael
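
For readers following the thread, here is a rough sketch of the kind of CTE-based query that seems to be meant above, using only functions that already exist (regexp_match, available since PostgreSQL 10, and jsonb_build_object). The key names and the input string are taken from Joel's example. The alias names m, g and parts are illustrative assumptions, and nobody in the thread posted this query.

```sql
-- Numbered capture groups: the pattern itself carries no names.
WITH m AS (
    SELECT regexp_match(
        '2018-12-31',
        '([0-9]{4})-([0-9]{2})-([0-9]{2})'
    ) AS g   -- g is a text[] with one element per capture group
)
-- The names have to be attached by position, outside the pattern.
SELECT jsonb_build_object(
    'year',  g[1],
    'month', g[2],
    'day',   g[3]
) AS parts
FROM m;
```

The positional indexes g[1] to g[3] are the bookkeeping the quoted list of disadvantages describes: reordering the groups in the pattern means renumbering them here as well. If the pattern does not match, regexp_match returns NULL, so this sketch would yield an object whose values are all JSON null instead of a NULL result.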
