Thread: BUG #12609: use of regexp_matches drops rows when there is no match

BUG #12609: use of regexp_matches drops rows when there is no match

From
trevor@adchemix.com
Date:
The following bug has been logged on the website:

Bug reference:      12609
Logged by:          Trevor J Peschek
Email address:      trevor@adchemix.com
PostgreSQL version: 9.3.3
Operating system:   Red Hat 4.6.3-2
Description:

When using regexp_matches while referencing a location in the array, if the
match fails, the entire row is skipped instead of just a null for the column
using regexp_matches.  Example:

select
        aColumn,
        (regexp_matches(queryCol, '(https?)://(.*?)/(.*?)\?(.*?)'))[3]
from myTable

Even if the column with the regexp_matches returns a null, we would expect
the row to be populated with aColumn and then a null.  What happens though
is nothing is returned.

Re: BUG #12609: use of regexp_matches drops rows when there is no match

From
David G Johnston
Date:
trevor wrote
> The following bug has been logged on the website:
>
> Bug reference:      12609
> Logged by:          Trevor J Peschek
> Email address:

> trevor@

> PostgreSQL version: 9.3.3
> Operating system:   Red Hat 4.6.3-2
> Description:
>
> When using regexp_matches while referencing a location in the array, if
> the
> match fails, the entire row is skipped instead of just a null for the
> column
> using regexp_matches.  Example:
>
> select
>         aColumn,
>         (regexp_matches(queryCol, '(https?)://(.*?)/(.*?)\?(.*?)'))[3]
> from myTable
>
> Even if the column with the regexp_matches returns a null, we would expect
> the row to be populated with aColumn and then a null.  What happens though
> is nothing is returned.

Working as designed though I'll agree that it is somewhat surprising.  The
main problem is that it doesn't "return NULL" but instead returns the empty
set which suppresses the row it is attached to.

You must wrap the regexp_matches into a scalar subquery to get this to work:

SELECT aColumn, (SELECT regexp_matches(queryCo, '...')) FROM myTable

Personally I would suggest writing a simple function, call it
regexp_matches_single(), that returns a single array instead of a set.  In
almost all cases when you are doing stuff like this you expect one match or
you wish to return null.  Returning multiple rows is typically a problem and
you will need to decide what to do inside your function if it occurs (e.g.
return first match or throw an exception).

David J.



--
View this message in context:
http://postgresql.nabble.com/BUG-12609-use-of-regexp-matches-drops-rows-when-there-is-no-match-tp5834794p5834795.html
Sent from the PostgreSQL - bugs mailing list archive at Nabble.com.