Re: Using POSIX Regular Expressions on xml type fields gives inconsistent results - Mailing list pgsql-general

From David Johnston
Subject Re: Using POSIX Regular Expressions on xml type fields gives inconsistent results
Date
Msg-id 009101cddf94$cc940920$65bc1b60$@yahoo.com
Whole thread Raw
In response to Re: Using POSIX Regular Expressions on xml type fields gives inconsistent results  (Denis Papathanasiou <denis.papathanasiou@banrai.com>)
List pgsql-general
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
> owner@postgresql.org] On Behalf Of Denis Papathanasiou
> Sent: Friday, December 21, 2012 10:43 AM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] Using POSIX Regular Expressions on xml type fields
> gives inconsistent results
>
> On 12/21/2012 10:35 AM, David Johnston wrote:
> >>
> >> If you look at the four examples which follow the posix match table
> >> in the docs (http://www.postgresql.org/docs/9.1/static/functions-
> >> matching.html#FUNCTIONS-POSIX-TABLE),
> >> some of them work from the left side, e.g.:
> >>
> >> 'abc' ~ '(b|d)'  true
> >>
> >> In my original example, I found I could write this from left to right
> >> like
> > this,
> >> and it would still work:
> >>
> >> '(b|d)' ~ 'abc'  true
> >
> > Really???
> >
> > Testing on Windows 9.0.4 this expression returns FALSE, not true as
> > you claim.  Please try again and reply with detailed version
> > information and the exact query(s) used if you can get the behavior to
> repeat itself.
>
> => select id from form_d where '(kumar|gonzales)' ~* any(
> CAST(xpath('//PersonsList/PersonName/LastName/text()', data) as TEXT[])
> );
>
> This expression returned (correctly) the information I wanted.
>
> Shouldn't it *not* have worked b/c the '(kumar|gonzales)' regex is on the
> left?

It worked for three reasons:

1) The expression on the right-hand side was a valid regular expression
2) The value on the left-hand side contained a string that happened to
exactly match the expression on the right-hand side
3) The "(kumar|gonzales)" text, while it happens to look like a regular
expression, is just plain text because it appears on the left-hand side of
the operator.

Arguably "regular expression" should have its own data type just like JSON
and XML since RegEx is text-like but with special validation
characteristics.  Lacking that the system is unable to help in this
situation.  You may have gotten help if someone had a name that did not
resolve to a valid regular expression - in which case the match attempt
would have raised an error.

You said above that '(b|d') ~ 'abc' returned TRUE for you.  Did you actually
test that exact (simple) expression or did you immediately jump to your
convoluted example with XML and ANY(array)?

>
> I was hoping to be able to use the xml type field to be able to do '^a'
> type searches on names (e.g. a search for 'Fred' would also match
'Frederick',
> etc.) but since it seems that I cannot, I'll look at different ways of
solving this
> problem.
>

See Tom's suggestion of creating a custom function and operator that
reverses the order of the two text fields.

David J.




pgsql-general by date:

Previous
From: "jg"
Date:
Subject: Re: Coalesce bug ?
Next
From: Tom Lane
Date:
Subject: Re: Using POSIX Regular Expressions on xml type fields gives inconsistent results