Re: regexp idea - Mailing list pgsql-general

From David Johnston
Subject Re: regexp idea
Date
Msg-id 1377630939277-5768731.post@n5.nabble.com
In response to regexp idea  (AI Rumman <rummandba@gmail.com>)
List pgsql-general
rummandba wrote
> Hi,
>
> I have a string like:
> Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A
> final, June 5, 2013
>
> I need to extract the date part from the string.
>
> I used the following:
> regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
>
> But it gives me the result "August" because it matches inside "Augustine".
>
> In my case, the date can be in different formats: some records may use ","
> and some may not.
>
> Any idea to achieve this?
>
> Thanks.

Not sure how you expect to match "June" with that particular expression but
to solve the mis-matching of "Augustine" you can use the word-boundary
escapes "\m" (word-start) and "\M" (word-end).

Unless you need fuzzy matching on the month name you should simply list all
twelve months and possible recognized abbreviations as well.

^.*\m(June|July|August|September)\M[, a-zA-Z0-9]+
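To see the effect of the boundaries outside the database, here is a minimal sketch in Python. It is an illustration only, not PostgreSQL: Python's "re" module has no "\m" or "\M", so "\b" (a generic word boundary) stands in for both, and the variable names are made up for the example.

```python
import re

# Sample title from the original question.
title = ("Gloucester Catholic vs. St. Augustine baseball, "
         "South Jersey Non-Public A final, June 5, 2013")

# Without boundaries, "August" is found inside "Augustine".
loose = re.search(r"(June|July|August|September)", title)

# With \b on both sides (standing in for PostgreSQL's \m and \M),
# only a whole-word month name can match.
strict = re.search(r"\b(June|July|August|September)\b", title)

print(loose.group(1))   # August
print(strict.group(1))  # June
```

The unbounded search stops at the first place any alternative fits, which is why "Augustine" wins over the real date later in the string.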

I'd consider helping more with forming an actual expression but a single
input sample with zero context on how such a string is created gives little
to work with.

Since the month name is most likely followed by separators and digits
rather than more letters, a tighter definition would be:

\m(August)[, ]+(\d+)[, ]+(\d+)
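As a rough sketch of how that shape extends to all twelve months, here is a Python version. The assumptions are mine: "\b" again stands in for "\m"/"\M", the list of abbreviations is a guess at what the data might contain, and extract_date is a hypothetical helper name. Note the quantifier sits inside the group, "(\d+)", so a two-digit day is captured whole.

```python
import re

# Full month names plus a guessed set of abbreviations (an assumption;
# adjust to whatever actually appears in the data).
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December|"
          "Jan|Feb|Mar|Apr|Jun|Jul|Aug|Sept|Sep|Oct|Nov|Dec")

# month, then commas/spaces, then day, then commas/spaces, then year
DATE_RE = re.compile(r"\b(" + MONTHS + r")\b[, ]+(\d+)[, ]+(\d+)")

def extract_date(title):
    """Return (month, day, year) as strings, or None if no date is found."""
    m = DATE_RE.search(title)
    return m.groups() if m else None

print(extract_date("St. Augustine baseball, final, June 5, 2013"))
print(extract_date("Opening day Aug 31 2013"))
print(extract_date("St. Augustine baseball"))
```

Because "[, ]+" accepts any mix of commas and spaces, the same pattern covers records written with or without the comma after the day.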

HTH

David J.





