Thread: regexp idea
Hi,
I have a string like:
Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013
I need to extract date part from the string.
I used the follows:
regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
But it gives me result August as it stops at "Augustine".
In my case, date can be in different formats, some record may use "," or some may not.
Any idea to achieve this?
Thanks.
rummandba wrote > Hi, > > I have a string like: > Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A > final, June 5, 2013 > > I need to extract date part from the string. > > I used the follows: > regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' ) > > But it gives me result August as it stops at "Augustine". > > In my case, date can be in different formats, some record may use "," or > some may not. > > Any idea to achieve this? > > Thanks. Not sure how you expect to match "June" with that particular expression but to solve the mis-matching of "Augustine" you can use the word-boundary escapes "\m" (word-start) and "\M" (word-end). Unless you need fuzzy matching on the month name you should simply list all twelve months and possible recognized abbreviations as well. ^.*\m(June|July|August|September)\M[, a-zA-Z0-9]+ I'd consider helping more with forming an actual expression but a single input sample with zero context on how such a string is created gives little to work with. Though after the month there likely cannot be a letter so a better definition would be: \m(August)[, ]+(\d)+[, ]+(\d+) HTH David J. -- View this message in context: http://postgresql.1045698.n5.nabble.com/regexp-idea-tp5768725p5768731.html Sent from the PostgreSQL - general mailing list archive at Nabble.com.
On 08/27/2013 12:44 PM, AI Rumman wrote: > Hi, > > I have a string like: > Gloucester Catholic vs. St. Augustine baseball, South Jersey > Non-Public A final, June 5, 2013 > > I need to extract date part from the string. > > I used the follows: > regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' ) > > But it gives me result August as it stops at "Augustine". > > In my case, date can be in different formats, some record may use "," > or some may not. > > Any idea to achieve this? > > Thanks. select regexp_replace('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013', E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)', E'\\2 \\4' ); +----------------+ | regexp_replace | +----------------+ | June 5, 2013 | +----------------+ (1 row)
Thanks. That's awesome.
Do you have any good guide where I may get more knowledge on REGEXP?
On Tue, Aug 27, 2013 at 3:57 PM, Rob Sargent <robjsargent@gmail.com> wrote:
select regexp_replace('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013', E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)', E'\\2 \\4' );On 08/27/2013 12:44 PM, AI Rumman wrote:Hi,
I have a string like:
Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013
I need to extract date part from the string.
I used the follows:
regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
But it gives me result August as it stops at "Augustine".
In my case, date can be in different formats, some record may use "," or some may not.
Any idea to achieve this?
Thanks.
+----------------+
| regexp_replace |
+----------------+
| June 5, 2013 |
+----------------+
(1 row)
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 08/27/2013 02:04 PM, AI Rumman wrote:
http://www.postgresql.org/docs/9.1/static/functions-matching.html is the best starting pointThanks. That's awesome.Do you have any good guide where I may get more knowledge on REGEXP?On Tue, Aug 27, 2013 at 3:57 PM, Rob Sargent <robjsargent@gmail.com> wrote:select regexp_replace('Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013', E'(^.*)(\\m(June|July|August|Sep))([, a-zA-Z0-9]+)', E'\\2 \\4' );On 08/27/2013 12:44 PM, AI Rumman wrote:Hi,
I have a string like:
Gloucester Catholic vs. St. Augustine baseball, South Jersey Non-Public A final, June 5, 2013
I need to extract date part from the string.
I used the follows:
regexp_matches(title,'[.* ]+\ (Jul|August|Sep)[, a-zA-Z0-9]+' )
But it gives me result August as it stops at "Augustine".
In my case, date can be in different formats, some record may use "," or some may not.
Any idea to achieve this?
Thanks.
+----------------+
| regexp_replace |
+----------------+
| June 5, 2013 |
+----------------+
(1 row)
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
AI Rumman escribió: > Thanks. That's awesome. > Do you have any good guide where I may get more knowledge on REGEXP? This book is awesome: http://regex.info/book.html -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services