Thread: How to extract a substring using Regex

How to extract a substring using Regex

From
"Postgres User"
Date:
Hi,

I'm new to Regex in Postgres.  Can someone give me a quick pointer on
how I'd SELECT the substring between   '<name>'    and  '</name>'  in
a field?

Sample field data:
address city here <name>Rogers, Jim</name> zip code place

and I'd like the SELECT to return only:
Rogers, Jim

Thanks!

Re: How to extract a substring using Regex

From
Michael Glaesemann
Date:
On Aug 23, 2007, at 19:33 , Postgres User wrote:

> I'm new to Regex in Postgres.  Can someone give me a quick pointer on
> how I'd SELECT the substring between   '<name>'    and  '</name>'  in
> a field?

Check out regexp_replace:

http://www.postgresql.org/docs/8.2/interactive/functions-
matching.html#FUNCTIONS-POSIX-REGEXP

One of the forms of substring might work for you, too.

Michael Glaesemann
grzm seespotcode net



Re: How to extract a substring using Regex

From
"Postgres User"
Date:
Yes, I read the manual.  I think I had a problem because of the
special chars (< / >) that I'm trying to search for...  Still looking
for the right syntax.

On 8/23/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Aug 23, 2007, at 19:33 , Postgres User wrote:
>
> > I'm new to Regex in Postgres.  Can someone give me a quick pointer on
> > how I'd SELECT the substring between   '<name>'    and  '</name>'  in
> > a field?
>
> Check out regexp_replace:
>
> http://www.postgresql.org/docs/8.2/interactive/functions-
> matching.html#FUNCTIONS-POSIX-REGEXP
>
> One of the forms of substring might work for you, too.
>
> Michael Glaesemann
> grzm seespotcode net
>
>
>

Re: How to extract a substring using Regex

From
Michael Glaesemann
Date:
[Please don't top post as it makes the discussion more difficult to
follow, and please reply to the list so that others may benefit from
and participate in the discussion.]

On Aug 23, 2007, at 19:49 , Postgres User wrote:

> Yes, I read the manual.  I think I had a problem because of the
> special chars (< / >) that I'm trying to search for...  Still looking
> for the right syntax.

Why don't you show us what you've tried and the errors you're
getting? That way we can help you figure out what you're doing wrong
rather than just give you an answer.

Michael Glaesemann
grzm seespotcode net



Re: How to extract a substring using Regex

From
"Postgres User"
Date:
> > Yes, I read the manual.  I think I had a problem because of the
> > special chars (< / >) that I'm trying to search for...  Still looking
> > for the right syntax.
>
> Why don't you show us what you've tried and the errors you're
> getting? That way we can help you figure out what you're doing wrong
> rather than just give you an answer.
>
> Michael Glaesemann

SELECT substring(data_field from '<name>(.)</name>')
FROM myTable

Re: How to extract a substring using Regex

From
Michael Glaesemann
Date:
On Aug 23, 2007, at 20:01 , Postgres User wrote:

>>> Yes, I read the manual.  I think I had a problem because of the
>>> special chars (< / >) that I'm trying to search for...  Still
>>> looking
>>> for the right syntax.
>>
>> Why don't you show us what you've tried and the errors you're
>> getting? That way we can help you figure out what you're doing wrong
>> rather than just give you an answer.
>>
>> Michael Glaesemann
>
> SELECT substring(data_field from '<name>(.)</name>')
> FROM myTable

Looks like you might want to brush up on regular expressions in
general. Your expression will match a single character between the
<name> tags. You might want to try something like .+ instead.

Michael Glaesemann
grzm seespotcode net



Re: How to extract a substring using Regex

From
"Postgres User"
Date:
On 8/23/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
>
> On Aug 23, 2007, at 20:01 , Postgres User wrote:
>
> >>> Yes, I read the manual.  I think I had a problem because of the
> >>> special chars (< / >) that I'm trying to search for...  Still
> >>> looking
> >>> for the right syntax.
> >>
> >> Why don't you show us what you've tried and the errors you're
> >> getting? That way we can help you figure out what you're doing wrong
> >> rather than just give you an answer.
> >>
> >> Michael Glaesemann
> >
> > SELECT substring(data_field from '<name>(.)</name>')
> > FROM myTable
>
> Looks like you might want to brush up on regular expressions in
> general. Your expression will match a single character between the
> <name> tags. You might want to try something like .+ instead.
>
> Michael Glaesemann

You're right, that was a typo, I didn't copy and paste.
I found the problem, I was using 2 forward slashes instead of a
backslash + forward slash when pattern matching.  The correct regex to
extract my substring:

substring(data_field from '<name>(.+)<\/name>')

Re: How to extract a substring using Regex

From
Michael Glaesemann
Date:
On Aug 23, 2007, at 21:08 , Postgres User wrote:

> You're right, that was a typo, I didn't copy and paste.
> I found the problem, I was using 2 forward slashes instead of a
> backslash + forward slash when pattern matching.  The correct regex to
> extract my substring:
>
> substring(data_field from '<name>(.+)<\/name>')

I don't think the backslash is actually doing anything, (though two
forward slashes would definitely affect your result) as the slash
doesn't have a special meaning in the regexp.

test=# select substring('address city here <name>Rogers, Jim</name>
zip code place' from '<name>(.+)<\/name>');
   substring
-------------
Rogers, Jim
(1 row)

test=# select substring('address city here <name>Rogers, Jim</name>
zip code place' from '<name>(.+)</name>');
   substring
-------------
Rogers, Jim
(1 row)

Some scripting languages that use slashes to delimit regular
expressions, and therefore require slashes to be escaped, because
otherwise the slash would prematurely end the regexp.

In past versions of PostgreSQL, a backslash was used to escape single
quotes and enter other characters (e.g., \n). This is contrary to the
SQL spec, so you can now turn off this behavior by turning on
standard_conforming_strings. You'll see warnings if you use a
backslash in 8.2 with standard_conforming_strings off.

test=# show standard_conforming_strings;
standard_conforming_strings
-----------------------------
off
(1 row)

test=# select substring('address city here <name>Rogers, Jim</name>
zip code place' from '<name>(.+)<\/name>');
WARNING:  nonstandard use of escape in a string literal
LINE 1: ...ere <name>Rogers, Jim</name> zip code place' from '<name>(.
+...
                                                              ^
HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
   substring
-------------
Rogers, Jim
(1 row)

Michael Glaesemann
grzm seespotcode net


Re: How to extract a substring using Regex

From
Mark Cave-Ayland
Date:
On Thu, 2007-08-23 at 19:08 -0700, Postgres User wrote:
> On 8/23/07, Michael Glaesemann <grzm@seespotcode.net> wrote:
> >
> > On Aug 23, 2007, at 20:01 , Postgres User wrote:
> >
> > >>> Yes, I read the manual.  I think I had a problem because of the
> > >>> special chars (< / >) that I'm trying to search for...  Still
> > >>> looking
> > >>> for the right syntax.
> > >>
> > >> Why don't you show us what you've tried and the errors you're
> > >> getting? That way we can help you figure out what you're doing wrong
> > >> rather than just give you an answer.
> > >>
> > >> Michael Glaesemann
> > >
> > > SELECT substring(data_field from '<name>(.)</name>')
> > > FROM myTable
> >
> > Looks like you might want to brush up on regular expressions in
> > general. Your expression will match a single character between the
> > <name> tags. You might want to try something like .+ instead.
> >
> > Michael Glaesemann
>
> You're right, that was a typo, I didn't copy and paste.
> I found the problem, I was using 2 forward slashes instead of a
> backslash + forward slash when pattern matching.  The correct regex to
> extract my substring:
>
> substring(data_field from '<name>(.+)<\/name>')

FWIW, I find the following site extremely useful when trying to create
moderately complex regular expressions: http://www.rexv.org.


HTH,

Mark.

--
ILande - Open Source Consultancy
http://www.ilande.co.uk



Re: [OT - sorta] How to extract a substring using Regex

From
Josh Trutwin
Date:
On Fri, 24 Aug 2007 06:31:58 +0100
Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> wrote:

> > substring(data_field from '<name>(.+)<\/name>')
>
> FWIW, I find the following site extremely useful when trying to
> create moderately complex regular expressions: http://www.rexv.org.

Nice site - here's another good one for an installed app that is
excellent for even advanced regexp's:

http://weitz.de/regex-coach/

It used to be cross-platform but now he only develops a Windows
version.

Thanks,

Josh