Re: Regex problem - Mailing list pgsql-general

From Tom Lane
Subject Re: Regex problem
Date
Msg-id 12743.1215717734@sss.pgh.pa.us
In response to Regex problem  ("Scott Marlowe" <scott.marlowe@gmail.com>)
List pgsql-general
"Scott Marlowe" <scott.marlowe@gmail.com> writes:
> ...Which is not surprising.  It's greedy.  So, I turn off the greediness
> of the first + with a ? and then I get this

> select substring (notes from E'LONG DB QUERY.+?time: [0-9]+.[0-9]+')
> from table where id=1;

> LONG DB QUERY (db1, 4.9376289844513): UPDATE force_session SET
> last_used_timestamp = 'now'::timestamp WHERE orgid = 15723 AND
> session_id = 'f5ca5ec95965e8ac99ec9bc31eca84c6New session created
> time: 5.0

> Now, I'm pretty sure that with the [0-9]+.[0-9]+ I should be getting
> 5.03999090194 at the end.

You're getting bitten by the fact that the initial non-greedy quantifier
makes the entire regex non-greedy; see the rules in section 9.7.3.5:
http://developer.postgresql.org/pgdocs/postgres/functions-matching.html#POSIX-MATCHING-RULES
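
As a small illustration of that rule, the two queries below follow the example given in the PostgreSQL documentation's matching rules. The column aliases are added here for readability and are not part of the documented example.

```sql
-- Every quantifier is greedy, so the RE as a whole is greedy:
-- Y* takes the Y and {1,3} takes as many digits as it is allowed.
SELECT substring('XY1234Z' from 'Y*([0-9]{1,3})') AS greedy;       -- 123

-- The first quantifier, Y*?, is non-greedy, so the RE as a whole
-- prefers the shortest overall match ("Y1"). The greedy {1,3} cannot
-- change that overall length, so it is left with a single digit.
SELECT substring('XY1234Z' from 'Y*?([0-9]{1,3})') AS non_greedy;  -- 1
```

The same thing happens in the query above: the leading .+? decides the overall match length, and the later [0-9]+ quantifiers only get what that shortest match leaves them.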

If you know that there will always be something after the first time
value, you could do something like

E'(LONG DB QUERY.+?time: [0-9]+\\.[0-9]+)[^0-9]'

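to force the issue about how much the second and third quantifiers
match: the match now has to end on a non-digit, so the last [0-9]+
must run to the end of the digit string, and substring() returns only
the parenthesized part.  (The dot is also escaped here, so it matches
only a literal period.)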
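
A minimal sketch of the before and after, assuming a made-up, shortened notes value (the literal string below is hypothetical and only mimics the shape of the data in the question; it is not the real row):

```sql
-- Hypothetical sample text standing in for the "notes" column.
-- Original pattern: the leading .+? makes the whole RE non-greedy,
-- so the match stops as early as it can, at "time: 5.0".
SELECT substring(
         'LONG DB QUERY (db1, 4.93): UPDATE t SET x = 1 time: 5.03999090194 done'
         from E'LONG DB QUERY.+?time: [0-9]+\\.[0-9]+'
       ) AS truncated;

-- Suggested pattern: the trailing [^0-9] has to match a non-digit, so
-- the final [0-9]+ must cover all the digits; substring() then returns
-- only the parenthesized part, ending in "time: 5.03999090194".
SELECT substring(
         'LONG DB QUERY (db1, 4.93): UPDATE t SET x = 1 time: 5.03999090194 done'
         from E'(LONG DB QUERY.+?time: [0-9]+\\.[0-9]+)[^0-9]'
       ) AS full_value;
```

Note that the second form returns NULL if the time value is the very last thing in the text, since [^0-9] then has nothing to match; that is the "always something after" condition mentioned above.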

            regards, tom lane
