Re: Pathological regexp match - Mailing list pgsql-hackers

From	Alvaro Herrera
Subject	Re: Pathological regexp match
Date	January 29, 2010 03:21:57
Msg-id	20100129042142.GF1793@alvh.no-ip.org
In response to	Re: Pathological regexp match (Michael Glaesemann <michael.glaesemann@myyearbook.com>)
Responses	Re: Pathological regexp match (Michael Glaesemann <michael.glaesemann@myyearbook.com>)
List	pgsql-hackers

Michael Glaesemann wrote:

> However, as you point out, Postgres doesn't appear to take this into
> account:
> 
> postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*(\2))$r$, $s$X$s$);
>  regexp_replace
> ----------------
>  oooXooo
> (1 row)
> 
> postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*?(\2))$r$, $s$X$s$);
>  regexp_replace
> ----------------
>  oooXooo
> (1 row)

I think the reason for this is that the first * is greedy and thus the
entire expression is considered greedy.  The fact that you've made the
second * non-greedy does not ungreedify the RE ... Note the docs say:

    The above rules associate greediness attributes not only with
    individual quantified atoms, but with branches and entire REs that
    contain quantified atoms. What that means is that the matching is
    done in such a way that the branch, or whole RE, matches the longest
    or shortest possible substring as a whole.
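
A small illustration of that rule, as a sketch: the two queries below follow the example given in the same section of the PostgreSQL pattern-matching documentation. The only difference between them is the greediness of the first quantifier, which is what decides whether the RE as a whole prefers the longest or the shortest match. The expected results in the comments are the ones the documentation reports.

```sql
-- Y* is greedy, so the RE as a whole is greedy:
-- the longest possible match starting at the Y is taken (Y123).
SELECT substring('XY1234Z', 'Y*([0-9]{1,3})');    -- 123

-- Y*? is non-greedy, so the RE as a whole is non-greedy:
-- the shortest possible match starting at the Y is taken (Y1),
-- even though {1,3} on its own is a greedy quantifier.
SELECT substring('XY1234Z', 'Y*?([0-9]{1,3})');   -- 1
```

The same thing seems to be happening in the queries in this thread: the first quantifier sets the preference for the whole RE, and the later ones cannot change the overall match length.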

It's late here so I'm not sure if this is what you're looking for:

alvherre=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*?A.*(\2))$r$, $s$X$s$);
 regexp_replace
----------------
 oooXooQooQooo
(1 row)

(Obviously the non-greediness has moved somewhere else) :-(

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
