Re: Regular expression question with Postgres - Mailing list pgsql-general

From Tom Lane
Subject Re: Regular expression question with Postgres
Msg-id 18739.1406235427@sss.pgh.pa.us
In response to Regular expression question with Postgres  (Mike Christensen <mike@kitchenpc.com>)
Responses Re: Regular expression question with Postgres  (Mike Christensen <mike@kitchenpc.com>)
List pgsql-general
Mike Christensen <mike@kitchenpc.com> writes:
> I'm curious why this query returns 0:
>
> SELECT 'AAA' ~ '^A{,4}$'
>
> Yet, this query returns 1:
>
> SELECT 'AAA' ~ '^A{0,4}$'
>
> Is this a bug with the regular expression engine?

Our regex documentation lists the following variants of bounds syntax:
    {m}
    {m,}
    {m,n}
There is nothing there about {,n}.  I rather imagine the engine is
deciding that '{,4}' is just literal text and not a bounds constraint ...

regression=# SELECT 'A{,4}' ~ '^A{,4}$';
 ?column?
----------
 t
(1 row)

... yup, apparently so.
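
For an "at most n" bound, then, the lower bound has to be spelled out
as an explicit zero.  A minimal sketch (these extra queries are
illustrative and not part of the session above; the results in the
comments follow from the documented {m,n} semantics):

```sql
-- {0,4} is a real bounds constraint: zero to four repetitions of A.
SELECT 'AAA'   ~ '^A{0,4}$';   -- t
SELECT ''      ~ '^A{0,4}$';   -- t  (zero occurrences is allowed)
SELECT 'AAAAA' ~ '^A{0,4}$';   -- f  (five is over the upper bound)

-- {,4} is not recognized as a bound, so it only matches itself literally.
SELECT 'AAA'   ~ '^A{,4}$';    -- f
SELECT 'A{,4}' ~ '^A{,4}$';    -- t
```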

A look at the POSIX standard says that it has the same idea of what
is a valid bounds constraint:

    When an ERE matching a single character or an ERE enclosed in
    parentheses is followed by an interval expression of the format
    "{m}", "{m,}", or "{m,n}", together with that interval expression
    it shall match what repeated consecutive occurrences of the ERE
    would match. The values of m and n are decimal integers in the
    range 0 <= m <= n <= {RE_DUP_MAX}, where m specifies the exact or
    minimum number of occurrences and n specifies the maximum number
    of occurrences. The expression "{m}" matches exactly m occurrences
    of the preceding ERE, "{m,}" matches at least m occurrences, and
    "{m,n}" matches any number of occurrences between m and n,
    inclusive.
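
As a quick illustration of those three forms in Postgres (a sketch
added for clarity, not taken from the standard or from the original
report; the test strings are arbitrary):

```sql
-- {m}: exactly m occurrences
SELECT 'AAA'  ~ '^A{3}$';     -- t
SELECT 'AAAA' ~ '^A{3}$';     -- f

-- {m,}: at least m occurrences
SELECT 'AAA'  ~ '^A{2,}$';    -- t
SELECT 'AAA'  ~ '^A{4,}$';    -- f

-- {m,n}: between m and n occurrences, inclusive
SELECT 'AAA'  ~ '^A{1,3}$';   -- t
SELECT 'AAAA' ~ '^A{1,3}$';   -- f
```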

            regards, tom lane

