Re: substring with this pattern works in 8.3.1; does not work in 8.3.4 - Mailing list pgsql-bugs

From Tomasz Ostrowski
Subject Re: substring with this pattern works in 8.3.1; does not work in 8.3.4
Date
Msg-id 49252E51.7080605@batory.org.pl
Whole thread Raw
In response to substring with this pattern works in 8.3.1; does not work in 8.3.4  ("Chris Wood" <chris.wood@bookitnow.ca>)
List pgsql-bugs
On 2008-11-12 19:07, Chris Wood wrote:

> here is where it works in 8.3.1:
>     CONSTRAINT public_phone_ch CHECK (((public_phone = ''::text) OR
> ("substring"(public_phone, '^[0-9]{10}(,[0-9]{10})*$'::text) IS NOT NULL))));
> protocalte=> insert into locn values(10, '1231231234') ;

8.3.1 had a bug in substring function which was corrected in 8.3.2:

http://www.postgresql.org/docs/8.3/static/release-8-3-2.html
Fix a corner case in regular-expression substring matching
(substring(string from pattern)) (Tom)

The problem occurs when there is a match to the pattern overall but the
user has specified a parenthesized subexpression and that subexpression
hasn't got a match. An example is substring('foo' from 'foo(bar)?').
This should return NULL, since (bar) isn't matched, but it was
mistakenly returning the whole-pattern match instead (ie, foo).

> and here is where it does not work in 8.3.4:
> ERROR:  new row for relation "locn" violates check constraint
> "public_phone_ch"

http://www.postgresql.org/docs/8.3/static/functions-matching.html#FUNCTIONS-POSIX-REGEXP
The substring function with two parameters, substring(string from
pattern), provides extraction of a substring that matches a POSIX
regular expression pattern. It returns null if there is no match,
otherwise the portion of the text that matched the pattern. But *if*
*the* *pattern* *contains* *any* *parentheses*, *the* *portion* *of*
*the* *text* *that* *matched* *the* *first* *parenthesized*
*subexpression* (the one whose left parenthesis comes first) *is*
*returned*. You can put parentheses around the whole expression if you
want to use parentheses within it without triggering this exception.

So your check should look like this:
  CONSTRAINT public_phone_ch
  CHECK (
    (
      (public_phone = ''::text)
      OR
      ("substring"(public_phone, '(^[0-9]{10}(,[0-9]{10})*$)'::text)
      IS NOT NULL))
    )
  );

Or much simpler and clear:
  CONSTRAINT public_phone_ch
  CHECK ( public_phone ~ '^$|^[0-9]{10}(,[0-9]{10})*$' )

> I apologize in advance for not testing  on 8.3.5,  but that would be
> very difficult for me.

It is not that difficult:

PGVERSION=2.3.5
mkdir /tmp/postgres
cd /tmp/postgres
wget \
ftp://ftp.postgresql.org/pub/source/v$PGVERSION/postgresql-$PGVERSION.tar.bz2
tar xjf postgresql-$PGVERSION.tar.bz2
cd postgresql-$PGVERSION
./configure --prefix=/tmp/postgres
make install
cd /tmp/postgres/bin
./initdb --no-locale -D /tmp/postgres/data
./postgres -p 54320 -D /tmp/postgres/data &
./psql -p 54320 postgres

postgres=# select version();
 PostgreSQL 8.3.5 on i686-pc-linux-gnu,
 compiled by GCC gcc (GCC) 4.1.2 20070925 (Red Hat 4.1.2-33)

This works for any user (besides root) on any Unix-like system.

Regards
Tometzky
--
...although Eating Honey was a very good thing to do, there was a
moment just before you began to eat it which was better than when you
were...
                                                      Winnie the Pooh

pgsql-bugs by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: BUG #4539: to_char(to_timestamp('...','....')) returns the wrong result
Next
From: Alexandra Nitzschke
Date:
Subject: could not read block 77 of relation 1663/16385/388818775