Thread: someone please explain this regex behaviour

someone please explain this regex behaviour

From
Martin Leja
Date:
Hi,

consider the following simple example, where I want to select a path
using a WHERE clause with the case-insensitive regex operator "~*". The
queries that return one row behave as I expect, but I don't understand the
ones that return 0 rows:

backup=> select version();
version
-------------------------------------------------------------
PostgreSQL 6.5.3 on i686-pc-linux-gnu, compiled by gcc 2.95.2
(1 row)

backup=> create table foo (path varchar(300));
CREATE
backup=> insert into foo (path) values ('/My');
INSERT 29400 1
backup=> select path from foo where path ~* '/My';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/My';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/my';
path
----
(0 rows)

backup=> select path from foo where path ~* '/my';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '/mY';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/mY';
path
----
(0 rows)


Why e.g. does the statement "select path from foo where path ~* '^/my';"
not return the only entry "/My"? Can someone explain this?
--
Regards, martin@unix-ag.org


Re: someone please explain this regex behaviour

From
Tom Lane
Date:
Martin Leja <Martin.Leja@unix-ag.uni-siegen.de> writes:
> Why e.g. does the statement "select path from foo where path ~* '^/my';"
> not return the only entry "/My"? Can someone explain this?

I think you're getting bitten by the LIKE-index-optimization-in-non-
ASCII-locale problem.  Are you running the server in a locale other
than "C"?  See the many past threads about this type of issue ...

            regards, tom lane
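
The kind of optimization mentioned above can be illustrated with a small sketch. It assumes, hypothetically, that for a pattern anchored with "^" the server adds a range condition on the literal prefix (roughly: value >= prefix AND value <= prefix followed by a high byte) in addition to the regex test, and that strings compare by character code as in the "C" locale. The thread does not confirm that this is what the server does here; the function names are made up for illustration, and the prefix extraction is simplified to literal patterns like the ones in the transcript.

```python
import re


def prefix_bounds(pattern):
    """Hypothetical helper: derive range bounds from an anchored pattern.

    Simplification: everything after '^' is treated as a literal prefix,
    which holds for the patterns in the transcript but not for regexes
    in general.
    """
    if not pattern.startswith("^"):
        return None  # unanchored pattern: no range condition can be derived
    prefix = pattern[1:]
    return prefix, prefix + "\xff"


def plain_match(value, pattern):
    """Case-insensitive regex test only, which is what "~*" should mean."""
    return re.search(pattern, value, re.IGNORECASE) is not None


def range_filtered_match(value, pattern):
    """Regex test combined with the assumed prefix range condition."""
    bounds = prefix_bounds(pattern)
    if bounds is not None:
        low, high = bounds
        # Python compares str by code point, comparable to a "C" locale.
        if not (low <= value <= high):
            return False
    return plain_match(value, pattern)
```

Under these assumptions, range_filtered_match("/My", "^/my") is false although plain_match("/My", "^/my") is true: "M" sorts before "m", so "/My" falls below the lower bound "/my" derived from the prefix. A range built from the pattern's own casing is not a safe filter for a case-insensitive match, and a locale with a different sort order can invalidate it for other reasons.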

Re: someone please explain this regex behaviour

From
Martin Leja
Date:
At 18:47 27.01.2001 -0500, Tom Lane wrote:
>Martin Leja <Martin.Leja@unix-ag.uni-siegen.de> writes:
> > Why e.g. does the statement "select path from foo where path ~* '^/my';"
> > not return the only entry "/My"? Can someone explain this?
>
>I think you're getting bitten by the LIKE-index-optimization-in-non-
>ASCII-locale problem.  Are you running the server in a locale other
>than "C"?  See the many past threads about this type of issue ...

I'm not very familiar with this locale stuff, so I searched the docs and
found the following in doc/postgresql-doc/postgres/install12893.htm:
...
If you configure and compile Postgres with --enable-locale then you should
set the locale environment to "C" (or unset all "LC_*" variables) by
putting these additional lines to your login environment before starting
postmaster:
LC_COLLATE=C
LC_CTYPE=C
export LC_COLLATE LC_CTYPE
...

I then changed /etc/init.d/postgresql (the postmaster start script in
Debian) accordingly and restarted postmaster with the script, but
unfortunately I get the same results.

I wonder whether the above action disabled the
LIKE-index-optimization-in-non-ASCII-locale at all, and whether this is
really the cause of my select results. Isn't there a "psql -c 'show ???'"
command which can report the locale setting to me?


--
Regards, martin@unix-ag.org
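
One way to check whether the start script change reached the server process, independent of psql, is to look at the environment the postmaster was started with. The sketch below assumes a Linux system, where /proc/<pid>/environ holds the initial environment of a process as NUL-separated KEY=VALUE entries, and it needs permission to read that file (same user or root). The function names are illustrative, and the result shows only the inherited variables, not which locale the server actually selected internally.

```python
from pathlib import Path


def parse_environ(raw):
    """Split the NUL-separated KEY=VALUE bytes of an environ file into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env


def locale_settings(env):
    """Keep only the variables that influence locale selection."""
    return {k: v for k, v in env.items() if k == "LANG" or k.startswith("LC_")}


def process_locale(pid):
    """Read the locale-related start environment of a running process (Linux only)."""
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return locale_settings(parse_environ(raw))
```

Calling process_locale() with the postmaster's process id should list LC_COLLATE and LC_CTYPE as "C" if the init script exported them before starting the server. If they are missing or carry another value, the change did not take effect for that process.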