Thread: someone please explain this regex behaviour
Hi,

consider the following simple example, where I want to select a path
using a where clause with the case-insensitive operator "~*". The
queries that return one row are fine, but I don't understand the ones
that return 0 rows:

backup=> select version();
version
-------------------------------------------------------------
PostgreSQL 6.5.3 on i686-pc-linux-gnu, compiled by gcc 2.95.2
(1 row)

backup=> create table foo (path varchar(300));
CREATE
backup=> insert into foo (path) values ('/My');
INSERT 29400 1
backup=> select path from foo where path ~* '/My';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/My';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/my';
path
----
(0 rows)

backup=> select path from foo where path ~* '/my';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '/mY';
path
----
/My
(1 row)

backup=> select path from foo where path ~* '^/mY';
path
----
(0 rows)

Why, for example, does the statement
"select path from foo where path ~* '^/my';" not return the only
entry "/My"? Can someone explain this?

--
Regards,
martin@unix-ag.org
Martin Leja <Martin.Leja@unix-ag.uni-siegen.de> writes:
> Why e.g. does the statement "select path from foo where path ~* '^/my';"
> not return the only entry "/My"? Can someone explain this?

I think you're getting bitten by the LIKE-index-optimization-in-non-
ASCII-locale problem. Are you running the server in a locale other
than "C"? See the many past threads about this type of issue ...

regards, tom lane
At 18:47 27.01.2001 -0500, Tom Lane wrote:
>Martin Leja <Martin.Leja@unix-ag.uni-siegen.de> writes:
> > Why e.g. does the statement "select path from foo where path ~* '^/my';"
> > not return the only entry "/My"? Can someone explain this?
>
>I think you're getting bitten by the LIKE-index-optimization-in-non-
>ASCII-locale problem. Are you running the server in a locale other
>than "C"? See the many past threads about this type of issue ...

I'm not very familiar with this locale stuff, so I searched the docs
and found the following in
doc/postgresql-doc/postgres/install12893.htm:

...
If you configure and compile Postgres with --enable-locale then you
should set the locale environment to "C" (or unset all "LC_*"
variables) by putting these additional lines to your login environment
before starting postmaster:

    LC_COLLATE=C
    LC_CTYPE=C
    export LC_COLLATE LC_CTYPE
...

I then changed /etc/init.d/postgresql (the postmaster start script in
Debian) accordingly and restarted postmaster with the script, but
unfortunately I get the same results.

I wonder whether the above actually disabled the
LIKE-index-optimization-in-non-ASCII-locale at all, and whether that
optimization is really the cause of my select results. Isn't there a
"psql -c 'show ???'" command that can report the locale setting to me?

--
Regards,
martin@unix-ag.org
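
For readers following the thread, here is one way to picture the kind of index optimization Tom Lane refers to. This is only an illustrative Python sketch, not PostgreSQL source code, and it rests on an assumption: that for a pattern anchored with "^" the server adds range bounds derived from the literal prefix so an index could be used. The function names prefix_bounds, plain_match and naive_optimized_match are hypothetical. The sketch shows how such bounds can drop a row whenever the comparison used for the bounds disagrees with the matching rules of the pattern (here upper versus lower case; a non-"C" collation could produce a similar mismatch).

```python
import re


def prefix_bounds(pattern):
    """Return (low, high) bounds for an anchored pattern, else None.

    Assumption for this sketch: everything after '^' is a plain
    literal prefix with no regex metacharacters.
    """
    if not pattern.startswith("^"):
        return None  # unanchored pattern: no range bounds can be derived
    prefix = pattern[1:]
    return prefix, prefix + "\xff"


def plain_match(value, pattern):
    """Case-insensitive regex match with no extra range condition."""
    return re.search(pattern, value, re.IGNORECASE) is not None


def naive_optimized_match(value, pattern):
    """Apply the hypothetical range bounds first, then the regex."""
    bounds = prefix_bounds(pattern)
    if bounds is not None:
        low, high = bounds
        # Code-point comparison: 'M' sorts before 'm', so "/My" < "/my".
        if not (low <= value <= high):
            return False
    return plain_match(value, pattern)


if __name__ == "__main__":
    for pat in ["/My", "^/My", "^/my", "/my", "/mY", "^/mY"]:
        print(pat, plain_match("/My", pat), naive_optimized_match("/My", pat))
```

Under these assumptions the sketch rejects "/My" for the patterns '^/my' and '^/mY' and accepts it for the other four, which is the same split as in the psql session above. That is consistent with, but does not prove, the explanation suggested in the reply.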