Re: Scadinavian characters in regular expressions - Mailing list pgsql-sql
From | Søren Vainio |
---|---|
Subject | Re: Scadinavian characters in regular expressions |
Date | |
Msg-id | 910513A5A944D5118BE900C04F67CB5A1F82C6@MAIL |
In response to | Scadinavian characters in regular expressions (Søren Vainio <sva@Netpointers.com>) |
Responses |
Re: Scadinavian characters in regular expressions
Re: Scadinavian characters in regular expressions |
List | pgsql-sql |
Using \s does produce FALSE for

SELECT 'oneå two three' ~ '^[^\s]+[\s][^\s]+$';

But it also produces FALSE for any two-word string, e.g.

SELECT 'one two' ~ '^[^\s]+[\s][^\s]+$';

where I would expect TRUE.

(I am using PostgreSQL 7.1.3)

> -----Original message-----
> From: pgsql-sql-owner@postgresql.org
> [mailto:pgsql-sql-owner@postgresql.org] On behalf of Andreas Joseph Krogh
> Sent: 9 April 2002 11:53
> To: 'pgsql-sql@postgresql.org'
> Subject: Re: [SQL] Scadinavian characters in regular expressions
>
> On Tuesday 09 April 2002 10:51, Søren Vainio wrote:
> > Can someone please explain the following?
> > I am using a regular expression to find strings containing two words
> > (one or more non-space characters, followed by a space, followed by
> > one or more non-space characters).
> > But when Scandinavian characters are included it returns different
> > results depending on where the character is positioned.
> > The first two-word example returns TRUE as expected.
> > The second three-word example returns FALSE as expected.
> > But when I let an å (a-ring) traverse through the string, it
> > unexpectedly returns TRUE when the character is the second-to-last
> > or last character of either of the first two words.
> >
> > SELECT 'one two' ~ '^[^ ]+[ ][^ ]+$'; returns TRUE
> > SELECT 'one two three' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'åone two three' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'oåne two three' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'onåe two three' ~ '^[^ ]+[ ][^ ]+$'; returns TRUE
> > SELECT 'oneå two three' ~ '^[^ ]+[ ][^ ]+$'; returns TRUE
> > SELECT 'one åtwo three' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one tåwo three' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one twåo three' ~ '^[^ ]+[ ][^ ]+$'; returns TRUE
> > SELECT 'one twoå three' ~ '^[^ ]+[ ][^ ]+$'; returns TRUE
> > SELECT 'one two åthree' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one two tåhree' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one two thåree' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one two thråee' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one two threåe' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> > SELECT 'one two threeå' ~ '^[^ ]+[ ][^ ]+$'; returns FALSE
> >
> > Thank you for any response.
> >
> > Søren Vainio, Denmark
>
> I just tried the following, which returned false as expected:
>
> andreak=# SELECT 'oneå two three' ~ '^[^\s]+[\s][^\s]+$';
>  ?column?
> ----------
>  f
> (1 row)
>
> andreak=# select version();
>                           version
> -----------------------------------------------------------
>  PostgreSQL 7.2 on i686-pc-linux-gnu, compiled by GCC 2.96
> (1 row)
>
> NOTE: I replaced your [^ ] with the properly formatted pattern for
> whitespace: [^\s]
>
> --
> Andreas Joseph Krogh (Senior Software Developer) <andreak@officenet.no>
> A hen is an egg's way of making another egg.
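
For comparison, the two-word pattern from this thread can be checked outside the database. The following is only a sketch of the intended semantics: Python's re module is a different regex engine from the one in PostgreSQL 7.1.3, and the helper name is_two_words is made up for this illustration. It shows what the pattern should return when the input is handled as characters rather than bytes.

```python
import re

# Same pattern as in the thread: one or more non-space characters,
# exactly one space, then one or more non-space characters.
TWO_WORDS = re.compile(r"^[^ ]+[ ][^ ]+$")


def is_two_words(text: str) -> bool:
    """Return True if text consists of exactly two space-separated words.

    Hypothetical helper for illustration; not part of PostgreSQL.
    """
    return TWO_WORDS.search(text) is not None


# A few of the strings from the original report.
samples = [
    "one two",
    "one two three",
    "onåe two three",
    "oneå two three",
    "one twoå three",
]

for sample in samples:
    print(repr(sample), is_two_words(sample))
```

Here only 'one two' matches; every three-word string is rejected regardless of where the å sits, which is the behaviour expected of the SQL queries as well.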