Re: Regexp matching - Mailing list pgsql-sql

From: Osvaldo Kussama
Subject: Re: Regexp matching
Msg-id: AANLkTim+hv-qrx1NVZpxfhn1Q-eTUy5=qOnYeix2drQQ@mail.gmail.com
In response to: Regexp matching (Eduardas Kazakas <eduardas.kazakas@gmail.com>)
List: pgsql-sql

2010/9/28 Eduardas Kazakas <eduardas.kazakas@gmail.com>:
> Hello, I have some problems using character class matching (e.g. [:alpha:]).
>
> For example I have a table:
>
> CREATE TABLE re_test (text_column character varying (50) NOT NULL);
>
> Note that some of the values contain non-ASCII characters.
>
> INSERT INTO re_test VALUES ('AŠDF');
> INSERT INTO re_test VALUES ('AŠDF45');
> INSERT INTO re_test VALUES ('AŠDF FDŠA');
> INSERT INTO re_test VALUES ('ASDF FDŠA');
> INSERT INTO re_test VALUES ('58ASDF FDŠA');
> INSERT INTO re_test VALUES ('ašDf');
> INSERT INTO re_test VALUES ('aŠdf');
>
> SELECT * FROM re_test WHERE text_column ~ '[^[:alpha:]]' and text_column ~
> [:upper:];
>
> Goal:
> I want to write a statement that returns only those records that
> contain a single word, and that word must be uppercase.
> So I expect this statement to return only one record where text_column =
> AŠDF.
>
> Maybe someone could give me a more detailed explanation of how to use those
> regexp classes, because the documentation says very little about this.
>
> Some more information:
>
> PostgreSQL9
>
> OS - Windows x86-32
> DB encoding - UTF-8
> lc_collate - English_United States.1252
> lc_ctype - English_United States.1252
> lc_messages - English_United States.1252
> lc_monetary - English_United States.1252
> lc_numeric - English_United States.1252
> lc_time - English_United States.1252


I believe that "Š" isn't classified as an alphabetic character under the
English_United States locale (LC_CTYPE), so [:alpha:] and [:upper:] would
not match it.
http://www.postgresql.org/docs/current/interactive/locale.html
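
A minimal sketch of how one might check this, using the re_test table from
the question. The first query reads the database's ctype setting from
pg_database. The second anchors the pattern with ^ and $ so that the whole
value must consist of uppercase letters; whether "Š" counts as [:upper:]
there depends on LC_CTYPE and on the server version. The third query is a
possible workaround, not something proposed in this thread: it lists the
accented letter explicitly, on the assumption that the set of such letters is
known in advance.

```sql
-- Show the collation and ctype the current database was created with.
SELECT datcollate, datctype
FROM pg_database
WHERE datname = current_database();

-- Whole-string match: one or more uppercase letters and nothing else.
-- Digits, spaces and lowercase letters disqualify a row.
-- Whether 'Š' is treated as [:upper:] depends on LC_CTYPE.
SELECT text_column
FROM re_test
WHERE text_column ~ '^[[:upper:]]+$';

-- Workaround sketch (assumption: the extra letters are known up front):
-- name them explicitly instead of relying on the locale's classification.
SELECT text_column
FROM re_test
WHERE text_column ~ '^[A-ZŠ]+$';
```

With the sample rows from the question, the last query should return only
'AŠDF', since every other row contains a digit, a space, or a lowercase
letter.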

Osvaldo

