Thread: Locale support for postgresql regex (src)

Locale support for postgresql regex (src)

From
Antonello Nocchi
Date:
Hi,

I modified two files, in postgresql-7.1.3/src/backend/regex/ and in
postgresql-7.1.3/src/include/regex/,
so that character classes (e.g. [[:alnum:]], [[:alpha:]], etc.) should now
honor the locale settings.

http://galaxy.metacerca.it/~anto/pgslq_7_1_3_regex_locale.tar.gz (~14
KB, 2 files)

It is not a major piece of work and it does not support multibyte
encodings, but for me it is enough to isolate, for example, an Italian
word containing 8-bit characters above 127 (outside the ASCII range).
For example: select T from tab where T ~*
'(^|[^[:alnum:]]+)citt[[:alnum:]]*([^[:alnum:]]+|$)';
    now matches the word 'città' in strings such as 'vado in città',
'città', etc.
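
The effect described above can be approximated outside PostgreSQL. The following is only a sketch in Python, not the patch itself: Python's re module has no POSIX bracket classes, so `[A-Za-z0-9]` stands in for an [[:alnum:]] that ignores the locale and `[^\W_]` for one that treats accented letters as alphanumeric. The function name `isolate_word`, the two class pairs, and the named group `word` are assumptions introduced for illustration.

```python
import re

# Stand-ins (assumptions): Python's re has no POSIX [[:alnum:]] class.
# Each pair is (alnum class, non-alnum class).
ASCII_ONLY = ("[A-Za-z0-9]", "[^A-Za-z0-9]")  # classification that ignores the locale
LOCALE_AWARE = (r"[^\W_]", r"[\W_]")           # accented letters count as alphanumeric


def isolate_word(text, classes):
    """Return the word starting with 'citt' found in text, or None."""
    alnum, non_alnum = classes
    # Same shape as the pattern in the query; the named group is added
    # only so that the isolated word can be inspected.
    pattern = re.compile(
        rf"(^|{non_alnum}+)(?P<word>citt{alnum}*)({non_alnum}+|$)",
        re.IGNORECASE,  # mirrors the case-insensitive ~* operator
    )
    match = pattern.search(text)
    return match.group("word") if match else None


print(isolate_word("vado in città", ASCII_ONLY))    # citt
print(isolate_word("vado in città", LOCALE_AWARE))  # città
```

With the ASCII-only classes the final 'à' is taken as a separator, so the isolated word is cut to 'citt'; with the classes that accept accented letters the whole word 'città' is kept.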

PS: Excuse my poor English.

Regards
Antonello

--
_______________________________________________________

Antonello Nocchi                        CERCA.COM S.r.l

Via dello Stadio, 19             Tel. +39-0578-75.77.77
53045 - Montepulciano (Siena)    Tel. +39-0578-71.67.09
ITALY                            Fax. +39-0578-71.51.89
antonello@cerca.com                http://www.cerca.com


Re: Locale support for postgresql regex (src)

From
Bruce Momjian
Date:
This has been saved for the 7.3 release:

    http://candle.pha.pa.us/cgi-bin/pgpatches2


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Locale support for postgresql regex (src)

From
Bruce Momjian
Date:
Your patch has been added to the PostgreSQL unapplied patches list at:

    http://candle.pha.pa.us/cgi-bin/pgpatches

I will try to apply it within the next 48 hours.



Re: Locale support for postgresql regex (src)

From
Bruce Momjian
Date:
This patch was rejected.  Please continue discussion on the hackers
list. Thanks.


