Re: regexp_replace and UTF8 - Mailing list pgsql-sql

From Harald Fuchs
Subject Re: regexp_replace and UTF8
Msg-id pufxj0g5jb.fsf@srv.protecting.net
In response to regexp_replace and UTF8  ("Bart Degryse" <Bart.Degryse@indicator.be>)
List pgsql-sql
In article <87ljstm4eq.fsf@oxford.xeocode.com>,
Gregory Stark <stark@enterprisedb.com> writes:

> "Bart Degryse" <Bart.Degryse@indicator.be> writes:
>> Hi,
>> I have a text field with data like this: 'de pati&#235;nt niet'

>> Can anyone help me fix this or point me to a better approach.
>> By the way, changing the way data is put into the field is
>> unfortunately not an option.

> You could use a plperl function to use one of the many html parsing perl
> modules?

Yes, either plperl or some external HTML tool.
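
As a rough sketch of the library route (written in Python rather than Perl,
purely for illustration; a procedural-language function in the database
could wrap the same idea), the standard library's html.unescape decodes
numeric character references. The function name decode_entities and the
sample string containing &#235; are assumptions for this example, not
something taken from the original post:

```python
import html


def decode_entities(text: str) -> str:
    """Decode HTML character references such as &#235; into characters."""
    # html.unescape handles decimal (&#235;), hex (&#xEB;) and named (&euml;) forms.
    return html.unescape(text)


print(decode_entities("de pati&#235;nt niet"))
```

Letting a library do the decoding avoids hand-written handling of the
different reference forms.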

>> Basically what I need to do (I think) is
>> - get rid of the &, # and ;
>> - convert the number to hex
>> - make a UTF8 from that (thus: \xEB)
>> - convert that to SQL_ASCII

You know that SQL_ASCII is a misnomer for "no encoding at all, and I
don't care"?  I'd use UTF8 or (if you stay in Western Europe) Latin9.
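
To make the quoted steps and the encoding point concrete, here is a hedged
Python sketch (not the original poster's code). It replaces decimal
references of the form &#NNN; by the character with that code point, then
shows the resulting bytes in UTF8 and in Latin9. The helper name
replace_numeric_refs and the sample string are assumed for illustration:

```python
import re

# Matches decimal numeric character references such as &#235;
NUMERIC_REF = re.compile(r"&#(\d+);")


def replace_numeric_refs(text: str) -> str:
    """Replace each &#NNN; by the character whose code point is NNN."""
    return NUMERIC_REF.sub(lambda m: chr(int(m.group(1))), text)


decoded = replace_numeric_refs("pati&#235;nt")
utf8_bytes = decoded.encode("utf-8")         # two bytes for the e-diaeresis
latin9_bytes = decoded.encode("iso8859_15")  # one byte for the e-diaeresis
print(decoded, utf8_bytes, latin9_bytes)
```

Note that 235 (hex EB) is the code point. The single byte \xEB is its
Latin9 representation, whereas UTF8 stores it as the two bytes \xC3\xAB.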


