Thread: Percent-encoding conversion to binary, %C2%A9 = ©

From: Hans Schou

Hi

I have a little trouble with the chr() function.

I have a string like this:
"Copyright+%C2%A9+1856+Na%C3%AFve+retros%C2%AE"
which should be converted to a binary string like:
"Copyright © 1856 Naïve retros®"

Is there an easy way to do this conversion?

I have tried to do it with a function, but it fails. The script is at
the end of the mail.

Description of percent encoding:
http://en.wikipedia.org/wiki/Percent-encoding

/hans
---------------------------------------------------------------

drop function percent2bin(text);
create function percent2bin(text) returns text as $Q$
declare
         intxt  alias for $1;
         res text;
         r text;
         idx int;
         c text;
         n1 int;
begin
         RAISE NOTICE 'Input: %',intxt;
         r := regexp_replace(intxt, E'\\+', ' ', 'g');
         idx := 1;
         res := '';
         while idx <= length(r) loop
                 c := substring(r from idx for 1);
                 if c = '%' and idx+2<=length(r) then
                         n1 := asciihex2bin(substring(r from idx+1 for 2));
                         RAISE NOTICE 'Char: %, %',substring(r from idx+1 for 2),n1;
                         if 0 <= n1 and n1 <= 255 then
                                 c := chr(n1); -- HERE IT GOES WRONG
                                 idx := idx + 2;
                         end if;
                 end if;
                 res := res || c;
                 idx := idx + 1;
         end loop;
         return res;
end;
$Q$ language PLpgSQL;

drop function asciihex2bin(text);
create function asciihex2bin(text) returns int as $Q$
declare
         intxt  alias for $1;
         n1 int;
         n2 int;
         idx int;
begin
         n1 := 0;
         for idx in 1..2 loop
                 -- get char and convert to its numeric value; '0' to '9' become 0 to 9
                 n2 := ascii(upper(substring(intxt from idx for 1)))-48;
                 -- if input was 'A' to 'F', subtract 7
                 if n2 > 9 then
                         n2 := n2 - 7;
                 end if;
                 -- if out of range, fail
                 if n2 < 0 or n2 > 15 then
                         RAISE NOTICE 'Input fault expected 0..15, got: % "%"', n2, intxt;
                         n2 := 1000;
                 end if;
                 if idx = 1 then
                         n1 := n2 * 16;
                 else
                         n1 := n1 + n2;
                 end if;
         end loop;
         return n1;
end;
$Q$ language PLpgSQL;

select asciihex2bin('00');
select asciihex2bin('FF');

select percent2bin('Copyright+%C2%A9+1856+Na%C3%AFve+retros%C2%AE');

Re: Percent-encoding conversion to binary, %C2%A9 = ©

From: Martijn van Oosterhout

On Thu, Mar 13, 2008 at 03:49:37PM +0100, Hans Schou wrote:
> Hi
>
> I have a little trouble with the chr() function.
>
> I have a string like this:
> "Copyright+%C2%A9+1856+Na%C3%AFve+retros%C2%AE"
> which should be converted to binary string like:
> "Copyright © 1856 Naïve retros®"
>
> Is there an easy way to do this conversion?

Looks like you have UTF-8 encoded with percent signs. Perhaps the right
approach is to convert the incoming text into a bytea value and then
use convert() to turn it into a string.
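
A minimal sketch of that approach, assuming PostgreSQL 8.3 or later, where
decode(text, 'hex'), convert_to(text, name) and convert_from(bytea, name)
are available. The function name percent_decode and its variable names are
hypothetical and do not come from the original script. The idea is to
collect raw bytes in a bytea and decode them as UTF-8 once at the end,
rather than calling chr() per byte: in a UTF8 database chr(194) yields the
character U+00C2, not the single byte 0xC2.

```sql
-- Hypothetical helper (not from the thread): decode percent-encoded UTF-8.
-- Assumes PostgreSQL 8.3+ for convert_to() and convert_from().
create or replace function percent_decode(intxt text) returns text as $Q$
declare
    r    text  := replace(intxt, '+', ' ');  -- form encoding: '+' means space
    buf  bytea := ''::bytea;                 -- raw bytes collected so far
    idx  int   := 1;
    c    text;
    pair text;
begin
    while idx <= length(r) loop
        c    := substring(r from idx for 1);
        pair := substring(r from idx + 1 for 2);
        if c = '%' and pair ~ '^[0-9A-Fa-f]{2}$' then
            -- %XX stands for one raw byte; append it as-is
            buf := buf || decode(pair, 'hex');
            idx := idx + 3;
        else
            -- ordinary character: append its UTF-8 bytes
            buf := buf || convert_to(c, 'UTF8');
            idx := idx + 1;
        end if;
    end loop;
    -- interpret the whole byte sequence as UTF-8 text in one step
    return convert_from(buf, 'UTF8');
end;
$Q$ language plpgsql;

select percent_decode('Copyright+%C2%A9+1856+Na%C3%AFve+retros%C2%AE');
```

In a database whose encoding can represent these characters (UTF8, for
example), the select should return "Copyright © 1856 Naïve retros®".
convert_from() raises an error if the collected bytes are not valid UTF-8,
which is probably preferable to silently storing broken text.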

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
