Thread: How to convert Unicode to string

How to convert Unicode to string

From
"Abhijit Prusty -X (abprusty - UST Global at Cisco)"
Date:
Hi,

I have a query in Oracle like this mentioned below

Insert into TEST
   (TEMPLATE_ID, TEMPLATE_NAME, CREATED_BY, CREATED_DT, UPDATED_BY,
    UPDATED_DT, TEMPLATE_KEY)
Values
   (1, UNISTR('\D3C9\BA85\B3C4 \B514\C2A4\D50C\B808\C774'), 'dmin', SYSDATE, 'admin',
    SYSDATE ,'FLOOR');

Now Oracle uses the UNISTR function to convert and insert the Unicode to string and store in database.

But how can we achieve the same using PostgreSQL? Can you please help me with the query?

Thanks,
Abhijit

Re: How to convert Unicode to string

From
Jasen Betts
Date:
On 2012-09-23, Abhijit Prusty -X (abprusty - UST Global at Cisco) <abprusty@cisco.com> wrote:
> Hi,
>
> I have a query in oracle like this mentioned below
>
> Insert into TEST
>    (TEMPLATE_ID, TEMPLATE_NAME, CREATED_BY, CREATED_DT, UPDATED_BY,
>     UPDATED_DT, TEMPLATE_KEY)
> Values
>    (1, UNISTR('\D3C9\BA85\B3C4 \B514\C2A4\D50C\B808\C774'), 'dmin', SYSDATE, 'admin',
>     SYSDATE ,'FLOOR');
>
> Now the oracle uses the UNISTR function to convert and insert the Unicode to
> string and store in database.

Oracle UNISTR-like UTF-16 escapes can be written in PostgreSQL like this:
U&'\D3C9\BA85\B3C4 \B514\C2A4\D50C\B808\C774'
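
Applied to the INSERT from the question, that could look like the sketch below. It assumes a UTF8 database with standard_conforming_strings turned on (the U& syntax needs it). It also swaps Oracle's SYSDATE for CURRENT_TIMESTAMP, a substitution that is not in the original query and may need adjusting to the actual column types:

```sql
-- Sketch: the Oracle INSERT rewritten for PostgreSQL.
-- Assumes server encoding UTF8 and standard_conforming_strings = on.
INSERT INTO test
   (template_id, template_name, created_by, created_dt, updated_by,
    updated_dt, template_key)
VALUES
   (1, U&'\D3C9\BA85\B3C4 \B514\C2A4\D50C\B808\C774', 'dmin',
    CURRENT_TIMESTAMP,   -- stand-in for Oracle's SYSDATE (assumption)
    'admin',
    CURRENT_TIMESTAMP,   -- stand-in for Oracle's SYSDATE (assumption)
    'FLOOR');
```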

It's not a function, it's a way of writing string literals... if you need a
function, it probably wouldn't be hard to write.
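
One way such a function might be sketched is to wrap the argument in a U&'...' literal and let the parser evaluate it. The name unistr_sketch is made up for illustration, and the approach relies on dynamic SQL. It assumes standard_conforming_strings is on and the database encoding is UTF8, so treat it as a starting point rather than a drop-in replacement for Oracle's UNISTR:

```sql
-- Hypothetical helper (name and approach are assumptions, not from the thread).
-- Builds the text  SELECT U&'<s>'  and executes it, so the server's own
-- Unicode-escape handling does the decoding.
CREATE OR REPLACE FUNCTION unistr_sketch(s text) RETURNS text AS $$
DECLARE
  result text;
BEGIN
  -- Double any single quotes so the argument stays inside the literal.
  EXECUTE 'SELECT U&''' || replace(s, '''', '''''') || '''' INTO result;
  RETURN result;
END;
$$ LANGUAGE plpgsql;

-- Possible usage:
SELECT unistr_sketch('\D3C9\BA85\B3C4 \B514\C2A4\D50C\B808\C774');
```

An argument containing a malformed escape raises an error, just as the equivalent U&'...' literal would.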

But you can also write the string as UTF-8 (literal or escaped) or with Unicode escapes;
see the docs:

u&'\+021502'          -- unicode
u&'\D845\DD02'        -- utf16 surrogate pair (docs tell me this is legal with recent versions)
e'\xF0\xA1\x94\x82'   -- utf8 hex escape
e'\360\241\224\202'   -- utf8 octal escape
'𡔂'                  -- utf8 string literal

The first two forms can be intermixed within one string, as can the last three.
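
A quick way to check that these spellings all denote the same character is to compare them directly. The query below is a sketch that assumes a UTF8 database, a UTF8 client encoding, and standard_conforming_strings = on. The column aliases are only illustrative, and each column should come back true:

```sql
SELECT u&'\+021502'        = '𡔂' AS codepoint_escape,
       u&'\D845\DD02'      = '𡔂' AS surrogate_pair,
       e'\xF0\xA1\x94\x82' = '𡔂' AS hex_bytes,
       e'\360\241\224\202' = '𡔂' AS octal_bytes,
       -- mixing the two U& escape styles in one literal
       u&'\+021502\D845\DD02' = '𡔂𡔂' AS mixed_unicode,
       -- mixing hex, octal, and a literal character in one E'' literal
       e'\xF0\xA1\x94\x82\360\241\224\202𡔂' = '𡔂𡔂𡔂' AS mixed_bytes;
```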

http://www.postgresql.org/docs/9.1/static/sql-syntax-lexical.html

select length('𡔂'), octet_length( '𡔂' ), length('test'),
octet_length('test');
 length | octet_length | length | octet_length
--------+--------------+--------+--------------
      1 |            4 |      4 |            4