
Re: UNICODE - Mailing list pgsql-general

From	Tatsuo Ishii
Subject	Re: UNICODE
Date	October 29, 2001 20:09:19
Msg-id	20011030100139C.t-ishii@sra.co.jp
In response to	Re: UNICODE (Tatsuo Ishii <t-ishii@sra.co.jp>)
List	pgsql-general


Could you please not send me personal mail?
Let's share information with everyone on the mailing list.
Anyway...

> I've tried that.  Still not writing the Chinese characters correctly.

I don't know which Chinese character set you are using, but at the
very least your code will not work if the character set is Big5,
because the second byte of a Big5 character can fall in the ASCII
range (0x40-0x7E).
To learn more about character sets, see, for example,
ftp://ftp.ora.com/pub/examples/nutshell/ujip/doc/cjk.inf
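
To illustrate the Big5 point, here is a minimal sketch (not taken from
the code quoted below). It assumes the usual Big5 layout, where a lead
byte of 0x81 or above is followed by one trail byte, and it compares a
plain byte scan with a scan that steps over double-byte characters.
The names find_byte_naive and find_byte_big5 are hypothetical.

```c
#include <stddef.h>

/* Assumption: any byte >= 0x81 starts a two-byte Big5 character
 * (the original standard uses lead bytes 0xA1-0xF9). */
int is_big5_lead(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Plain byte scan: returns the index of the first byte equal to stop,
 * or -1. It can match the trail byte of a double-byte character. */
long find_byte_naive(const unsigned char *s, unsigned char stop)
{
    size_t i;

    for (i = 0; s[i]; i++)
        if (s[i] == stop)
            return (long)i;
    return -1;
}

/* Big5-aware scan: skips lead/trail pairs as a unit, so only a
 * stand-alone single-byte stop character is reported. */
long find_byte_big5(const unsigned char *s, unsigned char stop)
{
    size_t i = 0;

    while (s[i]) {
        if (is_big5_lead(s[i]) && s[i + 1]) {
            i += 2;            /* step over the whole character */
            continue;
        }
        if (s[i] == stop)
            return (long)i;
        i++;
    }
    return -1;
}
```

For the bytes 'a', 0xA5, 0x5C, 'b', 0x5C, 'c', the plain scan for 0x5C
stops at index 2, inside the double-byte character, while the
Big5-aware scan stops at index 4, the real single-byte backslash.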
--
Tatsuo Ishii

> Here is the code:
>
>   contentTypeFromPost = getenv("CONTENT_TYPE");
>   contentTypeLength = getenv("CONTENT_LENGTH");
>   icontentLength = atoi(contentTypeLength);
>
>     if((queryString = malloc(icontentLength + 1)) == NULL)
>     {
>       postMessage("Cannot allocate memory", 0);
>       return(0);
>     }
>     for(i=0; *queryString; i++)
>     {
>       splitword(items.Item, queryString, '&');
>       unescape_url(items.Item);
>       splitword(items.name, items.Item, '=');
>
>  // items.Item contains double-byte characters
>  // However, when it is written to the database I get unrecognizable data
>     }
>
> void splitword(uchar *out, uchar *in, uchar stop)
> {
>    int i, j;
>
>    while(*in == ' ') in++; /* skip past any spaces */
>
>    for(i = 0; in[i] && (in[i] != stop); i++)
>       out[i] = in[i];
>
>    out[i] = '\0'; /* terminate it */
>    if(in[i]) ++i; /* position past the stop */
>
>    while(in[i] == ' ') i++; /* skip past any spaces */
>
>    for(j = 0; in[j]; )  /* shift the rest of the in */
>       in[j++] = in[i++];
> }
>
> uchar x2c(uchar *x)
> {
>    register uchar c;
>
>    /* note: (x & 0xdf) makes x upper case */
>    c  = (x[0] >= 'A' ? ((x[0] & 0xdf) - 'A') + 10 : (x[0] - '0'));
>    c *= 16;
>    c += (x[1] >= 'A' ? ((x[1] & 0xdf) - 'A') + 10 : (x[1] - '0'));
>    return(c);
> }
>
> void unescape_url(uchar *url)
> {
>    register int i, j;
>
>    for(i = 0, j = 0; url[j]; ++i, ++j)
>    {
>       if((url[i] = url[j]) == '%')
>       {
>          url[i] = x2c(&url[j + 1]);
>          j += 2;
>       }
>       else if (url[i] == '+')
>          url[i] = ' ';
>    }
>    url[i] = '\0';  /* terminate it at the new length */
> }
>
> -----Original Message-----
> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
> Sent: Sunday, October 28, 2001 4:57 PM
> To: jklcom@mindspring.com
> Cc: pgsql-general@postgresql.org
> Subject: RE: [GENERAL] UNICODE
>
>
> > I'm also trying to write some Chinese data to a PostgreSQL database.  I get
> > gibberish after it's written to the database.
> >
> > I recognize the problem is in the HTTP request.  How do I retrieve
> > double-byte characters from an HTTP request using C/C++? And how do I
> > write them to the database?
>
> Nothing special. Just read/write the bytes one by one.
>
> > And how do I tell it what kind of encoding to use?
>
> set client_encoding.
> --
> Tatsuo Ishii
>
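
For the client_encoding advice quoted above, here is a minimal sketch,
assuming a POSIX system and a CGI program that connects through libpq.
libpq reads the PGCLIENTENCODING environment variable when it opens a
connection, so setting it first has the same effect as issuing
SET client_encoding afterwards. use_client_encoding is a hypothetical
helper name, and BIG5 in the usage note is only an assumed value,
since the thread never says which encoding the browser sends.

```c
#define _POSIX_C_SOURCE 200112L  /* makes setenv() visible */
#include <stdlib.h>

/* Hypothetical helper: request the given client encoding for libpq
 * connections opened later in this process.
 * Returns 0 on success, -1 on failure (as setenv() does). */
int use_client_encoding(const char *encoding)
{
    return setenv("PGCLIENTENCODING", encoding, 1);
}
```

Call use_client_encoding("BIG5") before PQconnectdb(). On a connection
that is already open, PQsetClientEncoding(conn, "BIG5") or the SQL
command SET client_encoding TO 'BIG5' are the alternatives.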
