Thread: encoding question

encoding question

From
"Christopher Kings-Lynne"
Date:
Hi,

In phpPgAdmin, we automatically set the HTML page encoding to an encoding
that allows us to properly display the encoding of the current PostgreSQL
database.  I have a small problem with SQL_ASCII.  Theoretically (and what
we currently do), we should set page encoding to US-ASCII.  However,
Postgres seems to allow umlauts and all sorts of extra 8-bit data in ASCII
databases, so what encoding should I use?  Is ISO-8859-1 a better choice?
Is SQL_ASCII basically equivalent to the LATIN1 encoding?

My other question: we play around with bytea fields to escape nulls and
chars < 32 and stuff so that when someone browses the table, they get
'\000<unknown>\000...', etc.  However, are there other field types for which
we have to do this?  Can you put nulls and stuff in text/varchar/char
fields?  What about other fields?

Thanks,

Chris



Re: encoding question

From
Rod Taylor
Date:
> My other question: we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc.  However, are there other field types for which
> we have to do this?  Can you put nulls and stuff in text/varchar/char
> fields?  What about other fields?

pg_escape_string
pg_escape_bytea

Escape everything :)

Re: encoding question

From
"Christopher Kings-Lynne"
Date:
I don't think you see what I mean :)

I want to display the data on a webpage to the user.  This means that a
varchar containing the string "I don't want it", should not appear as "I
don''t want it".  So pg_escape_string isn't used there.  bytea is different
tho because the default display isn't terribly useful...
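
To make the distinction concrete, here is a rough sketch in Python (not phpPgAdmin's actual code; the function names are made up for illustration, and sql_quote_literal is only a naive stand-in for what pg_escape_string does with quotes).  Quoting for an SQL statement and escaping for an HTML page are two separate steps:

```python
import html


def sql_quote_literal(value: str) -> str:
    # Hypothetical helper: naive SQL literal quoting that doubles single
    # quotes. Only meant for building a statement, never for display.
    return "'" + value.replace("'", "''") + "'"


def html_display(value: str) -> str:
    # Hypothetical helper: escape a value read from the database so it can
    # be embedded in an HTML page without being interpreted as markup.
    return html.escape(value, quote=True)


text = "I don't want it"
print(sql_quote_literal(text))  # form used inside an SQL statement
print(html_display(text))       # form used inside the web page
```

The browser turns the HTML-escaped form back into the original string, so the user sees a single quote, not a doubled one.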

Chris

----- Original Message ----- 
From: "Rod Taylor" <rbt@rbt.ca>
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
Cc: "Hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, August 07, 2003 9:46 AM
Subject: Re: [HACKERS] encoding question

> My other question: we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc.  However, are there other field types for which
> we have to do this?  Can you put nulls and stuff in text/varchar/char
> fields?  What about other fields?

pg_escape_string
pg_escape_bytea

Escape everything :)



Re: encoding question

From
Barry Lind
Date:
Chris,

SQL_ASCII means that the data could be anything: Latin1, UTF-8, Latin9, or
whatever the code inserting the data sends to the server.  The server
accepts any bytes in a SQL_ASCII database, and this usually causes no
problems as long as all the clients share a common understanding of what
the real encoding of the data is.  However, if you set CLIENT_ENCODING,
then the server does assume that the data is really 7-bit ASCII.

In the JDBC driver we only support US-ASCII data when the database
encoding is SQL_ASCII.  Java needs Unicode strings, so we set
CLIENT_ENCODING to UTF8 and let the server perform the necessary
conversion for us.  If you store anything other than US-ASCII data in a
SQL_ASCII database, the server will return invalid UTF-8 data to the client.
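
A small Python sketch of why this goes wrong (an illustration only, not JDBC driver code; the helper names are hypothetical).  It assumes a Latin1 client stored the bytes unchanged in a SQL_ASCII database and a second client then treats them as UTF-8:

```python
def is_us_ascii(data: bytes) -> bool:
    # True when every byte is 7-bit, i.e. safe under any of these encodings.
    return all(b < 0x80 for b in data)


def is_valid_utf8(data: bytes) -> bool:
    # True when the bytes can be decoded as UTF-8 without error.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "Mueller" spelled with u-umlaut, as a Latin1 client would send it.
stored = "M\u00fcller".encode("latin-1")

print(is_us_ascii(stored))    # the umlaut byte is above 0x7F
print(is_valid_utf8(stored))  # a lone 0xFC byte is not valid UTF-8

# The reverse mistake: UTF-8 bytes shown on a Latin1 page decode without
# error, but each non-ASCII character turns into two wrong characters.
print("M\u00fcller".encode("utf-8").decode("latin-1"))
```

Pure 7-bit data passes both checks, which is why US-ASCII is the only content that is safe when nobody knows the real encoding.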

thanks,
--Barry






Re: encoding question

From
Hannu Krosing
Date:
Christopher Kings-Lynne wrote on Thu, 07.08.2003 at 04:33:
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc.

Actually, bytea *stores* char(0); you get \000 or \x0 or ¬@ or whatever,
depending on what you use for displaying it.

The escaping is done only to fit the data into an SQL statement when
inserting the data into the database. A SELECT returns straight bytes from
bytea.
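
As a rough sketch of the display side (Python, hypothetical function name, and assuming the client already holds the raw bytes of the value), the \000-style form is something the displaying code produces, not something stored in the database:

```python
def render_bytes(data: bytes) -> str:
    # Render raw bytes for display: printable ASCII is kept as-is, a
    # backslash is doubled, and every other byte becomes a backslash
    # followed by three octal digits.
    parts = []
    for b in data:
        if b == 0x5C:                 # backslash itself
            parts.append("\\\\")
        elif 0x20 <= b < 0x7F:        # printable ASCII
            parts.append(chr(b))
        else:                         # NUL, control chars, 8-bit data
            parts.append("\\%03o" % b)
    return "".join(parts)


print(render_bytes(b"\x00abc\x1f\xff"))
```

A different viewer could just as well show hex or caret notation for the same stored bytes.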

>   However, are there other field types for which
> we have to do this?  Can you put nulls and stuff in text/varchar/char
> fields?

No. Null bytes (char(0)) are not allowed in text/varchar fields.

-------------
Hannu