Thread: encoding question
Hi,

In phpPgAdmin, we automatically set the HTML page encoding to an encoding that allows us to properly display data in the encoding of the current PostgreSQL database. I have a small problem with SQL_ASCII. Theoretically (and this is what we currently do), we should set the page encoding to US-ASCII. However, Postgres seems to allow umlauts and all sorts of extra 8-bit data in ASCII databases, so what encoding should I use? Is ISO-8859-1 a better choice? Is SQL_ASCII basically equivalent to the LATIN1 encoding?

My other question: we play around with bytea fields to escape nulls, chars < 32 and so on, so that when someone browses the table they get '\000<unknown>\000...', etc. However, are there other field types for which we have to do this? Can you put nulls and stuff in text/varchar/char fields? What about other fields?

Thanks,

Chris
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc. However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields? What about other fields?

pg_escape_string
pg_escape_bytea

Escape everything :)
I don't think you see what I mean :)

I want to display the data on a web page to the user. This means that a varchar containing the string "I don't want it" should not appear as "I don''t want it", so pg_escape_string isn't used there. bytea is different, though, because the default display isn't terribly useful...

Chris

----- Original Message -----
From: "Rod Taylor" <rbt@rbt.ca>
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
Cc: "Hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, August 07, 2003 9:46 AM
Subject: Re: [HACKERS] encoding question

> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc. However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields? What about other fields?

pg_escape_string
pg_escape_bytea

Escape everything :)
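To make the distinction in this message concrete, here is a minimal sketch in Python (not phpPgAdmin's PHP code). The helper names `sql_quote_literal` and `html_display` are hypothetical and exist only for illustration. Quote doubling belongs to building an SQL statement, while showing a value on a web page only needs HTML escaping.

```python
import html


def sql_quote_literal(value: str) -> str:
    """Hypothetical helper: double single quotes, as an SQL string literal requires.

    Shown only for contrast; real code should rely on the driver's own
    escaping or on parameterized queries.
    """
    return "'" + value.replace("'", "''") + "'"


def html_display(value: str) -> str:
    """Hypothetical helper: escape only the characters that are special in HTML text."""
    return html.escape(value, quote=False)


text = "I don't want it"
print(sql_quote_literal(text))  # the form that goes into an SQL statement
print(html_display(text))       # the form that goes into the page
```

The apostrophe is left alone by `html_display`, so the user sees the string exactly as it was stored.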
Chris,

SQL_ASCII means that the data could be anything. It could be Latin1, UTF-8, Latin9, or whatever the code inserting the data sends to the server. In general the server accepts anything as SQL_ASCII, and this doesn't cause any problems as long as all the clients have a common understanding of what the real encoding of the data is. However, if you set CLIENT_ENCODING then the server does assume that the data is really 7-bit ASCII.

In the JDBC driver we only support US-ASCII data if the character set is SQL_ASCII, because we set CLIENT_ENCODING to UTF8 to have the server perform the necessary conversion for us, since Java needs Unicode strings. If you store anything other than US-ASCII data in a SQL_ASCII database, the server will return invalid UTF-8 data to the client.

thanks,
--Barry

Christopher Kings-Lynne wrote:
> Hi,
>
> In phpPgAdmin, we automatically set the HTML page encoding to an encoding
> that allows us to properly display the encoding of the current postgresql
> database. I have a small problem with SQL_ASCII. Theoretically (and what
> we currently do), we should set page encoding to US-ASCII. However,
> Postgres seems to allow umlauts and all sorts of extra 8 bit data in ASCII
> databases, so what encoding should I use. Is ISO-8859-1 a better choice?
> Is SQL_ASCII basically equivalent to the LATIN1 encoding?
>
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc. However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields? What about other fields?
>
> Thanks,
>
> Chris
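As a rough illustration of why the bytes in a SQL_ASCII database are ambiguous, the Python sketch below decodes the same stored bytes under different assumed encodings. The sample byte strings and the helper `try_decode` are made up for this example and are not taken from any driver.

```python
from typing import Optional


def try_decode(data: bytes, encoding: str) -> Optional[str]:
    """Hypothetical helper: decode data, or return None if it is invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# The same name as two different clients might have stored it.
latin1_bytes = b"M\xfcller"     # u-umlaut as the single Latin-1 byte 0xFC
utf8_bytes = b"M\xc3\xbcller"   # u-umlaut as the two-byte UTF-8 sequence

for label, raw in (("latin1 client", latin1_bytes), ("utf8 client", utf8_bytes)):
    for enc in ("ascii", "latin-1", "utf-8"):
        print(label, enc, repr(try_decode(raw, enc)))
```

Neither byte string is valid US-ASCII. The Latin-1 bytes are not valid UTF-8 either, and the UTF-8 bytes decode under Latin-1 without error but as the wrong characters. So ISO-8859-1 as a page encoding never fails outright, but it is only correct when the clients really did write Latin-1.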
Christopher Kings-Lynne wrote on Thu, 07.08.2003 at 04:33:
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc.

Actually bytea *stores* char(0); you get \000 or \x0 or ¬@ or whatever, depending on what you use to display it. The escaping is done only to fit the data into an SQL statement when inserting the data into the database. A select returns straight bytes from bytea.

> However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields?

No. Nulls are not allowed in text/varchar fields.

-------------
Hannu
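Since the choice of display form is left to the client, here is one possible sketch in Python of rendering raw bytea bytes for a table browser. The function `escape_for_display` is hypothetical. The octal `\ooo` style follows the '\000' form mentioned earlier in the thread, and treating bytes of 127 and above as non-printable is an assumption of this sketch, not something the thread specifies.

```python
def escape_for_display(data: bytes) -> str:
    """Hypothetical helper: render raw bytes as printable ASCII with octal escapes."""
    parts = []
    for byte in data:
        if byte == 0x5C:
            # Double the backslash so it cannot be confused with an escape.
            parts.append("\\\\")
        elif 32 <= byte < 127:
            # Printable ASCII is shown as-is.
            parts.append(chr(byte))
        else:
            # Control characters, NUL, and bytes >= 127 become \ooo.
            parts.append("\\%03o" % byte)
    return "".join(parts)


print(escape_for_display(b"\x00abc\x00"))
```

The result is plain ASCII, so it can still be passed through the page's normal HTML escaping before being displayed.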