Thread: Mixed client_encoding database question
Hi all, We have a product with a PostgreSQLdatabase (unicode) It has a (legacy) client app that uses client_encoding = win1521 The legacy app will be around for several years. It cannot change to unicode. There also is a new client app, written in java. It uses client_encoding = unicode. The problem: if someone inputs a non-win1252 character via the new java app, the legacy app fails. PostgreSQL returns an error: character blabla of encodig UTF-8 has no equivalent in WIN1252. What are my options to stop the legacy app from failing? -Is there a setting to tell PostgreSQL to fail the character gracefully by substuting it? -Should i filter out all non win1252 charachters at the new java app and substitute them? (not so nice, i'd like to have the chars in the db) -?more? Greetings, Joek
On 01/11/2011 02:32 AM, J. Hondius wrote: > What are my options to stop the legacy app from failing? IMO there aren't any good ones except "fix the legacy app not to be brain-dead about character encodings". > -Is there a setting to tell PostgreSQL to fail the character gracefully > by substuting it? The problem there is that a round-trip via the client is no longer a safe no-change operation. UPDATE tablename SET x = x WHERE ....; no longer means the same as: SELECT x FROM tablename WHERE ....; -- client reads value of 'x' UPDATE tablename SET x = 'value-from-client'; This is particularly critical in cases where the client may issue an UPDATE that lists values for all fields, changed and not, when making a change to a field. This is common in ORM systems, which usually list all fields mapped by the object without caring which ones might actually be different from the DB. If you can verify that the legacy app never UPDATEs fields that might contain textual data containing unicode chars inserted/updated via the Java app, you could probably get away with this approach. Alas, I don't think you can do it via Pg settings, you'd have to modify your legacy app's SQL to remap the chars, which you probably can't do. Maybe you could modify the Pg ODBC driver to do the remapping you desire - assuming your legacy app talks to the database via ODBC. > -Should i filter out all non win1252 charachters at the new java app and ... thus prolonging your pain. Once the legacy app is gone, you'll be stuck with the limitations of the Java app, which by then you probably won't able to or allowed to change to permit unicode for one reason or another. If you *do* decide to filter out unsupported chars, make sure it's configurable in the app without source code changes and that both modes get covered by your unit tests. Be sure it's ALWAYS ready to switch to unicode as soon as the legacy stuff is gone. This is probably what I'd do in your situation, if I couldn't fix or retire the legacy app. I'd just make really, really sure I could always enable full char encoding support later. > substitute them? (not so nice, i'd like to have the chars in the db) Ick. -- Craig Ringer
Hi, I suggest you to recreate database in WIN1521 encoding, so Java will not write unsupported characters. As well you can configure JDBC in way it will allow You to change client encoding, so you can change it to WIN. In both of above You will get error when you will try to write unsupported characters. You may also do http://www.postgresql.org/docs/8.4/interactive/sql-createconversion.html Radosław Smogura, http://softperience.eu On Mon, 10 Jan 2011 16:32:13 +0100, J. Hondius wrote: > Hi all, > > We have a product with a PostgreSQLdatabase (unicode) > It has a (legacy) client app that uses client_encoding = win1521 > The legacy app will be around for several years. > It cannot change to unicode. > > There also is a new client app, written in java. > It uses client_encoding = unicode. > > The problem: if someone inputs a non-win1252 character via the new > java app, the legacy app fails. > PostgreSQL returns an error: character blabla of encodig UTF-8 has no > equivalent in WIN1252. > > What are my options to stop the legacy app from failing? > -Is there a setting to tell PostgreSQL to fail the character > gracefully by substuting it? > -Should i filter out all non win1252 charachters at the new java app > and substitute them? (not so nice, i'd like to have the chars in the > db) > -?more? > > Greetings, Joek
On 10 Jan 2011, at 16:32, J. Hondius wrote: > Hi all, > > We have a product with a PostgreSQLdatabase (unicode) > It has a (legacy) client app that uses client_encoding = win1521 > The legacy app will be around for several years. > It cannot change to unicode. > > There also is a new client app, written in java. > It uses client_encoding = unicode. I think you may be able to set the client encoding at the user level. In that case you could use a different user for thelegacy app than you use for the java app and have the appropriate client encoding for both cases. try: ALTER ROLE foo SET client_encoding TO 'win1251'; Note: This is just off the top of my head, I haven't actually verified any of this. Alban Hertroys -- If you can't see the forest for the trees, cut the trees and you'll see there is no forest. !DSPAM:737,4d2c9b0a11871070112136!