Thread: upgrading a three year old server
Hi, I have just performed an upgrade on a 5 year old server (hardware) running 3 year old software - PG 7.2.x I have moved to PG 8.1.1 installed from FC4 rpms I have tried the following: createdb xxx -E LATIN1 <- old version had this encoding (intranet server dates from 2000 upgrade to 7.2 done in 2002) psql -d xxxx -U postgres -f xxxx.out.1 <- this being my daily dump When I connect to the database with either webmin or pgaccess I have multiple tables where it is shown as 0 tuples. The data is there (read using nedit on the dump file) but not being read by either of the interfaces I usually use for basic data control. And my JSP webapp is choking on the French accents. I have yet to try OpenOffice which is now the secondary data manipulation tool used by the client after the JSP app. After a long Google I have come here to ask of you some pointers =:-D Cheers Tony Grant -- Tony Grant www.tgds.net - vente d'ordinateurs mini-itx en ligne www.tgds.net/epiaix1.html - A fedora Core 4 based live CD for VIA Epia-M
On 18.12.2005 18:49, tony wrote: > createdb xxx -E LATIN1 <- old version had this encoding (intranet server > dates from 2000 upgrade to 7.2 done in 2002) > psql -d xxxx -U postgres -f xxxx.out.1 <- this being my daily dump Hm, there have been no error messages from psql during the import? > The data is there (read using nedit on the dump file) but not being read by either of the interfaces I usually use forbasic data control. You mean the data is in the dump file, but the restore failed? > And my JSP webapp is choking on the French accents. Are you sure about the LATIN1 encoding of your old dump? :) -- Regards, Hannes Dorbath
Le lundi 19 décembre 2005 à 09:20 +0100, Hannes Dorbath a écrit : > On 18.12.2005 18:49, tony wrote: > > createdb xxx -E LATIN1 <- old version had this encoding (intranet server > > dates from 2000 upgrade to 7.2 done in 2002) > > psql -d xxxx -U postgres -f xxxx.out.1 <- this being my daily dump > > Hm, there have been no error messages from psql during the import? OK I have run a vacuum on the database and that fixed the invisible tuples. > > The data is there (read using nedit on the dump file) but not being read by either of the interfaces I usually use forbasic data control. > > You mean the data is in the dump file, but the restore failed? Restore succeeded but there were error messages about encoding (see below). > > And my JSP webapp is choking on the French accents. > > Are you sure about the LATIN1 encoding of your old dump? :) Two apps out of three work OK now that I have vacuumed. The third one has encoding problems. Can I change the encoding to UTF-8 easily? Thanks Tony -- Tony Grant www.tgds.net - vente d'ordinateurs mini-itx en ligne www.tgds.net/epiaix1.html - A fedora Core 4 based live CD for VIA Epia-M
tony wrote: > Can I change the encoding to UTF-8 easily? I had to convert as well while upgrading to 8.1.0. We came from 8.0.3; for some reason all our db's were 'LATIN1' in PG, and while things worked OK, we wanted to get rid of this too. All the data in our db's came from web-apps and forms. All the pages that were used to enter that data were UTF-8, so essentially the data *was* correct UTF8. First I went for the dirty hack and tried to fool PG by simply changing the LATIN1 string in the dump to UTF8; it seems to work but later on pg_restore chokes on the blobs. After extensive research I found a solution that worked 100% for me. On the old server, dump db including blobs etc: pg_dump -Fc yourdb > yourdb.dump On our new server (debian), the installation seemed to have run initdb with -E LATIN1, so I needed to remove my data tree first and the properly recreate it: initd -E UTF-8 -D /var/lib/postgresql/8.1/main/data After that I created the db but with the old LATIN1 encoding: createdb -E LATIN1 yourdb Then restore the old dump into it: pg_restore yourdb yourdb.dump From this moment on, the data on my webpages looks OK (I looked at some really foreign stuff my users had filled in, like Arabic and Hebrew (no pun intended)), but the db is still LATIN1. Because 8.1 allows you to dump db's with blobs and all to TXT files, you dump the whole thing to TXT on your new server: pg_dump -Fp yourdb > yourdb.txt.dump Then edit the dumpfile manually and change the LATIN1 string to UTF8. If you are sure your db does not actually contain the string LATIN1, you could do: pg_dump -Fp yourdb | sed 's/LATIN1/UTF8/g' > yourdb.txt.dump Now drop the db and recreate it with the correct encoding: dropdb yourdb && createdb -E UTF8 yourdb Now import the dump; this needs to be done with psql because it is txt: psql yourdb < yourdb.txt.dump At this time you will have a proper UTF8 db. FYI, this solution worked for me, and I am pretty sure the reason why it worked it that our websites were 'UTF8-proof' from the beginning. If everything is setup correct, you might see really cool strings appear on user-filled-in forms: http://www.terena.nl/compendium/2005/basicinfo.php?nrenid=26 :) Best regards, -- * *** Dick Visser TIENHUIS Networking ** * * Touwbaan 68 P: +31206843731 * * *** 1018 HS Amsterdam F: +31208641420 * * * * The Netherlands M: +31622698108 * ** * IP-phone (SIP)/email: dick@tienhuis.nl * * * PGP-key: http://www.tienhuis.nl/gpg.txt * * * Webcam: http://www.tienhuis.nl/cam2.asx *** ***
Le lundi 19 décembre 2005 à 12:51 +0100, Dick Visser a écrit : > > Can I change the encoding to UTF-8 easily? > > I had to convert as well while upgrading to 8.1.0. <snip> WOW! that is a real howto. It belongs on the Postgresql techdocs site. I'll try it and let you know how it goes. Tony -- Tony Grant www.tgds.net - vente d'ordinateurs mini-itx en ligne www.tgds.net/epiaix1.html - A fedora Core 4 based live CD for VIA Epia-M
Le lundi 19 décembre 2005 à 12:51 +0100, Dick Visser a écrit : > FYI, this solution worked for me, and I am pretty sure the reason why it > worked it that our websites were 'UTF8-proof' from the beginning. OK I have found that the client has inserted some rogue characters... I also remember I was obliged to do some voodoo magic on catalina to get it to serve the right charset. I haven't touched this system since January 2002 so it is kind of hard to remember the tricks I used with encoding between PostgreSQL, JDBC and Tomcat. Cheers Tony -- Tony Grant www.tgds.net - vente d'ordinateurs mini-itx en ligne www.tgds.net/epiaix1.html - A fedora Core 4 based live CD for VIA Epia-M