Thread: pg_dump on local Windows, pg_restore on Linux?
Couple of questions with porting: 1. I have been playing around with my databases locally on Win XP so as not to hurt our website traffic. Now I would like to move the database to a Linux CentOS server. Can I use pg_dump on Windows and pg_restore it on Linux? If so, any tips on what I should keep in mind (e.g., manual seems to suggest that pg_restore prefers tar gzipped format...but I'm not sure if Windows can generate this?) 2. I would like my database to be UTF-8. I issue the command CREATE DATABASE mydb OWNER me ENCODING 'utf8'; Should I add anything else to it, such as collation? I did not find any option for that in here: http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html 3. Also, just to confirm, if I have utf-8 database, then all tables in it should be able to take utf-8 data? I would just like these tables to take whatever I send to them. No error checking or encoding checking. Can I disable the painful error that PG keeps throwing if even a single erroneous byte shows up? I'd rather have 'garbage data' than not go through with the query. Thanks for any tips!
On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote: > Couple of questions with porting: > > 1. I have been playing around with my databases locally on Win XP so > as not to hurt our website traffic. Now I would like to move the > database to a Linux CentOS server. Can I use pg_dump on Windows and > pg_restore it on Linux? If so, any tips on what I should keep in mind > (e.g., manual seems to suggest that pg_restore prefers tar gzipped > format...but I'm not sure if Windows can generate this?) > > 2. I would like my database to be UTF-8. I issue the command > > CREATE DATABASE mydb OWNER me ENCODING 'utf8'; > > Should I add anything else to it, such as collation? I did not find > any option for that in here: > http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html > > 3. Also, just to confirm, if I have utf-8 database, then all tables in > it should be able to take utf-8 data? I would just like these tables > to take whatever I send to them. No error checking or encoding > checking. Can I disable the painful error that PG keeps throwing if > even a single erroneous byte shows up? I'd rather have 'garbage data' > than not go through with the query. Requirements 2 and 3 are exclusive. Either you want your database to be UTF-8, which means that invalid byte sequences should be rejected because they AREN'T utf-8, or you want your database to swallow whatever you throw at it, in which case, that's not UTF-8.
On 16/08/07, Scott Marlowe <scott.marlowe@gmail.com> wrote: > On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote: > > Couple of questions with porting: > > > > 1. I have been playing around with my databases locally on Win XP so > > as not to hurt our website traffic. Now I would like to move the > > database to a Linux CentOS server. Can I use pg_dump on Windows and > > pg_restore it on Linux? If so, any tips on what I should keep in mind > > (e.g., manual seems to suggest that pg_restore prefers tar gzipped > > format...but I'm not sure if Windows can generate this?) > > > > 2. I would like my database to be UTF-8. I issue the command > > > > CREATE DATABASE mydb OWNER me ENCODING 'utf8'; > > > > Should I add anything else to it, such as collation? I did not find > > any option for that in here: > > http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html > > > > 3. Also, just to confirm, if I have utf-8 database, then all tables in > > it should be able to take utf-8 data? I would just like these tables > > to take whatever I send to them. No error checking or encoding > > checking. Can I disable the painful error that PG keeps throwing if > > even a single erroneous byte shows up? I'd rather have 'garbage data' > > than not go through with the query. > > Requirements 2 and 3 are exclusive. Either you want your database to > be UTF-8, which means that invalid byte sequences should be rejected > because they AREN'T utf-8, or you want your database to swallow > whatever you throw at it, in which case, that's not UTF-8. Thanks. Is there an encoding that is so flexible that it will silently accept whatever I send to it without throwing an exception?
On Wednesday 15. August 2007, Phoenix Kiula wrote: >Couple of questions with porting: > >1. I have been playing around with my databases locally on Win XP so >as not to hurt our website traffic. Now I would like to move the >database to a Linux CentOS server. Can I use pg_dump on Windows and >pg_restore it on Linux? If so, any tips on what I should keep in mind >(e.g., manual seems to suggest that pg_restore prefers tar gzipped >format...but I'm not sure if Windows can generate this?) This is how I import my output from pg_dump: createdb --encoding=UNICODE $DB psql -U postgres -d $DB -f $INFILE > restore.log 2>&1 This way, you just feed psql with an uncompressed dump file. I've never used pg_restore. >2. I would like my database to be UTF-8. I issue the command > >CREATE DATABASE mydb OWNER me ENCODING 'utf8'; > >Should I add anything else to it, such as collation? I did not find >any option for that in here: >http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html At least on a *nix system, collation is based on the value of the LC_ALL environment variable at dbinit time. There's nothing you can do about it in a live database. IMO that's a little awkward, and is what finally made me change the global from ISO-8859-1 to UTF-8 on my three Gentoo Linux machines. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE My Jazz Jukebox: http://www.last.fm/user/leifbk/
> At least on a *nix system, collation is based on the value of the LC_ALL > environment variable at dbinit time. There's nothing you can do about > it in a live database. IMO that's a little awkward, and is what finally > made me change the global from ISO-8859-1 to UTF-8 on my three Gentoo > Linux machines. This is great info, thanks. Could you let me know how I could change the global values of "LC_ALL"? I am on Linux too, just CentOS, but I suppose it should be the same or similar?
"Phoenix Kiula" <phoenix.kiula@gmail.com> writes: > Thanks. Is there an encoding that is so flexible that it will silently > accept whatever I send to it without throwing an exception? SQL_ASCII does that. Whether it's a good idea to use it is questionable. One thing to think about is that you will be unable to provide translation to and from different client encodings --- the database will always just regurgitate the bytes it was given, since it doesn't really know what encoding they are in. regards, tom lane
On Wednesday 15. August 2007, Phoenix Kiula wrote: >> At least on a *nix system, collation is based on the value of the >> LC_ALL environment variable at dbinit time. There's nothing you can >> do about it in a live database. IMO that's a little awkward, and is >> what finally made me change the global from ISO-8859-1 to UTF-8 on >> my three Gentoo Linux machines. > >This is great info, thanks. Could you let me know how I could change >the global values of "LC_ALL"? I am on Linux too, just CentOS, but I >suppose it should be the same or similar? I don't have the foggiest idea how it's done in CentOS. In Gentoo, you just edit the contents of /etc/env.d/02locale: balapapa ~ # cat /etc/env.d/02locale LC_ALL="nb_NO.UTF-8" LANG="" LC_CTYPE="nb_NO.UTF.8" LC_NUMERIC="nb_NO.UTF.8" LC_TIME="nb_NO.UTF.8" LC_COLLATE="nb_NO.UTF.8" LC_MONETARY="nb_NO.UTF.8" LC_PAPER="nb_NO.UTF.8" LC_NAME="nb_NO.UTF.8" LC_ADDRESS="nb_NO.UTF.8" LC_TELEPHONE="nb_NO.UTF.8" LC_MEASUREMENT="nb_NO.UTF.8" LC_IDENTIFICATION="nb_NO.UTF.8" To update the environment settings globally, the Gentoo method is to issue the command "env-update && source /etc/profile" from root. But you should really Read The Fine Manual about this. If you for instance have filenames with non-ASCII characters in them, prepare yourself for some interesting challenges. -- Leif Biberg Kristensen | Registered Linux User #338009 http://solumslekt.org/ | Cruising with Gentoo/KDE My Jazz Jukebox: http://www.last.fm/user/leifbk/
Date: Wed, 15 Aug 2007 20:31:45 +0200 From: "Leif B. Kristensen" <leif@solumslekt.org> To: pgsql-general@postgresql.org Subject: Re: pg_dump on local Windows, pg_restore on Linux? Message-ID: <200708152031.45917.leif@solumslekt.org> On Wednesday 15. August 2007, Phoenix Kiula wrote: > >This is great info, thanks. Could you let me know how I could change >the global values of "LC_ALL"? I am on Linux too, just CentOS, but I >suppose it should be the same or similar? See: man local and man 1p locale and man 1p localedef and /etc/sysconfig/i18n Do NOT manually set LC_ALL on a CentOS system unless you know exactly what you are doing and what programs will be impacted. If set then LC_ALL overrides all of the individual LC_ settings. Regards, -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:ByrneJB@Harte-Lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3
On 16/08/07, James B. Byrne <byrnejb@harte-lyne.ca> wrote: > > Date: Wed, 15 Aug 2007 20:31:45 +0200 > From: "Leif B. Kristensen" <leif@solumslekt.org> > To: pgsql-general@postgresql.org > Subject: Re: pg_dump on local Windows, pg_restore on Linux? > Message-ID: <200708152031.45917.leif@solumslekt.org> > > On Wednesday 15. August 2007, Phoenix Kiula wrote: > > > >This is great info, thanks. Could you let me know how I could change > >the global values of "LC_ALL"? I am on Linux too, just CentOS, but I > >suppose it should be the same or similar? > > See: man local and man 1p locale and man 1p localedef and /etc/sysconfig/i18n > > Do NOT manually set LC_ALL on a CentOS system unless you know exactly what > you are doing and what programs will be impacted. If set then LC_ALL > overrides all of the individual LC_ settings. > Thanks. Here's my locale information: > locale LANG=en_US.UTF-8 LC_CTYPE="en_US.UTF-8" LC_NUMERIC="en_US.UTF-8" LC_TIME="en_US.UTF-8" LC_COLLATE="en_US.UTF-8" LC_MONETARY="en_US.UTF-8" LC_MESSAGES="en_US.UTF-8" LC_PAPER="en_US.UTF-8" LC_NAME="en_US.UTF-8" LC_ADDRESS="en_US.UTF-8" LC_TELEPHONE="en_US.UTF-8" LC_MEASUREMENT="en_US.UTF-8" LC_IDENTIFICATION="en_US.UTF-8" LC_ALL= Is this enough to run a pgsql database that is supposed to house utf-8 content? Or should I somehow make sure LC_ALL carries the value of 'en_US.UTF-8' as well? If not manually, how would I do it, and how can I trace what other programs may be impacted. It sounds like an impossible task given that many applications/servers run on this machine, so I am hoping that either there is a workaround for postgresql, or that postgresql will work with just the above parameters as they are. TIA!
On Wed, August 15, 2007 21:15, Phoenix Kiula wrote: > > Thanks. Here's my locale information: > >> locale > LANG=en_US.UTF-8 > LC_CTYPE="en_US.UTF-8" > LC_NUMERIC="en_US.UTF-8" > LC_TIME="en_US.UTF-8" > LC_COLLATE="en_US.UTF-8" > LC_MONETARY="en_US.UTF-8" > LC_MESSAGES="en_US.UTF-8" > LC_PAPER="en_US.UTF-8" > LC_NAME="en_US.UTF-8" > LC_ADDRESS="en_US.UTF-8" > LC_TELEPHONE="en_US.UTF-8" > LC_MEASUREMENT="en_US.UTF-8" > LC_IDENTIFICATION="en_US.UTF-8" > LC_ALL= > > > Is this enough to run a pgsql database that is supposed to house utf-8 > content? Yes, those settings are entirely appropriate for CentOS where the system language is U.S. English. You do not need to override them by setting LC_ALL and would gain nothing thereby. The locale preface settings (en_US) determine how internationalized programs display information to users, whether by text presentation format masks or via character collation orders. The important thing for PostgreSQL is that any application host system that generates data stored by a UTF-8 database instance have its character encoding be UTF-8 or provide a means to convert it before submitting it to the DBMS. Otherwise you will get encoding errors when attempting to write data that otherwise appears to the user as perfectly sensible text. Sincerely, -- *** E-Mail is NOT a SECURE channel *** James B. Byrne mailto:ByrneJB@Harte-Lyne.ca Harte & Lyne Limited http://www.harte-lyne.ca 9 Brockley Drive vox: +1 905 561 1241 Hamilton, Ontario fax: +1 905 561 0757 Canada L8E 3C3