Thread: pg_dump on local Windows, pg_restore on Linux?

pg_dump on local Windows, pg_restore on Linux?

From
"Phoenix Kiula"
Date:
Couple of questions with porting:

1. I have been playing around with my databases locally on Win XP so
as not to hurt our website traffic. Now I would like to move the
database to a Linux CentOS server. Can I use pg_dump on Windows and
pg_restore it on Linux? If so, any tips on what I should keep in mind
(e.g., manual seems to suggest that pg_restore prefers tar gzipped
format...but I'm not sure if Windows can generate this?)

2. I would like my database to be UTF-8. I issue the command

CREATE DATABASE mydb OWNER me ENCODING 'utf8';

Should I add anything else to it, such as collation? I did not find
any option for that in here:
http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html

3. Also, just to confirm, if I have utf-8 database, then all tables in
it should be able to take utf-8 data? I would just like these tables
to take whatever I send to them. No error checking or encoding
checking. Can I disable the painful error that PG keeps throwing if
even a single erroneous byte shows up? I'd rather have 'garbage data'
than not go through with the query.

Thanks for any tips!

Re: pg_dump on local Windows, pg_restore on Linux?

From
"Scott Marlowe"
Date:
On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Couple of questions with porting:
>
> 1. I have been playing around with my databases locally on Win XP so
> as not to hurt our website traffic. Now I would like to move the
> database to a Linux CentOS server. Can I use pg_dump on Windows and
> pg_restore it on Linux? If so, any tips on what I should keep in mind
> (e.g., manual seems to suggest that pg_restore prefers tar gzipped
> format...but I'm not sure if Windows can generate this?)
>
> 2. I would like my database to be UTF-8. I issue the command
>
> CREATE DATABASE mydb OWNER me ENCODING 'utf8';
>
> Should I add anything else to it, such as collation? I did not find
> any option for that in here:
> http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html
>
> 3. Also, just to confirm, if I have utf-8 database, then all tables in
> it should be able to take utf-8 data? I would just like these tables
> to take whatever I send to them. No error checking or encoding
> checking. Can I disable the painful error that PG keeps throwing if
> even a single erroneous byte shows up? I'd rather have 'garbage data'
> than not go through with the query.

Requirements 2 and 3 are exclusive.  Either you want your database to
be UTF-8, which means that invalid byte sequences should be rejected
because they AREN'T utf-8, or you want your database to swallow
whatever you throw at it, in which case, that's not UTF-8.

Re: pg_dump on local Windows, pg_restore on Linux?

From
"Phoenix Kiula"
Date:
On 16/08/07, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On 8/15/07, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> > Couple of questions with porting:
> >
> > 1. I have been playing around with my databases locally on Win XP so
> > as not to hurt our website traffic. Now I would like to move the
> > database to a Linux CentOS server. Can I use pg_dump on Windows and
> > pg_restore it on Linux? If so, any tips on what I should keep in mind
> > (e.g., manual seems to suggest that pg_restore prefers tar gzipped
> > format...but I'm not sure if Windows can generate this?)
> >
> > 2. I would like my database to be UTF-8. I issue the command
> >
> > CREATE DATABASE mydb OWNER me ENCODING 'utf8';
> >
> > Should I add anything else to it, such as collation? I did not find
> > any option for that in here:
> > http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html
> >
> > 3. Also, just to confirm, if I have utf-8 database, then all tables in
> > it should be able to take utf-8 data? I would just like these tables
> > to take whatever I send to them. No error checking or encoding
> > checking. Can I disable the painful error that PG keeps throwing if
> > even a single erroneous byte shows up? I'd rather have 'garbage data'
> > than not go through with the query.
>
> Requirements 2 and 3 are exclusive.  Either you want your database to
> be UTF-8, which means that invalid byte sequences should be rejected
> because they AREN'T utf-8, or you want your database to swallow
> whatever you throw at it, in which case, that's not UTF-8.



Thanks. Is there an encoding that is so flexible that it will silently
accept whatever I send to it without throwing an exception?

Re: pg_dump on local Windows, pg_restore on Linux?

From
"Leif B. Kristensen"
Date:
On Wednesday 15. August 2007, Phoenix Kiula wrote:
>Couple of questions with porting:
>
>1. I have been playing around with my databases locally on Win XP so
>as not to hurt our website traffic. Now I would like to move the
>database to a Linux CentOS server. Can I use pg_dump on Windows and
>pg_restore it on Linux? If so, any tips on what I should keep in mind
>(e.g., manual seems to suggest that pg_restore prefers tar gzipped
>format...but I'm not sure if Windows can generate this?)

This is how I import my output from pg_dump:

    createdb --encoding=UNICODE $DB
    psql -U postgres -d $DB -f $INFILE > restore.log 2>&1

This way, you just feed psql with an uncompressed dump file. I've never
used pg_restore.

>2. I would like my database to be UTF-8. I issue the command
>
>CREATE DATABASE mydb OWNER me ENCODING 'utf8';
>
>Should I add anything else to it, such as collation? I did not find
>any option for that in here:
>http://www.postgresql.org/docs/8.2/static/sql-createdatabase.html

At least on a *nix system, collation is based on the value of the LC_ALL
environment variable at dbinit time. There's nothing you can do about
it in a live database. IMO that's a little awkward, and is what finally
made me change the global from ISO-8859-1 to UTF-8 on my three Gentoo
Linux machines.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE
My Jazz Jukebox: http://www.last.fm/user/leifbk/

Re: pg_dump on local Windows, pg_restore on Linux?

From
"Phoenix Kiula"
Date:
> At least on a *nix system, collation is based on the value of the LC_ALL
> environment variable at dbinit time. There's nothing you can do about
> it in a live database. IMO that's a little awkward, and is what finally
> made me change the global from ISO-8859-1 to UTF-8 on my three Gentoo
> Linux machines.


This is great info, thanks. Could you let me know how I could change
the global values of "LC_ALL"?  I am on Linux too, just CentOS, but I
suppose it should be the same or similar?

Re: pg_dump on local Windows, pg_restore on Linux?

From
Tom Lane
Date:
"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> Thanks. Is there an encoding that is so flexible that it will silently
> accept whatever I send to it without throwing an exception?

SQL_ASCII does that.  Whether it's a good idea to use it is
questionable.  One thing to think about is that you will be unable to
provide translation to and from different client encodings --- the
database will always just regurgitate the bytes it was given, since it
doesn't really know what encoding they are in.

            regards, tom lane

Re: pg_dump on local Windows, pg_restore on Linux?

From
"Leif B. Kristensen"
Date:
On Wednesday 15. August 2007, Phoenix Kiula wrote:
>> At least on a *nix system, collation is based on the value of the
>> LC_ALL environment variable at dbinit time. There's nothing you can
>> do about it in a live database. IMO that's a little awkward, and is
>> what finally made me change the global from ISO-8859-1 to UTF-8 on
>> my three Gentoo Linux machines.
>
>This is great info, thanks. Could you let me know how I could change
>the global values of "LC_ALL"?  I am on Linux too, just CentOS, but I
>suppose it should be the same or similar?

I don't have the foggiest idea how it's done in CentOS. In Gentoo, you
just edit the contents of /etc/env.d/02locale:

balapapa ~ # cat /etc/env.d/02locale
LC_ALL="nb_NO.UTF-8"
LANG=""
LC_CTYPE="nb_NO.UTF.8"
LC_NUMERIC="nb_NO.UTF.8"
LC_TIME="nb_NO.UTF.8"
LC_COLLATE="nb_NO.UTF.8"
LC_MONETARY="nb_NO.UTF.8"
LC_PAPER="nb_NO.UTF.8"
LC_NAME="nb_NO.UTF.8"
LC_ADDRESS="nb_NO.UTF.8"
LC_TELEPHONE="nb_NO.UTF.8"
LC_MEASUREMENT="nb_NO.UTF.8"
LC_IDENTIFICATION="nb_NO.UTF.8"

To update the environment settings globally, the Gentoo method is to
issue the command

"env-update && source /etc/profile"

from root. But you should really Read The Fine Manual about this. If you
for instance have filenames with non-ASCII characters in them, prepare
yourself for some interesting challenges.
--
Leif Biberg Kristensen | Registered Linux User #338009
http://solumslekt.org/ | Cruising with Gentoo/KDE
My Jazz Jukebox: http://www.last.fm/user/leifbk/

Re: pg_dump on local Windows, pg_restore on Linux?

From
"James B. Byrne"
Date:
Date: Wed, 15 Aug 2007 20:31:45 +0200
From: "Leif B. Kristensen" <leif@solumslekt.org>
To: pgsql-general@postgresql.org
Subject: Re: pg_dump on local Windows, pg_restore on Linux?
Message-ID: <200708152031.45917.leif@solumslekt.org>

On Wednesday 15. August 2007, Phoenix Kiula wrote:
>
>This is great info, thanks. Could you let me know how I could change
>the global values of "LC_ALL"?  I am on Linux too, just CentOS, but I
>suppose it should be the same or similar?

See: man local and man 1p locale and man 1p localedef and /etc/sysconfig/i18n

Do NOT manually set LC_ALL on a CentOS system unless you know exactly what
you are doing and what programs will be impacted.  If set then LC_ALL
overrides all of the individual LC_ settings.


Regards,

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3


Re: pg_dump on local Windows, pg_restore on Linux?

From
"Phoenix Kiula"
Date:
On 16/08/07, James B. Byrne <byrnejb@harte-lyne.ca> wrote:
>
> Date: Wed, 15 Aug 2007 20:31:45 +0200
> From: "Leif B. Kristensen" <leif@solumslekt.org>
> To: pgsql-general@postgresql.org
> Subject: Re: pg_dump on local Windows, pg_restore on Linux?
> Message-ID: <200708152031.45917.leif@solumslekt.org>
>
> On Wednesday 15. August 2007, Phoenix Kiula wrote:
> >
> >This is great info, thanks. Could you let me know how I could change
> >the global values of "LC_ALL"?  I am on Linux too, just CentOS, but I
> >suppose it should be the same or similar?
>
> See: man local and man 1p locale and man 1p localedef and /etc/sysconfig/i18n
>
> Do NOT manually set LC_ALL on a CentOS system unless you know exactly what
> you are doing and what programs will be impacted.  If set then LC_ALL
> overrides all of the individual LC_ settings.
>



Thanks. Here's my locale information:

> locale
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=


Is this enough to run a pgsql database that is supposed to house utf-8
content? Or should I somehow make sure LC_ALL carries the value of
'en_US.UTF-8' as well? If not manually, how would I do it, and how can
I trace what other programs may be impacted. It sounds like an
impossible task given that many applications/servers run on this
machine, so I am hoping that either there is a workaround for
postgresql, or that postgresql will work with just the above
parameters as they are.

TIA!

UTF-8 encoding

From
"James B. Byrne"
Date:
On Wed, August 15, 2007 21:15, Phoenix Kiula wrote:
>
> Thanks. Here's my locale information:
>
>> locale
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
>
>
> Is this enough to run a pgsql database that is supposed to house utf-8
> content?

Yes, those settings are entirely appropriate for CentOS where the system
language is U.S. English. You do not need to override them by setting
LC_ALL and would gain nothing thereby.

The locale preface settings (en_US) determine how internationalized
programs display information to users, whether by text presentation format
masks or via character collation orders.  The important thing for
PostgreSQL is that any application host system that generates data stored
by a UTF-8 database instance have its character encoding be UTF-8 or
provide a means to convert it before submitting it to the DBMS.  Otherwise
you will get encoding errors when attempting to write data that otherwise
appears to the user as perfectly sensible text.


Sincerely,

--
***          E-Mail is NOT a SECURE channel          ***
James B. Byrne                mailto:ByrneJB@Harte-Lyne.ca
Harte & Lyne Limited          http://www.harte-lyne.ca
9 Brockley Drive              vox: +1 905 561 1241
Hamilton, Ontario             fax: +1 905 561 0757
Canada  L8E 3C3