Thread: Dumping/Restoring with constraints?

Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 10:21:31

Hello. I have a database dump. With data and schema, which includes
all the constraints and rules. But it seems the pg_dumpall command
does not copy data in such a way that foreign keys are satisfied upon
restoring. Because tables are inter-related, importing them keep
giving errors and eventually no data is imported. Neither pg_dumpall
nor pg_restore seems to have a "without constraints" or "delay
constraints check" type command. What am I missing? Thanks for any
advice.

Re: Dumping/Restoring with constraints?

From

Terry Lee Tucker

Date:

27 August 2008, 10:29:54

On Wednesday 27 August 2008 09:19, Phoenix Kiula wrote:
> Hello. I have a database dump. With data and schema, which includes
> all the constraints and rules. But it seems the pg_dumpall command
> does not copy data in such a way that foreign keys are satisfied upon
> restoring. Because tables are inter-related, importing them keep
> giving errors and eventually no data is imported. Neither pg_dumpall
> nor pg_restore seems to have a "without constraints" or "delay
> constraints check" type command. What am I missing? Thanks for any
> advice.

We have all sorts of constraints and foreign keys and we have never had any
problem with pg_restore related to dumping such that foreign keys are
satisfied. You must have data already in the database that violates the
restraints. You can restore in two phases; that is, by restoring the schema,
and then the data using --disable-triggers. I'm assuming you are doing a
binary dump. See the man page for pg_restore.

HTH
--
Terry Lee Tucker
Turbo's IT Manager
Turbo, division of Ozburn-Hessey Logistics
2251 Jesse Jewell Pkwy NE
Gainesville, GA 30501
Tel: (336) 372-6812  Fax: (336) 372-6812  Cell: (336) 404-6987
terry@turbocorp.com
www.turbocorp.com

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 10:36:48

On Wed, Aug 27, 2008 at 9:29 PM, Terry Lee Tucker <terry@chosen-ones.org> wrote:

> We have all sorts of constraints and foreign keys and we have never had any
> problem with pg_restore related to dumping such that foreign keys are
> satisfied. You must have data already in the database that violates the
> restraints. You can restore in two phases; that is, by restoring the schema,
> and then the data using --disable-triggers. I'm assuming you are doing a
> binary dump. See the man page for pg_restore.

Thanks for this. I don't have any foreign key violations in my
existing database. I think the violation is happening because upon
restoring the table that is being populated checks in another table
that doesn't yet have data.

I am not using pg_restore. I am just using "psql --file=FILENAME"
syntax. Is that an issue?

Re: Dumping/Restoring with constraints?

From

Tom Lane

Date:

27 August 2008, 10:56:23

"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> Hello. I have a database dump. With data and schema, which includes
> all the constraints and rules. But it seems the pg_dumpall command
> does not copy data in such a way that foreign keys are satisfied upon
> restoring. Because tables are inter-related, importing them keep
> giving errors and eventually no data is imported.

This shouldn't be possible in a regular dump, at least not with any
remotely modern version of PG.  However, if you are trying to restore a
data-only dump into an already-created set of tables, it is possible
because pg_dump doesn't know how to order the data loads in that case.
(The problem may in fact be insoluble, since circular foreign key
relationships are allowed.)

The recommended solution is to use a regular schema-and-data dump.
If you really have to separate schema and data, the --disable-triggers
switch might help, though you open yourself to the possibility of
loading inconsistent data that way.

            regards, tom lane

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 11:13:03

> The recommended solution is to use a regular schema-and-data dump.
> If you really have to separate schema and data, the --disable-triggers
> switch might help, though you open yourself to the possibility of
> loading inconsistent data that way.


Thanks Tom.

This is the dump command being used on a 8.2.3 database on Linux:

   $ pg_dumpall > mydb.sql
   $ umask 077
   $ gzip mydb.sql

Then I download the mydb.sql.gz file into my local computer (Mac OSX
with 8.3.3) and unzip it to mydb.sql.

The local database is already created, with all tables and constraints and all.

Here is the command I use to restore:

   $ psql -d mydb -U myuser -h localhost --file=mydb.sql

Is this not the recommended method?

Re: Dumping/Restoring with constraints?

From

Terry Lee Tucker

Date:

27 August 2008, 11:15:14

On Wednesday 27 August 2008 09:36, Phoenix Kiula wrote:
> On Wed, Aug 27, 2008 at 9:29 PM, Terry Lee Tucker <terry@chosen-ones.org>
wrote:
> > We have all sorts of constraints and foreign keys and we have never had
> > any problem with pg_restore related to dumping such that foreign keys are
> > satisfied. You must have data already in the database that violates the
> > restraints. You can restore in two phases; that is, by restoring the
> > schema, and then the data using --disable-triggers. I'm assuming you are
> > doing a binary dump. See the man page for pg_restore.
>
> Thanks for this. I don't have any foreign key violations in my
> existing database. I think the violation is happening because upon
> restoring the table that is being populated checks in another table
> that doesn't yet have data.
>
> I am not using pg_restore. I am just using "psql --file=FILENAME"
> syntax. Is that an issue?

The errors you are having, then, must be related to your own trigger code.  It
sounds like you will need to prevent those triggers from firing and the only
way I know how to accomplish that is to do a binary dump and then use
pg_restore as I indicated earlier. There is no way to disable triggers in
your method referenced above.
--
Terry Lee Tucker
Turbo's IT Manager
Turbo, division of Ozburn-Hessey Logistics
2251 Jesse Jewell Pkwy NE
Gainesville, GA 30501
Tel: (336) 372-6812  Fax: (336) 372-6812  Cell: (336) 404-6987
terry@turbocorp.com
www.turbocorp.com

Re: Dumping/Restoring with constraints?

From

Bill Moran

Date:

27 August 2008, 11:23:49

In response to "Phoenix Kiula" <phoenix.kiula@gmail.com>:

> > The recommended solution is to use a regular schema-and-data dump.
> > If you really have to separate schema and data, the --disable-triggers
> > switch might help, though you open yourself to the possibility of
> > loading inconsistent data that way.
>
> Thanks Tom.
>
> This is the dump command being used on a 8.2.3 database on Linux:
>
>    $ pg_dumpall > mydb.sql
>    $ umask 077
>    $ gzip mydb.sql
>
> Then I download the mydb.sql.gz file into my local computer (Mac OSX
> with 8.3.3) and unzip it to mydb.sql.
>
> The local database is already created, with all tables and constraints and all.

Don't do that.  Do one of the following:
*) Allow the dump file to create all tables.
*) In the early step dump the data only with pg_dumpall --disable-triggers -a

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 11:26:48

Thanks Bill for this recommendation:


> *) In the early step dump the data only with pg_dumpall --disable-triggers -a


Dumb question. Will this kind of pg_dumpall lead to downtime, I mean
is there a database lock during this time?

Thanks!

Re: Dumping/Restoring with constraints?

From

Bill Moran

Date:

27 August 2008, 11:34:14

In response to "Phoenix Kiula" <phoenix.kiula@gmail.com>:

> Thanks Bill for this recommendation:
>
>
> > *) In the early step dump the data only with pg_dumpall --disable-triggers -a
>
>
> Dumb question. Will this kind of pg_dumpall lead to downtime, I mean
> is there a database lock during this time?

No.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 11:40:48

On Wed, Aug 27, 2008 at 10:34 PM, Bill Moran
<wmoran@collaborativefusion.com> wrote:
>>
>>
>> Dumb question. Will this kind of pg_dumpall lead to downtime, I mean
>> is there a database lock during this time?
>
> No.

Thanks. But there seems to be a tangible slowdown of DB operations
during the time that pg_dump is running. Perhaps some of my
postgresql.conf variables are not geared to this. (I vaguely recall
setting it up so that the wal_buffers or checkpoint_segments needed to
be set upwards for maintenance tasks).

My question: is it possible to interactively set up these variables so
that pg_dumpall can work very fast? And behind the scenes, without
slowing stuff down?

Thanks

Re: Dumping/Restoring with constraints?

From

Tom Lane

Date:

27 August 2008, 11:43:35

"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> This is the dump command being used on a 8.2.3 database on Linux:

>    $ pg_dumpall > mydb.sql
>    $ umask 077
>    $ gzip mydb.sql

> Then I download the mydb.sql.gz file into my local computer (Mac OSX
> with 8.3.3) and unzip it to mydb.sql.

That's all fine ...

> The local database is already created, with all tables and constraints and all.

... but this isn't.  A pg_dump script expects to restore into an empty
database.  For pg_dumpall, you shouldn't even have created the
databases, just start from a virgin installation.

            regards, tom lane

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 11:51:53

On Wed, Aug 27, 2008 at 10:42 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> For pg_dumpall, you shouldn't even have created the
> databases, just start from a virgin installation.

Should I have CREATEd the db at least without the table structure?

I dropped the database locally. Entirely. Gone.

Then I tried this:

  $  pg_restore -h localhost mydb.sql

This tells me:

  pg_restore: [archiver] input file does not appear to be a valid archive

What gives?

Re: Dumping/Restoring with constraints?

From

Tom Lane

Date:

27 August 2008, 11:53:32

"Phoenix Kiula" <phoenix.kiula@gmail.com> writes:
> Then I tried this:
>   $  pg_restore -h localhost mydb.sql
> This tells me:
>   pg_restore: [archiver] input file does not appear to be a valid archive

No, you were right the first time: just feed the dump script to psql.

            regards, tom lane

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 12:21:08

On Wed, Aug 27, 2008 at 10:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
> No, you were right the first time: just feed the dump script to psql.
>

Ok. Tried that. Because there's no database, I have to execute the
psql command as "postgres" otherwise it doesn't work.

So here's my command:

  $ psql -d postgres -U postgres -h localhost < mydb.sql

This seems to be it. This is what I should be executing, except that
it spews out many errors, like:

   ERROR:  invalid byte sequence for encoding "UTF8": 0x80

This prevents my main table from being copied - Why can't the dump and
the restore just copy the file as-is, including the encoding and such?
What am I not doing right?

Thanks

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

27 August 2008, 12:32:51

On Wed, Aug 27, 2008 at 10:40:41PM +0800, Phoenix Kiula wrote:

> Thanks. But there seems to be a tangible slowdown of DB operations
> during the time that pg_dump is running.

Yes.  Pg_dump copies all the data out, so it puts load on your
database and disks.

A
--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

27 August 2008, 12:41:20

On Wed, Aug 27, 2008 at 11:20:55PM +0800, Phoenix Kiula wrote:

> This seems to be it. This is what I should be executing, except that
> it spews out many errors, like:
>
>    ERROR:  invalid byte sequence for encoding "UTF8": 0x80

You have bad data in your database.  Apparently, you have an encoding
of UTF-8, but you have data in there that's not UTF-8 data.  I'll bet
your other encoding is SQL_ASCII.

> This prevents my main table from being copied - Why can't the dump and
> the restore just copy the file as-is, including the encoding and such?
> What am I not doing right?

My bet is that you did initdb on one system with a locale of C and on
another with a locale of utf-8 (somehow).  You can use pg_controldata
to find out: run it against the data areas on each system.

If I'm right, then you probably want to run initidb again on the
target system.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

Bill Moran

Date:

27 August 2008, 12:41:30

In response to "Phoenix Kiula" <phoenix.kiula@gmail.com>:

> On Wed, Aug 27, 2008 at 10:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> >
> > No, you were right the first time: just feed the dump script to psql.
> >
>
> Ok. Tried that. Because there's no database, I have to execute the
> psql command as "postgres" otherwise it doesn't work.
>
> So here's my command:
>
>   $ psql -d postgres -U postgres -h localhost < mydb.sql
>
> This seems to be it. This is what I should be executing, except that
> it spews out many errors, like:
>
>    ERROR:  invalid byte sequence for encoding "UTF8": 0x80

In psql, do a \l on both the original and the DB you're restoring to.
The encoding should be the same, if it's not, then fix that before
restoring.  If you're using pg_dump (which you don't mention ... it's
getting difficult to follow what you're doing in this thread) then
you can also use the -C option to have pg_dump add a CREATE DATABASE
statement to the dump file for you.

If you actually want to switch database encodings, that's an entirely
different question thread unto itself.

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 12:54:35

Ok. Here is what it is.

OLD DB:
Since 2002. May contain non-UTF8 data. But I thought I had modified it
all when I changed it to UTF-8 pgsql database (it was originally
Mysql). The database works very well on a very busy website.
Everything on that website is now UTF-8. I wish to mirror this
database locally on my home machine. Server is linux with 6GB ram and
pgsql 8.2.3, home machine is osx with imac of 4GM ram and 8.3.3 (new
pg).

NEW DB:
Just installed fresh. There's nothing in it. I can basically wipe all
data out.

WHAT I AM DOING:
1. On the server, I am executing "pg_dumpall > mydb.out". Simple.
2. FTP that mydb.out file to local home machine.
3. Here, locally, I do this:    "psql -h localhost -d postgres -U
postgres -f mydb.out". This is what gives me the error.

MY QUESTION:
What can I do to:
(a) Make sure all data on pgsql on server (OLD DB) is utf-8. Is there
a command I can execute to convert all data? I thought I had converted
it all to utf-8 using PHP sometime ago, which went through each and
every row and column!
(b) Once that data is utf8-ed, how can I bring it home and have a
mirror of the db.


Thanks.

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

27 August 2008, 13:43:34

On Wed, Aug 27, 2008 at 11:54:27PM +0800, Phoenix Kiula wrote:

> OLD DB:
> Since 2002. May contain non-UTF8 data. But I thought I had modified it
> all when I changed it to UTF-8 pgsql database (it was originally
> Mysql). The database works very well on a very busy website.
> Everything on that website is now UTF-8. I wish to mirror this
> database locally on my home machine. Server is linux with 6GB ram and
> pgsql 8.2.3, home machine is osx with imac of 4GM ram and 8.3.3 (new
> pg).

Somehow, you're getting non-UTF-8 chars in there, either because your
conversion didn't work, or because there's still bugs in your
application that send non-UTF-8 data.  If your database encoding is
not UTF-8, then it is possible to get non-UTF-8 data anyway.  That's
why people asked about the database encoding.  SQL_ASCII, please note,
does not enforce that you're in the bottom 7 bits: it'll take anything
you put in there.  So if someone put (say) ISO 8859-1 in, you'll get
in trouble.

> WHAT I AM DOING:
> 1. On the server, I am executing "pg_dumpall > mydb.out". Simple.
> 2. FTP that mydb.out file to local home machine.
> 3. Here, locally, I do this:    "psql -h localhost -d postgres -U
> postgres -f mydb.out". This is what gives me the error.

Right.  So the file includes data that doesn't match the encoding of
the target database.  AFAIR -- and my memory's not what it used to be,
so check the release notes -- the UTF-8 checking in 8.2 was as good as
in 8.3.  One good test of this would be to install 8.2 on your home
machine, and try restoring that too.  If it works, then we know more.

> MY QUESTION:
> What can I do to:
> (a) Make sure all data on pgsql on server (OLD DB) is utf-8. Is there
> a command I can execute to convert all data? I thought I had converted
> it all to utf-8 using PHP sometime ago, which went through each and
> every row and column!

The usual advice is to use iconv.  Your Mac should have it installed.

> (b) Once that data is utf8-ed, how can I bring it home and have a
> mirror of the db.

If you run iconv on the data dump before you load it, then it should
work.  This is not a trivial job, however.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

"Scott Marlowe"

Date:

27 August 2008, 14:42:01

On Wed, Aug 27, 2008 at 8:40 AM, Phoenix Kiula <phoenix.kiula@gmail.com> wrote:
> Thanks. But there seems to be a tangible slowdown of DB operations
> during the time that pg_dump is running. Perhaps some of my
> postgresql.conf variables are not geared to this. (I vaguely recall
> setting it up so that the wal_buffers or checkpoint_segments needed to
> be set upwards for maintenance tasks).
>
> My question: is it possible to interactively set up these variables so
> that pg_dumpall can work very fast? And behind the scenes, without
> slowing stuff down?

Nope, it's not a software problem, it's a hardware problem.  You're
drive subsystem can only deliver data so fast.  The fix is usually
better hardware.  Big RAID controller and lots of drives.

Re: Dumping/Restoring with constraints?

From

Tino Wildenhain

Date:

27 August 2008, 17:42:26

Hi,

Phoenix Kiula wrote:
> Hello. I have a database dump. With data and schema, which includes
> all the constraints and rules. But it seems the pg_dumpall command
> does not copy data in such a way that foreign keys are satisfied upon
> restoring. Because tables are inter-related, importing them keep
> giving errors and eventually no data is imported. Neither pg_dumpall
> nor pg_restore seems to have a "without constraints" or "delay
> constraints check" type command. What am I missing? Thanks for any
> advice.

Apart from disabling triggers temporarely, if you have enough
space in xlog, you could try to add BEGIN work; to the start
of the restore SQL and COMMIT; a the end.

Regards
Tino

Attachment

smime.p7s

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

27 August 2008, 20:52:06

Is there a mechanism to check exactly which row may have characters
that are not UTF8? I am trying with this kind of a command:

select id from employee
where modify_date between '2008-03-01' and '2008-07-01'
and joint_field_of_name_etc <> convert(joint_field_of_name_etc, 'UTF8')

This is of course not the best way of doing so. Is there a better
system-provided way of checking for rows that may not be UTF8?  Or to
even have a constraint to this effect, to disallow any non-UTF8 data
from getting in there?

Thanks

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

28 August 2008, 14:47:56

On Thu, Aug 28, 2008 at 07:51:57AM +0800, Phoenix Kiula wrote:
> system-provided way of checking for rows that may not be UTF8?  Or to
> even have a constraint to this effect, to disallow any non-UTF8 data
> from getting in there?

The latter should be in effect if your database encoding is UTF-8.
What's the database encoding?

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

29 August 2008, 02:53:33

On 8/29/08, Andrew Sullivan <ajs@commandprompt.com> wrote:
>
> The latter should be in effect if your database encoding is UTF-8.
>  What's the database encoding?

My database encoding on BOTH the sides is UTF8. The dumped DB is UTF8.
The one I am importing into is also UTF8. So why is it showing me this
error? Do I also have to look at stuff like "collation"?

Also, how can I ensure that my dumped database on the server has only
UTF8 data? What constraint or function can I use?

Thanks

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

29 August 2008, 10:34:56

On Fri, Aug 29, 2008 at 01:53:28PM +0800, Phoenix Kiula wrote:

> My database encoding on BOTH the sides is UTF8. The dumped DB is UTF8.

Are you sure?  I know people asked up-thread for the output of \l, but
I haven't seen it.  If your encoding is UTF-8, then I don't understand
how you got non-UTF-8 characters in there, unless there's some bug in
the 8.2 series of which I'm unaware.

> The one I am importing into is also UTF8. So why is it showing me this
> error? Do I also have to look at stuff like "collation"?

You shouldn't have to.  That's a constraint on sorting and such like.
It doesn't control what can get into the database.

Hmm.  I wonder if the problem is the locale you're using for pg_dump
vs. what you're using when handing the data back in.  If the locale
were different, it might be possible that your client_encoding was
different.  I have no idea if this is the case, but you might want to
try it.  Check your locale at the command line on each system.

> Also, how can I ensure that my dumped database on the server has only
> UTF8 data? What constraint or function can I use?

As I said before, if the database encoding is UTF-8, then it's
supposed to be _impossible_ that you get non-UTF-8 data in there.
That's the whole point of having the database encoding.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

29 August 2008, 11:26:21

Thanks Andrew.

On the server (the DB to be dumped) everything is "UTF8".

On my home server (where I would like to mirror the DB), this is the output:


=# \l
            List of databases
   Name    |      Owner      | Encoding
-----------+-----------------+-----------
 postgres  | postgres        | SQL_ASCII
 pkiula    | pkiula_pkiula   | UTF8
 template0 | postgres        | SQL_ASCII
 template1 | postgres        | SQL_ASCII
(4 rows)



This is a fresh install as you can see. The database into which I am
importing ("pkiula") is in fact listed as UTF8! Is this not enough?

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

29 August 2008, 11:53:13

On Fri, Aug 29, 2008 at 10:26:14PM +0800, Phoenix Kiula wrote:
> Thanks Andrew.
>
> On the server (the DB to be dumped) everything is "UTF8".
>
> On my home server (where I would like to mirror the DB), this is the output:

Hrm.  Well, what about the locale, as I suggested?  I have no idea if
it's relevant here, but give it a whack.  Other than that, I'm not
sure.  You might want to troll the release notes to see if there was
an encoding bug fixed in the intermediate releases between
8.2.whatever-you-converted-on and 8.2.current.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

29 August 2008, 12:16:54

On 8/29/08, Andrew Sullivan <ajs@commandprompt.com> wrote:
>
>
> Hrm.  Well, what about the locale, as I suggested?  I have no idea if


How should I check for the locale?

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

29 August 2008, 12:23:05

On Fri, Aug 29, 2008 at 11:12:35PM +0800, Phoenix Kiula wrote:
>
> How should I check for the locale?

As the user that you issue the pg_dump with on the machine you
generated it, and as the user you're trying to use to restore the
data, at the command line type

 $ locale

You end up with something like this:

stovepipe:~ ajs$ locale
LANG="en_CA.UTF-8"
LC_COLLATE="en_CA.UTF-8"
LC_CTYPE="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME="en_CA.UTF-8"
LC_ALL=

I can't remember whether the client encoding is going to be affected
by this during dump and restore, and whether in these versions that is
captured in the dump file.

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

Adrian Klaver

Date:

29 August 2008, 12:29:12

On Friday 29 August 2008 8:12:35 am Phoenix Kiula wrote:
> On 8/29/08, Andrew Sullivan <ajs@commandprompt.com> wrote:
> > Hrm.  Well, what about the locale, as I suggested?  I have no idea if
>
> How should I check for the locale?

On Linux:

aklaver@tucker:~$ locale
LANG=en_US.UTF-8
LANGUAGE=en
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=


--
Adrian Klaver
aklaver@comcast.net

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

29 August 2008, 12:33:44

Thanks so much! I think we may be on to something:


On the pg_dump machine:

LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=



On the pg_restore machine:


LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=



So how can I change the locale on my local OSX machine? This could be
it. Thanks!

Re: Dumping/Restoring with constraints?

From

Andrew Sullivan

Date:

29 August 2008, 12:37:19

On Fri, Aug 29, 2008 at 11:29:55PM +0800, Phoenix Kiula wrote:
> So how can I change the locale on my local OSX machine? This could be
> it. Thanks!

Before you start restoring, put a LANG environment variable.  So
something like

LANG=en_US.UTF-8 psql

or else

LANG=en_US.UTF-8; export LANG
[other stuff]

A

--
Andrew Sullivan
ajs@commandprompt.com
+1 503 667 4564 x104
http://www.commandprompt.com/

Re: Dumping/Restoring with constraints?

From

Adrian Klaver

Date:

29 August 2008, 13:05:06

On Friday 29 August 2008 8:29:55 am Phoenix Kiula wrote:
> Thanks so much! I think we may be on to something:
>
>
> On the pg_dump machine:
>
> LANG=en_US.UTF-8
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
>
>
>
> On the pg_restore machine:
>
>
> LANG=
> LC_COLLATE="C"
> LC_CTYPE="UTF-8"
> LC_MESSAGES="C"
> LC_MONETARY="C"
> LC_NUMERIC="C"
> LC_TIME="C"
> LC_ALL=
>
>
>
> So how can I change the locale on my local OSX machine? This could be
> it. Thanks!

Might also be useful to do SHOW ALL from within the database cluster you have
on your restore machine. Will show what choices initdb made.

test=# SHOW all;
<Snip>
 lc_collate                      | en_US.UTF-8
 lc_ctype                        | en_US.UTF-8
 lc_messages                     | en_US.UTF-8
 lc_monetary                     | en_US.UTF-8
 lc_numeric                      | en_US.UTF-8
 lc_time                         | en_US.UTF-8
<Snip>
--
Adrian Klaver
aklaver@comcast.net

Re: Dumping/Restoring with constraints?

From

"Phoenix Kiula"

Date:

29 August 2008, 13:07:36

On 8/30/08, Adrian Klaver <aklaver@comcast.net> wrote:
>
> Might also be useful to do SHOW ALL from within the database cluster you have
>  on your restore machine. Will show what choices initdb made.
>
>  test=# SHOW all;
>  <Snip>
>   lc_collate                      | en_US.UTF-8
>   lc_ctype                        | en_US.UTF-8
>   lc_messages                     | en_US.UTF-8
>   lc_monetary                     | en_US.UTF-8
>   lc_numeric                      | en_US.UTF-8
>   lc_time                         | en_US.UTF-8
>  <Snip>



Thanks. I did this and LC_COLLATE etc have the values "C". How can I
change this to UTF8?

Re: Dumping/Restoring with constraints?

From

aklaver@comcast.net (Adrian Klaver)

Date:

29 August 2008, 14:22:35

-------------- Original message ----------------------
From: "Phoenix Kiula" <phoenix.kiula@gmail.com>
> On 8/30/08, Adrian Klaver <aklaver@comcast.net> wrote:
> >
> > Might also be useful to do SHOW ALL from within the database cluster you have
> >  on your restore machine. Will show what choices initdb made.
> >
> >  test=# SHOW all;
> >  <Snip>
> >   lc_collate                      | en_US.UTF-8
> >   lc_ctype                        | en_US.UTF-8
> >   lc_messages                     | en_US.UTF-8
> >   lc_monetary                     | en_US.UTF-8
> >   lc_numeric                      | en_US.UTF-8
> >   lc_time                         | en_US.UTF-8
> >  <Snip>
>
>
>
> Thanks. I did this and LC_COLLATE etc have the values "C". How can I
> change this to UTF8?
>

You have to start from scratch. LC_COLLATE is set at the initdb stage and cannot be changed.

From the manual-

http://www.postgresql.org/docs/8.2/interactive/multibyte.html
"
Important:  Although you can specify any encoding you want for a database, it is unwise to choose an encoding that is
notwhat is expected by the locale you have selected. The LC_COLLATE and LC_CTYPE settings imply a particular encoding,
andlocale-dependent operations (such as sorting) are likely to misinterpret data that is in an incompatible encoding. 

Since these locale settings are frozen by initdb, the apparent flexibility to use different encodings in different
databasesof a cluster is more theoretical than real. It is likely that these mechanisms will be revisited in future
versionsof PostgreSQL. 

One way to use multiple encodings safely is to set the locale to C or POSIX during initdb, thus disabling any real
localeawareness. " 

From above the presence of a C locale should not  be causing a problem.  In fact when I try something similar on my
machineI do not have any problems, but I am working on a Linux box. There have been previous postings dealing with
locale/encodingissues on OS X. I don't know if a 'answer' was found. You might want to start a new thread 'OS X
encoding/locale'to attract the attention of those that have dealt with the problem before. 

--
Adrian Klaver
aklaver@comcast.net

Re: Dumping/Restoring with constraints?

From

Marco Colombo

Date:

29 August 2008, 16:45:05

Phoenix Kiula wrote:
> Thanks Andrew.
>
> On the server (the DB to be dumped) everything is "UTF8".
>
> On my home server (where I would like to mirror the DB), this is the output:
>
>
> =# \l
>             List of databases
>    Name    |      Owner      | Encoding
> -----------+-----------------+-----------
>  postgres  | postgres        | SQL_ASCII
>  pkiula    | pkiula_pkiula   | UTF8
>  template0 | postgres        | SQL_ASCII
>  template1 | postgres        | SQL_ASCII
> (4 rows)
>
>
>
> This is a fresh install as you can see. The database into which I am
> importing ("pkiula") is in fact listed as UTF8! Is this not enough?
>

You said you're getting these errors:
ERROR:  invalid byte sequence for encoding "UTF8": 0x80

those 0x80 bytes are inside the mydb.sql file, you may find it easier to
  look for them there and identify the offending string(s). Try (on the
linux machine):

zcat mydb.sql.gz | iconv -f utf8 > /dev/null

should tell you something like:

illegal input sequence at position xxx

BTW, 0x80 is usually found in windows encoding, such as windows-1250,
where it stands for the EURO symbol:

echo -n "€" | iconv -t windows-1250 | hexdump -C
00000000  80                                                |.|
00000001

FYI, you *can* get non UTF-8 data from an UTF-8 database, if (and only
if) your client encoding is something different (either because you
explicitly set it so, or because of your client defaults).

Likewise, you can insert non UTF-8 data (such as your mydb.sql) into an
UTF-8 database, provided you set your client encoding accordingly.
PostgreSQL clients handle encoding conversions, but there's no way to
guess (reliabily) the encoding of a text file.

OTOH, from a SQL_ASCII database you can get all sort of data, even mixed
  encoding text (which you need to fix somehow). If your mydb.sql
contains data from a SQL_ASCII database, you simply know nothing about
the encoding.

I have seen SQL_ASCII databases containg data inserted from HTTP forms,
both in UTF-8 and windows-1250 encoding. Displaying, dumping, restoring
that correctly is impossible, you need to fix it somehow before
processing it as text.

.TM.