Re: Problem loading pg_dump file - Mailing list pgsql-general

From Mason Hale
Subject Re: Problem loading pg_dump file
Date
Msg-id 8bca3aa10701300957v1c45dd37ia9c7532e3f5e6a9c@mail.gmail.com
Whole thread Raw
In response to Re: Problem loading pg_dump file  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Problem loading pg_dump file
Re: Problem loading pg_dump file
List pgsql-general
I've done a bit more digging into this, here's what I've found --

The text db dump file is much too big to edit by hand (~37GB), so I ran the import in single-step mode:

psql -U bdu -s bdu_01_21_07 < bduprod_2-01-21-07

Here's the first error I run across:

***(Single step mode: verify command)*******************************************
COPY blocked_info (id, created_at, reason_code, note, do_count_links) FROM stdin;
***(press return to proceed or enter x and return to cancel)********************
ERROR:  invalid input syntax for integer: "2006-10-09 22:55:58"
CONTEXT:  COPY blocked_info, line 1, column id: "2006-10-09 22:55:58"

The part of the script that is responsible for this error is:

COPY blocked_info (id, created_at, reason_code, note, do_count_links) FROM stdin;
1       2006-10-09 22:55:58     0       \N      \N
2       2006-10-09 22:55:58     0       \N      \N
3       2006-10-09 22:55:58     0       \N      \N
4       2006-10-09 22:55:58     0       \N      \N
5       2006-10-09 22:55:58     0       \N      \N
6       2006-10-10 13:03:27     \N      \N      \N
7       2006-10-10 13:06:28     \N      \N      \N
8       2006-10-10 13:09:37     \N      \N      \N
9       2006-10-27 22:39:49     \N      \N      \N
10      2006-10-27 22:39:58     \N      \N      \N
11      2006-11-09 04:48:18     \N      \N      \N
12      2006-11-30 16:03:58     \N      \N      f
13      2006-12-02 15:11:42     \N      \N      f
14      2006-12-04 12:31:20     \N      \N      f
15      2006-12-05 00:11:30     \N      \N      f
16      2006-12-05 00:15:45     \N      \N      f
17      2006-12-05 03:02:29     \N      \N      f
18      2006-12-05 12:03:10     \N      \N      f
20      2006-12-05 16:20:15     \N      \N      f
19      2006-12-05 16:20:15     \N      \N      f
21      2006-12-16 16:13:24     \N      \N      f
22      2006-12-19 16:06:43     \N      \N      f
23      2006-12-20 01:33:51     \N      \N      f
24      2006-12-21 18:38:56     \N      \N      f
25      2006-12-22 15:06:15     \N      \N      f
26      2006-12-23 09:43:17     \N      \N      f
27      2007-01-04 12:37:50     \N      \N      f
28      2007-01-08 17:33:26     \N      \N      f
29      2007-01-08 17:38:47     \N      \N      f
30      2007-01-13 15:32:34     \N      \N      f
\.

Now, I'm not too familiar with the copy command, but the above looks correct to me.

When I let subsequent statements run, I get similar errors such as:

ERROR:  invalid input syntax for integer: "0.261191951289869"
ERROR:  invalid input syntax for type real: "2006-08-23 22:54: 11.24"
ERROR:  invalid input syntax for integer: "0.99655325708605502"
ERROR:  invalid input syntax for type boolean: "1401353"

Any ideas what is causing this, or how to track this issue down?
------------------

In addition to the above, I did a new pg_dump using the -Fc option to generate an archive in binary/compressed format.

I also verified that the versions of pg_dump and pg_restore on both machines where the same ( 8.1.5).

I then copy that file over to our dev server an load it doing a pg_restore -v (verbose mode). This time, I get an error when loading data into a different table in the database (not the blocked_info table mentioned above).

The error message is:
pg_restore: restoring data for table "rawfeed"
pg_restore: [custom archiver] could not read data block -- expected 4096, got 3448
pg_restore: *** aborted because of error

I've tried running both with and without the -e (stop on errors) option, and either way the pg_restore stops when it hits this error.

When I list the database contents using pg_restore -l  -- the table where error occurs is table listed as #14 out of 23. This particular table includes a bytea column that contains gzipped data.

If I try to selectively restore any of the individual tables 1-13 (as listed by pg_restore -l), using the -t option, I do not encounter any errors. But if I try to restore any individual tables 14-23, I get the same error as above (but with a different table name).

In addition, if I try to generate a sql script from pg_restore using the -f option, for any of the individual tables 14-23, I get the same error: "could not read data block -- expected 4096, got 3448".

> pg_restore -t rawfeed -f rawfeed.sql bduprod_2-01-25-07
pg_restore: [custom archiver] could not read data block -- expected 4096, got 3448

In the case of table 14 (rawfeed), an output file is generated, it is mid-way through a copy command. I can run that partially generated script against the db without error. I can also verify that the last record in the script is successfully added to the db.

Note: this generated file to restore this one table is huge: 9.2G by itself. Is there an upper limit to the amount of data copy can load at one time?

If I generate a sql script for any of tables 15-23, it takes a long time for the command to finally return the same error (about the same amount of time to run the 'pg_restore -t rawfeed ...' variation above), and when it does an output file is generated, but the file only contains the schema creation commands and the first line of the copy command. It does not contain any of the table data. This is true even for a table that contains only 1 row of data.

----

At this point I'm not sure how to proceed.

My suspicion is still that this has something to do with the encoding of data in the database. So I'm trying to narrow down the location of the problem so that I can try to clean it up.

From trying to load the data from the pg_dump text format export, it looks like something is wrong with the copy command when loading the blocked_info table. However when trying to load the pg_dump -Fc binary format export, it appears there's some problem with the rawfeed table.

Any ideas on what to try next will be greatly appreciated.

thanks in advance,
Mason



On 1/25/07, Tom Lane < tgl@sss.pgh.pa.us> wrote:
"Mason Hale" < masonhale@gmail.com> writes:
> I'm having a problem loading a recent pg_dump of our production database.

> However, when trying to load the file for this month's snapshot, we are (for
> the first time) seeing a slew of errors, such as:

> invalid command \N
> invalid command \N
> ERROR:  syntax error at or near ""/>\n  <img alt="" style="" at character 1
> LINE 1: "/>\n  <img alt="" style="border: 0;
>         ^

You need to look at the very first error, and ignore the slew following
it.  What seems to have happened here is that an error in the COPY command
caused psql to fall out of copy mode (or perhaps never enter it in the
first place) and start trying to treat lines of COPY data as SQL
commands.  So, tons of noise.  What was the first error?

                        regards, tom lane

pgsql-general by date:

Previous
From: "Andrus"
Date:
Subject: Re: How to allow users to log on only from my application
Next
From: Erik Jones
Date:
Subject: Re: pg migrator