Thread: pg_dump problems

pg_dump problems

From
Martín Marqués
Date:
Hi all,

I'm having problems with pg_dump, apparently, from one of our servers.

The scenario is like this. I tried today to dump a DB from our
production server to load on our development server and got some
errors, which I show below.

Production server:

SELECT version();
                                            version
-----------------------------------------------------------------------------------------------
 PostgreSQL 8.3.14 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real
(Debian 4.3.2-1.1) 4.3.2


Development server:

SELECT version();
                                               version
-----------------------------------------------------------------------------------------------------
 PostgreSQL 8.4.7 on i486-pc-linux-gnu, compiled by GCC gcc-4.4.real
(Debian 4.4.5-10) 4.4.5, 32-bit

Anyway, the problem is in the output of the pg_dump in the production server:

$ pg_dump -V
pg_dump (PostgreSQL) 8.3.14

The first error looks like related to encoding of bytea:

ERROR:  sintaxis de entrada no válida para tipo bytea
CONTEXTO:  COPY novedades_fotos, línea 23, columna foto:
«\377\330\377\340\000\020JFIF\000\001\002\001\000H\000H\000\000\377\355\020\006Photoshop
3.0\0008BIM\...»

But more strange is what comes later:

ERROR:  la sintaxis de entrada no es válida para integer: «954516ILa
educación superior en el sector aducativo del mercosur»
CONTEXTO:  COPY objeto_datos_rep, línea 1627732, columna codiobjeto:
«954516ILa educación superior en el sector aducativo del mercosur»

Checking that register in the production DB I get this:

SELECT * from objeto_datos_rep where codigo = 2357634;
 codigo  | codiobjeto |                        descripcion
           | tcampo
---------+------------+-----------------------------------------------------------+--------
 2357634 |     954516 | La educación superior en el sector aducativo
del mercosur |      5

So, why didn't pg_dump add the tabs between 954516 and "La educación..."?

Right now I'm trying to do dump with INSERTs instead of COPY command
to see what happens.

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

Re: pg_dump problems

From
Vick Khera
Date:
2011/3/23 Martín Marqués <martin.marques@gmail.com>:
> The scenario is like this. I tried today to dump a DB from our
> production server to load on our development server and got some
> errors, which I show below.


Try using the pg_dump from the 8.4 install to create your dump file.

Also make sure your 8.3 installation doesn't have a non-default bytea
encoding setting (not sure when that setting was introduced, so there
may not even be such a setting).

Re: pg_dump problems

From
Martín Marqués
Date:
Now I'm getting weirder things. I did a pg_dump with -d option and the
dump adds strange caracters:

ERROR:  error de sintaxis en o cerca de «I»
LÍNEA 1: I^NSERT INTO objeto_datos_rep VALUES (1172339, 387943, 'Wilde...

Could it be binary corruption (pg_dump for example)?

2011/3/23 Vick Khera <vivek@khera.org>:
> 2011/3/23 Martín Marqués <martin.marques@gmail.com>:
>> The scenario is like this. I tried today to dump a DB from our
>> production server to load on our development server and got some
>> errors, which I show below.
>
>
> Try using the pg_dump from the 8.4 install to create your dump file.

I'll try this.

> Also make sure your 8.3 installation doesn't have a non-default bytea
> encoding setting (not sure when that setting was introduced, so there
> may not even be such a setting).

bytea has encoding? I thought it was plain binary data.

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

Re: pg_dump problems

From
Adrian Klaver
Date:

On Wednesday, March 23, 2011 6:57:35 am Martín Marqués wrote:

> Now I'm getting weirder things. I did a pg_dump with -d option and the

> dump adds strange caracters:

As previously stated try using the 8.4 version of pg_dump to dump the 8.3 database. FYI in 8.4+ there is no -d option anymore, it has become --inserts to reduce confusion with other commands where -d specified the database.

>

> ERROR: error de sintaxis en o cerca de «I»

> LÍNEA 1: I^NSERT INTO objeto_datos_rep VALUES (1172339, 387943, 'Wilde...

>

> Could it be binary corruption (pg_dump for example)?

>

> 2011/3/23 Vick Khera <vivek@khera.org>:

> > 2011/3/23 Martín Marqués <martin.marques@gmail.com>:

> >> The scenario is like this. I tried today to dump a DB from our

> >> production server to load on our development server and got some

> >> errors, which I show below.

> >

> > Try using the pg_dump from the 8.4 install to create your dump file.

>

> I'll try this.

>

> > Also make sure your 8.3 installation doesn't have a non-default bytea

> > encoding setting (not sure when that setting was introduced, so there

> > may not even be such a setting).

>

> bytea has encoding? I thought it was plain binary data.

I think he was talking about bytea_output, which appeared in 9.0.

In any event you may want to verify that the encodings/locales for the two databases are the same.

--

Adrian Klaver

adrian.klaver@gmail.com

Re: pg_dump problems

From
Martín Marqués
Date:
El día 23 de marzo de 2011 11:18, Adrian Klaver
<adrian.klaver@gmail.com> escribió:
> On Wednesday, March 23, 2011 6:57:35 am Martín Marqués wrote:
>
>> Now I'm getting weirder things. I did a pg_dump with -d option and the
>
>> dump adds strange caracters:
>
> As previously stated try using the 8.4 version of pg_dump to dump the 8.3
> database. FYI in 8.4+ there is no -d option anymore, it has become --inserts
> to reduce confusion with other commands where -d specified the database.


Ok, did a remote dump (from the development server which has pg 8.4)
and I got a similar error:

psql:siprebi-bu.sql:27374: ERROR:  sintaxis de entrada no
v<C3><A1>lida para tipo bytea
CONTEXT:  COPY novedades_fotos, l<C3><AD>nea 273, columna foto:
<C2><AB>\377\330\377\340\000\020JFIF\000\001\001\001\000`\000`\000\000\377\341<\310Exif\000\000II
*\000\010\0...<C2><BB>
psql:siprebi-bu.sql:3477848: ERROR:  la sintaxis de entrada no es
v<C3><A1>lida para integer: <C2><AB>q43116<C2><BB>
CONTEXT:  COPY objetos, l<C3><AD>nea 99185, columna codigo:
<C2><AB>q43116<C2><BB>

>
>> > Also make sure your 8.3 installation doesn't have a non-default bytea
>
>> > encoding setting (not sure when that setting was introduced, so there
>
>> > may not even be such a setting).
>
>>
>
>> bytea has encoding? I thought it was plain binary data.
>
> I think he was talking about bytea_output, which appeared in 9.0.
>
> In any event you may want to verify that the encodings/locales for the two
> databases are the same.
>

All configurations have UTF-8

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

Re: pg_dump problems

From
Adrian Klaver
Date:
On 03/23/2011 07:56 AM, Martín Marqués wrote:
> El día 23 de marzo de 2011 11:18, Adrian Klaver
> <adrian.klaver@gmail.com>  escribió:
>> On Wednesday, March 23, 2011 6:57:35 am Martín Marqués wrote:
>>
>>> Now I'm getting weirder things. I did a pg_dump with -d option and the
>>
>>> dump adds strange caracters:
>>
>> As previously stated try using the 8.4 version of pg_dump to dump the 8.3
>> database. FYI in 8.4+ there is no -d option anymore, it has become --inserts
>> to reduce confusion with other commands where -d specified the database.
>
>
> Ok, did a remote dump (from the development server which has pg 8.4)
> and I got a similar error:
>
> psql:siprebi-bu.sql:27374: ERROR:  sintaxis de entrada no
> v<C3><A1>lida para tipo bytea
> CONTEXT:  COPY novedades_fotos, l<C3><AD>nea 273, columna foto:
> <C2><AB>\377\330\377\340\000\020JFIF\000\001\001\001\000`\000`\000\000\377\341<\310Exif\000\000II
> *\000\010\0...<C2><BB>
> psql:siprebi-bu.sql:3477848: ERROR:  la sintaxis de entrada no es
> v<C3><A1>lida para integer:<C2><AB>q43116<C2><BB>
> CONTEXT:  COPY objetos, l<C3><AD>nea 99185, columna codigo:
> <C2><AB>q43116<C2><BB>
>

So you are feeding a plain text dump to psql correct?
Does the data in the file look correct for the affected tables?
Are there any other errors being reported?


--
Adrian Klaver
adrian.klaver@gmail.com

Re: pg_dump problems

From
Martín Marqués
Date:
El día 23 de marzo de 2011 14:03, Adrian Klaver
<adrian.klaver@gmail.com> escribió:
> On 03/23/2011 07:56 AM, Martín Marqués wrote:
>>
>>
>> Ok, did a remote dump (from the development server which has pg 8.4)
>> and I got a similar error:
>>
>> psql:siprebi-bu.sql:27374: ERROR:  sintaxis de entrada no
>> v<C3><A1>lida para tipo bytea
>> CONTEXT:  COPY novedades_fotos, l<C3><AD>nea 273, columna foto:
>>
>> <C2><AB>\377\330\377\340\000\020JFIF\000\001\001\001\000`\000`\000\000\377\341<\310Exif\000\000II
>> *\000\010\0...<C2><BB>
>> psql:siprebi-bu.sql:3477848: ERROR:  la sintaxis de entrada no es
>> v<C3><A1>lida para integer:<C2><AB>q43116<C2><BB>
>> CONTEXT:  COPY objetos, l<C3><AD>nea 99185, columna codigo:
>> <C2><AB>q43116<C2><BB>
>>
>
> So you are feeding a plain text dump to psql correct?

How can I verify this? It's already dificult to manage the file, asi
it has over 700Mb.

The second error is very curious, as it looks like pg_dump changed 1
for a 'q' in an integer field:

psql:siprebi-bu.sql:2219245: ERROR:  la sintaxis de entrada no es
válida para integer: «q721695»
CONTEXT:  COPY objeto_datos_rep, línea 991894, columna codigo: «q721695»

Looking at the dump file I see this:

1721693 657083  Eclairage...    5
1721694 657083   photometrie    5
q721695 657084  Keitz, H. A. E. 2
1721696 657084  Ligth calculations and measurements...  5
1721697 657085  La Toison, M.   2

> Does the data in the file look correct for the affected tables?
> Are there any other errors being reported?

Thats the first error. After that there are similar errores and others
related to that register not gentting inserted (foreign keys that
don't have there corresponding primary key).

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

Re: pg_dump problems

From
Vick Khera
Date:
2011/3/23 Martín Marqués <martin.marques@gmail.com>:
> The second error is very curious, as it looks like pg_dump changed 1
> for a 'q' in an integer field:
>
> psql:siprebi-bu.sql:2219245: ERROR:  la sintaxis de entrada no es
> válida para integer: «q721695»
> CONTEXT:  COPY objeto_datos_rep, línea 991894, columna codigo: «q721695»
>

That's a single-bit error.  I'm voting hardware problems.  Does your
memory have ECC?  How reliable is your disk?

Re: pg_dump problems

From
Adrian Klaver
Date:
On 03/23/2011 10:33 AM, Martín Marqués wrote:
> El día 23 de marzo de 2011 14:03, Adrian Klaver
> <adrian.klaver@gmail.com>  escribió:
>> On 03/23/2011 07:56 AM, Martín Marqués wrote:
>>>
>>>
>>> Ok, did a remote dump (from the development server which has pg 8.4)
>>> and I got a similar error:
>>>
>>> psql:siprebi-bu.sql:27374: ERROR:  sintaxis de entrada no
>>> v<C3><A1>lida para tipo bytea
>>> CONTEXT:  COPY novedades_fotos, l<C3><AD>nea 273, columna foto:
>>>
>>> <C2><AB>\377\330\377\340\000\020JFIF\000\001\001\001\000`\000`\000\000\377\341<\310Exif\000\000II
>>> *\000\010\0...<C2><BB>
>>> psql:siprebi-bu.sql:3477848: ERROR:  la sintaxis de entrada no es
>>> v<C3><A1>lida para integer:<C2><AB>q43116<C2><BB>
>>> CONTEXT:  COPY objetos, l<C3><AD>nea 99185, columna codigo:
>>> <C2><AB>q43116<C2><BB>
>>>
>>
>> So you are feeding a plain text dump to psql correct?
>
> How can I verify this? It's already dificult to manage the file, asi
> it has over 700Mb.

One way would be to dump only the table(s) that are causing the problem.

>
> The second error is very curious, as it looks like pg_dump changed 1
> for a 'q' in an integer field:

My guess is that it is combining data from two fields. In other words
the 'q' is the end of the previous field.

>
> psql:siprebi-bu.sql:2219245: ERROR:  la sintaxis de entrada no es
> válida para integer: «q721695»
> CONTEXT:  COPY objeto_datos_rep, línea 991894, columna codigo: «q721695»
>
> Looking at the dump file I see this:
>
> 1721693 657083  Eclairage...    5
> 1721694 657083   photometrie    5
> q721695 657084  Keitz, H. A. E. 2
> 1721696 657084  Ligth calculations and measurements...  5
> 1721697 657085  La Toison, M.   2
>
>> Does the data in the file look correct for the affected tables?
>> Are there any other errors being reported?
>
> Thats the first error. After that there are similar errores and others
> related to that register not gentting inserted (foreign keys that
> don't have there corresponding primary key).
>


--
Adrian Klaver
adrian.klaver@gmail.com

Re: pg_dump problems

From
Martín Marqués
Date:
2011/3/23 Vick Khera <vivek@khera.org>:
> 2011/3/23 Martín Marqués <martin.marques@gmail.com>:
>> The second error is very curious, as it looks like pg_dump changed 1
>> for a 'q' in an integer field:
>>
>> psql:siprebi-bu.sql:2219245: ERROR:  la sintaxis de entrada no es
>> válida para integer: «q721695»
>> CONTEXT:  COPY objeto_datos_rep, línea 991894, columna codigo: «q721695»
>>
>
> That's a single-bit error.  I'm voting hardware problems.  Does your
> memory have ECC?  How reliable is your disk?
>

The production server is a Compaq Prolaint, with ECC memory and RAID
by hardware.

How can I discard, or affirm that we are dealing with hardware problems?

--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador

Re: pg_dump problems

From
Martín Marqués
Date:
Just to to finish this thread:

We had a corrupted memory bank in the development server, and that
was, for some reason corrupting the data that got written to disk.

All is good now. Sorry for the noise.

El día 23 de marzo de 2011 16:28, Martín Marqués
<martin.marques@gmail.com> escribió:
> 2011/3/23 Vick Khera <vivek@khera.org>:
>> 2011/3/23 Martín Marqués <martin.marques@gmail.com>:
>>> The second error is very curious, as it looks like pg_dump changed 1
>>> for a 'q' in an integer field:
>>>
>>> psql:siprebi-bu.sql:2219245: ERROR:  la sintaxis de entrada no es
>>> válida para integer: «q721695»
>>> CONTEXT:  COPY objeto_datos_rep, línea 991894, columna codigo: «q721695»
>>>
>>
>> That's a single-bit error.  I'm voting hardware problems.  Does your
>> memory have ECC?  How reliable is your disk?
>>
>
> The production server is a Compaq Prolaint, with ECC memory and RAID
> by hardware.
>
> How can I discard, or affirm that we are dealing with hardware problems?
>
> --
> Martín Marqués
> select 'martin.marques' || '@' || 'gmail.com'
> DBA, Programador, Administrador
>



--
Martín Marqués
select 'martin.marques' || '@' || 'gmail.com'
DBA, Programador, Administrador