Thread: [GENERAL] Invalid field size

[GENERAL] Invalid field size

From
Moreno Andreo
Date:
I've implemented a backup procedure in C# with Npgsql (using COPY TO to
dump all tables into a compressed file) that has been working well for
the last 5 years (and it's still working, since this is a single,
isolated case).

OS: Windows 7
PG: 9.1.6 (I know it's EOL, but I don't think that matters here)

While restoring (with COPY FROM) I get this error:
2017-07-04 12:55:27 CEST ERROR:  invalid field size
2017-07-04 12:55:27 CEST CONTEXT:  COPY tab, line 619, column thumbnail
2017-07-04 12:55:27 CEST STATEMENT:  COPY
tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
FROM STDIN WITH BINARY

with this table definition:

CREATE TABLE public.tab
(
   cod uuid NOT NULL DEFAULT uuid_generate_v4(),
   guid uuid NOT NULL,
   data timestamp without time zone NOT NULL,
   blob bytea,
   thumbnail bytea,
   descr character varying(255) DEFAULT NULL::character varying,
   type character varying(50) DEFAULT NULL::character varying,
   url character varying(255) DEFAULT NULL::character varying,
   user character varying(255) DEFAULT NULL::character varying,
   home character varying(255) DEFAULT NULL::character varying,
   codrec uuid,
   table character varying(30) DEFAULT NULL::character varying,
   op character(1) DEFAULT NULL::bpchar,
   dagg timestamp without time zone,
   last character varying(16) DEFAULT NULL::character varying
)

As you can see I have 2 bytea fields, blob and thumbnail (the one that
seems to be giving the error), but AFAIK the former is never used, so it
should always be null.
Googling around did not help.

Any ideas? As with many errors I've seen in the past, I assume we are
trying to COPY FROM corrupted data (we often get this kind of error when
using cheap pendrives). Is that a reasonable assumption, or should I
search elsewhere?

Thanks in advance
Moreno.



Re: [GENERAL] Invalid field size

From
Glyn Astill
Date:

>On Tuesday, 4 July 2017, 12:16:57 GMT+1, Moreno Andreo <moreno.andreo@evolu-s.it> wrote:
>
>
> Any ideas? As for many error I got in the past I assume we are trying to
> COPY FROM corrupted data (when using cheap pendrives we get often this
> error). Should it be reasonable or I have to search elsewhere?

I'd start by looking at the data on line 619 of your file, perhaps you could post it?


Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 04:16 AM, Moreno Andreo wrote:
> I've implemented a backup procedure in C# with Npgsql (using COPY TO I
> dump all tables in a compressed file) that's been working well in the
> last 5 years (and it's still working, since this is a single, isolated
> case).
>
> OS: Windows 7
> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)

Are you restoring to same as above or to another machine or Postgres
instance?

>
> While restoring (with COPY FROM) I get this error:
> 2017-07-04 12:55:27 CEST ERROR:  invalid field size
> 2017-07-04 12:55:27 CEST CONTEXT:  COPY tab, line 619, column thumbnail
> 2017-07-04 12:55:27 CEST STATEMENT:  COPY
> tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
> FROM STDIN WITH BINARY
>
> with this table definition:
>
> CREATE TABLE public.tab
> (
>    cod uuid NOT NULL DEFAULT uuid_generate_v4(),
>    guid uuid NOT NULL,
>    data timestamp without time zone NOT NULL,
>    blob bytea,
>    thumbnail bytea,
>    descr character varying(255) DEFAULT NULL::character varying,
>    type character varying(50) DEFAULT NULL::character varying,
>    url character varying(255) DEFAULT NULL::character varying,
>    user character varying(255) DEFAULT NULL::character varying,
>    home character varying(255) DEFAULT NULL::character varying,
>    codrec uuid,
>    table character varying(30) DEFAULT NULL::character varying,
>    op character(1) DEFAULT NULL::bpchar,
>    dagg timestamp without time zone,
>    last character varying(16) DEFAULT NULL::character varying
> )
>
> As you can see I have 2 bytea fields, blob and thumbnail (the one it
> seems it's giving the error), but AFAIK the former is never used, so it
> should be always null.
> Googling around did not help.
>
> Any ideas? As for many error I got in the past I assume we are trying to
> COPY FROM corrupted data (when using cheap pendrives we get often this
> error). Should it be reasonable or I have to search elsewhere?
>
> Thanks in advance
> Moreno.
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 16:30, Glyn Astill wrote:

>On Tuesday, 4 July 2017, 12:16:57 GMT+1, Moreno Andreo <moreno.andreo@evolu-s.it> wrote:
>
>
> Any ideas? As for many error I got in the past I assume we are trying to
> COPY FROM corrupted data (when using cheap pendrives we get often this
> error). Should it be reasonable or I have to search elsewhere?

I'd start by looking at the data on line 619 of your file, perhaps you could post it?


Unfortunately no, because it's about 3 GB in size and binary encoded, so I have no idea what I'd be searching for....
Also, it's not on my computer but on a customer's.
If needed, I can try to contact him and extract that line from the file (though I'm not sure I can find exactly where a given line starts and ends in a binary-encoded file).

Thanks
Moreno.

Re: [GENERAL] Invalid field size

From
Tom Lane
Date:
Moreno Andreo <moreno.andreo@evolu-s.it> writes:
> I've implemented a backup procedure in C# with Npgsql (using COPY TO I
> dump all tables in a compressed file) that's been working well in the
> last 5 years (and it's still working, since this is a single, isolated
> case).
> OS: Windows 7
> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)
> [ got corrupted data with: ]
> 2017-07-04 12:55:27 CEST STATEMENT:  COPY
> tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
> FROM STDIN WITH BINARY

Pushing binary data around on Windows is always a hazardous proposition.
I'd bet something somewhere did a newline format conversion on your
data, either adding or removing CR characters.  There might not have
been any CR or LF bytes in the data fields proper, but it'd be quite
plausible for some of the length words used in binary-COPY format to
contain such bytes.

You might be better off using plain text COPY format; it can withstand
this sort of thing much better.

            regards, tom lane
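Tom's length-word scenario can be sketched concretely. The following is a small illustrative Python snippet (the values are hypothetical and the poster's actual code is C#): the 32-bit big-endian field lengths used by COPY BINARY can legitimately contain 0x0A (LF) bytes, and a text-mode transfer that rewrites LF as CRLF inserts a byte and shifts everything that follows.

```python
import struct

# A 2571-byte field produces the big-endian length word 00 00 0a 0b,
# which happens to contain an LF byte (0x0a):
length_word = struct.pack(">i", 0x0A0B)
assert b"\n" in length_word

# A text-mode transfer that rewrites LF -> CRLF inserts a byte:
mangled = length_word.replace(b"\n", b"\r\n")
assert len(mangled) == 5

# A reader consuming 4 bytes now sees a different field length,
# and every later byte of the stream is off by one:
corrupted_length = struct.unpack(">i", mangled[:4])[0]
print(hex(0x0A0B), "->", hex(corrupted_length))  # 0xa0b -> 0xd0a
```

Depending on which length word is hit, the shifted stream can surface as "invalid field size", "unexpected EOF", or silently garbled data.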


Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 16:36, Adrian Klaver wrote:
> On 07/04/2017 04:16 AM, Moreno Andreo wrote:
>> I've implemented a backup procedure in C# with Npgsql (using COPY TO
>> I dump all tables in a compressed file) that's been working well in
>> the last 5 years (and it's still working, since this is a single,
>> isolated case).
>>
>> OS: Windows 7
>> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)
>
> Are you restoring to same as above or to another machine or Postgres
> instance?
Yes, that's what this function is intended to do... users can move their
data from one computer to another that has the application installed.
(OS: always Windows; PostgreSQL: always 9.1.6.)
Unfortunately I can't restore this file on the source machine (the user
is away).



Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 16:51, Tom Lane wrote:
> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>> I've implemented a backup procedure in C# with Npgsql (using COPY TO I
>> dump all tables in a compressed file) that's been working well in the
>> last 5 years (and it's still working, since this is a single, isolated
>> case).
>> OS: Windows 7
>> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)
>> [ got corrupted data with: ]
>> 2017-07-04 12:55:27 CEST STATEMENT:  COPY
>> tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
>> FROM STDIN WITH BINARY
> Pushing binary data around on Windows is always a hazardous proposition.
> I'd bet something somewhere did a newline format conversion on your
> data, either adding or removing CR characters.  There might not have
> been any CR or LF bytes in the data fields proper, but it'd be quite
> plausible for some of the length words used in binary-COPY format to
> contain such bytes.
>
> You might be better off using plain text COPY format; it can withstand
> this sort of thing much better.
>
>             regards, tom lane
>
When we wrote this function we first used the plain COPY format, but we
were not satisfied with the file size we got (too big compared to the
data size), so we switched to BINARY (I don't remember whether there was
also a performance issue involved).
So what you are saying is, "in the last 5 years you've been extremely
lucky"? :-)

Thanks
Moreno.



Re: [GENERAL] Invalid field size

From
Tom Lane
Date:
Moreno Andreo <moreno.andreo@evolu-s.it> writes:
> On 04/07/2017 16:51, Tom Lane wrote:
>> Pushing binary data around on Windows is always a hazardous proposition.

> So what you are saying is "in the last 5 years you've been extremely
> lucky?" :-)

Yup, particularly now that you mention moving the files between machines.
What did you do that with exactly?

            regards, tom lane


Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 17:25, Tom Lane wrote:
> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>> On 04/07/2017 16:51, Tom Lane wrote:
>>> Pushing binary data around on Windows is always a hazardous proposition.
>> So what you are saying is "in the last 5 years you've been extremely
>> lucky?" :-)
> Yup, particularly now that you mention moving the files between machines.
> What did you do that with exactly?
Trying to answer your question (I hope I understood correctly; English
is not my mother tongue):
What I do, given a database, is COPY every table to a file and then pack
the files into a single zip (except this table, which is excluded from
compression because of its size and the resulting compression time), so
my backup is made up of 2 files: one with the "normal" data and one with
the result of COPYing this table to a file.

A question that comes up while I'm writing: doesn't pg_dump with the
custom format use COPY with binary format?

Thanks
Moreno



Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 08:19 AM, Moreno Andreo wrote:
> On 04/07/2017 16:51, Tom Lane wrote:
>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>> I've implemented a backup procedure in C# with Npgsql (using COPY TO I
>>> dump all tables in a compressed file) that's been working well in the
>>> last 5 years (and it's still working, since this is a single, isolated
>>> case).
>>> OS: Windows 7
>>> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)
>>> [ got corrupted data with: ]
>>> 2017-07-04 12:55:27 CEST STATEMENT:  COPY
>>> tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
>>>
>>> FROM STDIN WITH BINARY
>> Pushing binary data around on Windows is always a hazardous proposition.
>> I'd bet something somewhere did a newline format conversion on your
>> data, either adding or removing CR characters.  There might not have
>> been any CR or LF bytes in the data fields proper, but it'd be quite
>> plausible for some of the length words used in binary-COPY format to
>> contain such bytes.
>>
>> You might be better off using plain text COPY format; it can withstand
>> this sort of thing much better.
>>
>>             regards, tom lane
>>
> When we wrote this function, we first used plain COPY format, but we
> were not satisfied by the file size we got (too big compared to data
> size), so we switched to BINARY (I don't remember if there was also some
> performance matter involved).
> So what you are saying is "in the last 5 years you've been extremely
> lucky?" :-)

Your original post went back and forth on whether you were lucky in the
past:

"... that's been working well in the last 5 years (and it's still
working, since this is a single, isolated case)"

"As for many error I got in the past I assume we are trying to COPY FROM
corrupted data (when using cheap pendrives we get often this error)."


>
> Thanks
> Moreno.
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 08:37 AM, Moreno Andreo wrote:
> On 04/07/2017 17:25, Tom Lane wrote:
>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>> On 04/07/2017 16:51, Tom Lane wrote:
>>>> Pushing binary data around on Windows is always a hazardous
>>>> proposition.
>>> So what you are saying is "in the last 5 years you've been extremely
>>> lucky?" :-)
>> Yup, particularly now that you mention moving the files between machines.
>> What did you do that with exactly?
> Trying to answer your question (I hope I understood correctly, English
> is not my mother tongue)

I believe what Tom was asking is what mechanism/tools do you use to move
the 2 files below from one machine to another?

> What I do is, given a database, to COPY every table to a file, and then
> pack them up in one with a zip (except this table, that's been excluded
> from compression for its size and consequent compression time), so my
> backup is made up by 2 files, one with "normal data" and one with the
> result of COPYing this table to a file.
>
> A question that comes while I'm writing: but pg_dump with custom format
> is not using COPY with binary format?
>
> Thanks
> Moreno
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Invalid field size

From
"Daniel Verite"
Date:
    Moreno Andreo wrote:

> As you can see I have 2 bytea fields, blob and thumbnail (the one it
> seems it's giving the error), but AFAIK the former is never used, so it
> should be always null.
> Googling around did not help.

In COPY BINARY, NULL is represented as -1 (all bits set)
in the 32-bit length word for the corresponding field.

So if any bit of this word except the sign bit were flipped
by a hardware error, you'd get the error you mentioned, because the
resulting length would come out negative. From the source code:

  if (!CopyGetInt32(cstate, &fld_size))
      ereport(ERROR,
              (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
               errmsg("unexpected EOF in COPY data")));
  if (fld_size == -1)
  {
      *isnull = true;
      return ReceiveFunctionCall(flinfo, NULL, typioparam, typmod);
  }
  if (fld_size < 0)
      ereport(ERROR,
              (errcode(ERRCODE_BAD_COPY_FILE_FORMAT),
               errmsg("invalid field size")));

Despite the auto-correction mechanisms in place in modern drives [1],
the probability of a non-correctable error is not negligible,
so it's plausible that that's what you're experiencing.

If that's the case and only one byte is wrong in the whole file, you
could theoretically fix it by finding the offset of the offending length
word and patching the wrong byte back to 0xff.


[1]
https://en.wikipedia.org/wiki/Hard_disk_drive#Error_rates_and_handling
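The bit-flip scenario can be sketched in Python (illustrative only; the choice of which bit to flip is arbitrary): the NULL marker is the length word -1, i.e. 0xFFFFFFFF, so flipping any bit other than the sign bit yields a length that is negative but no longer -1, which is exactly the `fld_size < 0` branch in the C source quoted above.

```python
import struct

NULL_FIELD = struct.pack(">i", -1)      # b'\xff\xff\xff\xff', the NULL marker
assert NULL_FIELD == b"\xff\xff\xff\xff"

# Flip one non-sign bit in the last byte, as a media error might:
damaged = NULL_FIELD[:3] + bytes([NULL_FIELD[3] ^ 0x10])
fld_size = struct.unpack(">i", damaged)[0]

# The sign bit is still set, so the length is negative but not -1,
# which triggers the "invalid field size" error:
assert fld_size < 0 and fld_size != -1
print(fld_size)  # -17

# The suggested repair: patch the damaged byte back to 0xff
repaired = damaged[:3] + b"\xff"
assert struct.unpack(">i", repaired)[0] == -1
```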


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 17:39, Adrian Klaver wrote:
> On 07/04/2017 08:19 AM, Moreno Andreo wrote:
>> On 04/07/2017 16:51, Tom Lane wrote:
>>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>>> I've implemented a backup procedure in C# with Npgsql (using COPY TO I
>>>> dump all tables in a compressed file) that's been working well in the
>>>> last 5 years (and it's still working, since this is a single, isolated
>>>> case).
>>>> OS: Windows 7
>>>> PG: 9.1.6 (I know, it's EOL, but I think it's not matter here)
>>>> [ got corrupted data with: ]
>>>> 2017-07-04 12:55:27 CEST STATEMENT:  COPY
>>>> tab(cod,guid,data,blob,thumbnail,descr,type,url,user,home,codrec,table,op,dagg,last)
>>>>
>>>> FROM STDIN WITH BINARY
>>> Pushing binary data around on Windows is always a hazardous
>>> proposition.
>>> I'd bet something somewhere did a newline format conversion on your
>>> data, either adding or removing CR characters.  There might not have
>>> been any CR or LF bytes in the data fields proper, but it'd be quite
>>> plausible for some of the length words used in binary-COPY format to
>>> contain such bytes.
>>>
>>> You might be better off using plain text COPY format; it can withstand
>>> this sort of thing much better.
>>>
>>>             regards, tom lane
>>>
>> When we wrote this function, we first used plain COPY format, but we
>> were not satisfied by the file size we got (too big compared to data
>> size), so we switched to BINARY (I don't remember if there was also
>> some performance matter involved).
>> So what you are saying is "in the last 5 years you've been extremely
>> lucky?" :-)
>
> Your original post went back and forth on whether you where lucky in
> the past:
>
> "... that's been working well in the last 5 years (and it's still
> working, since this is a single, isolated case)"
>
> "As for many error I got in the past I assume we are trying to COPY
> FROM corrupted data (when using cheap pendrives we get often this
> error)."
The bunch of errors I mentioned there relates to file management (issues
with file copying or unzipping); sometimes I had errors like
"unrecognized Unicode character: 0xFF", and making a new backup always
resolved the error.
This is the very first time we've had this kind of error.
If I had the source machine I'd try to make a new backup...




Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 17:42, Adrian Klaver wrote:
> On 07/04/2017 08:37 AM, Moreno Andreo wrote:
>> On 04/07/2017 17:25, Tom Lane wrote:
>>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>>> On 04/07/2017 16:51, Tom Lane wrote:
>>>>> Pushing binary data around on Windows is always a hazardous
>>>>> proposition.
>>>> So what you are saying is "in the last 5 years you've been extremely
>>>> lucky?" :-)
>>> Yup, particularly now that you mention moving the files between
>>> machines.
>>> What did you do that with exactly?
>> Trying to answer your question (I hope I understood correctly,
>> English is not my mother tongue)
>
> I believe what Tom was asking is what mechanism/tools do you use to
> move the 2 files below from one machine to another?
Sorry :-)
Files are simply copied with Windows Explorer, or the backup is written
directly to the external device if the user requests it.



Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 17:42, Daniel Verite wrote:
>     Moreno Andreo wrote:
>
>> As you can see I have 2 bytea fields, blob and thumbnail (the one it
>> seems it's giving the error), but AFAIK the former is never used, so it
>> should be always null.
>> Googling around did not help.
>
> Despite the auto-correction mechanisms in place in modern drives [1],
> the probability of a non-correctable error is not negligible,
> so it's plausible that it's what you're experiencing.
>
> If that's the case and only byte is wrong in the whole file, you could
> theorically fix it by finding the offset of the offending length and patch
> the wrong byte with a 0xff value.
>
>
> [1]
> https://en.wikipedia.org/wiki/Hard_disk_drive#Error_rates_and_handling
>
So if that's the case (a hardware error), making a new backup and
restoring it should reproduce the error, right?
When the user comes back from holidays I'll call him and check.

Thanks
Moreno.



Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 08:37 AM, Moreno Andreo wrote:
> On 04/07/2017 17:25, Tom Lane wrote:
>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>> On 04/07/2017 16:51, Tom Lane wrote:
>>>> Pushing binary data around on Windows is always a hazardous
>>>> proposition.
>>> So what you are saying is "in the last 5 years you've been extremely
>>> lucky?" :-)
>> Yup, particularly now that you mention moving the files between machines.
>> What did you do that with exactly?
> Trying to answer your question (I hope I understood correctly, English
> is not my mother tongue)
> What I do is, given a database, to COPY every table to a file, and then
> pack them up in one with a zip (except this table, that's been excluded
> from compression for its size and consequent compression time), so my
> backup is made up by 2 files, one with "normal data" and one with the
> result of COPYing this table to a file.
>
> A question that comes while I'm writing: but pg_dump with custom format
> is not using COPY with binary format?

A quick look through the source suggests to me that it is not using
BINARY. Then again, I am not a C programmer, so take that into account.
It stands to reason that it would not use BINARY, since
pg_dump/pg_restore is supposed to be portable across OSes, machine
architectures and, to a certain degree, Postgres versions. COPY WITH
BINARY would work against that:

https://www.postgresql.org/docs/9.1/static/sql-copy.html

"The binary format option causes all data to be stored/read as binary
format rather than as text. It is somewhat faster than the text and CSV
formats, but a binary-format file is less portable across machine
architectures and PostgreSQL versions."

>
> Thanks
> Moreno
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [SPAM] Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 09:02 AM, Moreno Andreo wrote:
> On 04/07/2017 17:39, Adrian Klaver wrote:

>>> So what you are saying is "in the last 5 years you've been extremely
>>> lucky?" :-)
>>
>> Your original post went back and forth on whether you where lucky in
>> the past:
>>
>> "... that's been working well in the last 5 years (and it's still
>> working, since this is a single, isolated case)"
>>
>> "As for many error I got in the past I assume we are trying to COPY
>> FROM corrupted data (when using cheap pendrives we get often this
>> error)."
> The bunch of errors I mention here is related to file management (issues
> with file copying or unzipping), sometines I had errors like
> "unrecognized Unicode character: 0xFF", and making a new backup always
> resolved the error.
> This is the very first time we have this kind of error.

One could say your current error is just a variation of the above.

> If I had the source machine I'd try to make a new backup...

That would be a useful data point, though given the above, if it
succeeds it mainly proves Tom's point: that using BINARY in your
situation is a hit-and-miss exercise.

Have you tried doing something like this?:

pg_dump -d production -U postgres -t projection  -a > proj_txt.sql

pg_dump -d production -U postgres -t projection  -a  -Z 5 > proj_txt.sql.gz


ls -lh proj_txt.sql*
-rw-r--r-- 1 aklaver users 3.2M Jul  4 09:23 proj_txt.sql
-rw-r--r-- 1 aklaver users 560K Jul  4 09:23 proj_txt.sql.gz
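The size trade-off can also be sketched without a database at hand. This is a self-contained illustration with synthetic data (not the poster's tables): in text format a bytea value is emitted as hex escapes, roughly doubling it on disk, but since each hex character carries only 4 bits of entropy, gzip claws most of that overhead back.

```python
import binascii
import gzip
import random

random.seed(0)
raw = bytes(random.randrange(256) for _ in range(200_000))  # stand-in bytea value

# Text-format COPY emits bytea as "\x...." hex escapes, ~2x the binary size:
text_repr = b"\\x" + binascii.hexlify(raw)
assert len(text_repr) > 2 * len(raw)

# Compressing the text dump recovers most of the overhead:
gz = gzip.compress(text_repr)
print(len(raw), len(text_repr), len(gz))
assert len(gz) < 0.65 * len(text_repr)
```

So a text dump piped through compression (as with the -Z flag above) need not end up dramatically larger than a binary file, while staying robust to newline translation.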


>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Invalid field size

From
"Daniel Verite"
Date:
    Moreno Andreo wrote:

> So if it's the case (hardware error), recalling a new backup should
> reproduce the error, right?

If the error happened when writing the file, I wouldn't expect
any other backup to have the same error (assuming an error in
the bit-flip category).

And if it was a transient read error, a second run of the import
could even work. I can't quite tell from your posts whether that
particular file was imported and failed only once, or retried
and failed again.

Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 18:55, Daniel Verite wrote:
>   I don't quite see from your posts whether that
> particular file to import was tried and failed only once or retried
> and failed again.
Only once, and until the user returns from holidays I won't be able to
reproduce it.

Cheers
Moreno



Re: [SPAM] Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 18:25, Adrian Klaver wrote:
> On 07/04/2017 09:02 AM, Moreno Andreo wrote:
>> On 04/07/2017 17:39, Adrian Klaver wrote:
>
>>>> So what you are saying is "in the last 5 years you've been
>>>> extremely lucky?" :-)
>>>
>>> Your original post went back and forth on whether you where lucky in
>>> the past:
>>>
>>> "... that's been working well in the last 5 years (and it's still
>>> working, since this is a single, isolated case)"
>>>
>>> "As for many error I got in the past I assume we are trying to COPY
>>> FROM corrupted data (when using cheap pendrives we get often this
>>> error)."
>> The bunch of errors I mention here is related to file management
>> (issues with file copying or unzipping), sometines I had errors like
>> "unrecognized Unicode character: 0xFF", and making a new backup
>> always resolved the error.
>> This is the very first time we have this kind of error.
>
> One could say your current error is just a variation of the above.
On the basis of what Daniel wrote, I think you're absolutely right.
>
>> If I had the source machine I'd try to make a new backup...
>
> That would be a useful data point, though given the above if it
> succeeds it mainly proves Tom's point, that using BINARY in your
> situation is a hit and miss exercise.
>
> Have you tried doing something like?:
>
> pg_dump -d production -U postgres -t projection  -a > proj_txt.sql
>
> pg_dump -d production -U postgres -t projection  -a  -Z 5 >
> proj_txt.sql.gz
>
>
> l  -h proj_txt.sql*
> -rw-r--r-- 1 aklaver users 3.2M Jul  4 09:23 proj_txt.sql
> -rw-r--r-- 1 aklaver users 560K Jul  4 09:23 proj_txt.sql.gz

So the hint is to abandon manual COPY and let pg_dump do the hard work?
That means rewriting the whole backup logic, but if it has to be done,
I'll manage to do it.

Thanks!
Moreno




Re: [SPAM] Re: [SPAM] Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 10:13 AM, Moreno Andreo wrote:
> On 04/07/2017 18:25, Adrian Klaver wrote:
>> On 07/04/2017 09:02 AM, Moreno Andreo wrote:
>>> On 04/07/2017 17:39, Adrian Klaver wrote:
>>
>>>>> So what you are saying is "in the last 5 years you've been
>>>>> extremely lucky?" :-)
>>>>
>>>> Your original post went back and forth on whether you where lucky in
>>>> the past:
>>>>
>>>> "... that's been working well in the last 5 years (and it's still
>>>> working, since this is a single, isolated case)"
>>>>
>>>> "As for many error I got in the past I assume we are trying to COPY
>>>> FROM corrupted data (when using cheap pendrives we get often this
>>>> error)."
>>> The bunch of errors I mention here is related to file management
>>> (issues with file copying or unzipping), sometines I had errors like
>>> "unrecognized Unicode character: 0xFF", and making a new backup
>>> always resolved the error.
>>> This is the very first time we have this kind of error.
>>
>> One could say your current error is just a variation of the above.
> On the basis of what Daniel wrote, I think you're absolutely right.
>>
>>> If I had the source machine I'd try to make a new backup...
>>
>> That would be a useful data point, though given the above if it
>> succeeds it mainly proves Tom's point, that using BINARY in your
>> situation is a hit and miss exercise.
>>
>> Have you tried doing something like?:
>>
>> pg_dump -d production -U postgres -t projection  -a > proj_txt.sql
>>
>> pg_dump -d production -U postgres -t projection  -a  -Z 5 >
>> proj_txt.sql.gz
>>
>>
>> l  -h proj_txt.sql*
>> -rw-r--r-- 1 aklaver users 3.2M Jul  4 09:23 proj_txt.sql
>> -rw-r--r-- 1 aklaver users 560K Jul  4 09:23 proj_txt.sql.gz
>
> So the hint is to abandon manual COPY and let pg_dump do the hard work?

Not necessarily; you could modify your existing code to use the text
COPY format.


> It means rewriting the whole backup logic, but if it has to be done,
> I'll manage to do it.
>
> Thanks!
> Moreno
>
>
>
>


--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [SPAM] Re: [SPAM] Re: [GENERAL] Invalid field size

From
Tom Lane
Date:
Moreno Andreo <moreno.andreo@evolu-s.it> writes:
> So the hint is to abandon manual COPY and let pg_dump do the hard work?

If it is a newline-conversion problem, compressed pg_dump archives would
be just as subject to corruption as your binary COPY file is.  I'd say
the hint is to be more careful about how you do the cross-machine file
transfers.  I suspect what is really happening is you're not always
doing that the same way, and that some of the methods allow a newline
conversion to happen to the file while others don't.

            regards, tom lane


Re: [SPAM] Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 19:28, Tom Lane wrote:
> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>> So the hint is to abandon manual COPY and let pg_dump do the hard work?
> If it is a newline-conversion problem, compressed pg_dump archives would
> be just as subject to corruption as your binary COPY file is.  I'd say
> the hint is to be more careful about how you do the cross-machine file
> transfers.  I suspect what is really happening is you're not always
> doing that the same way, and that some of the methods allow a newline
> conversion to happen to the file while others don't.
>
>             regards, tom lane
>
>
Well, I have no control over how the user transfers files back and forth
between machines.
Imagine you have a zip file in which you back up your daily work. After
you've made your backup, you put it on a pendrive and go home. At home,
you copy the file to your computer and decompress it.
Our application works exactly the same way, except that it deals not
with raw files but with PostgreSQL data.
So I don't know how a user handles his backup files once he has made his
backup...



Re: [GENERAL] Invalid field size

From
"Daniel Verite"
Date:
    Tom Lane wrote:

> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
> > So the hint is to abandon manual COPY and let pg_dump do the hard work?
>
> If it is a newline-conversion problem, compressed pg_dump archives would
> be just as subject to corruption as your binary COPY file is.

It's mentioned in [1] that the signature at the beginning of these files
embeds a CRLF to detect this newline-conversion problem early on,
so I would expect COPY IN to stumble on a corrupted signature
and abort earlier in the process, if that conversion had occurred.
Instead the report says it fails after a number of tuples:

> ERROR:  invalid field size
> CONTEXT:  COPY tab, line 619, column thumbnail

[1] https://www.postgresql.org/docs/current/static/sql-copy.html

The file header consists of 15 bytes of fixed fields, followed by a
variable-length header extension area. The fixed fields are:

Signature
  11-byte sequence PGCOPY\n\377\r\n\0 — note that the zero byte is a
  required part of the signature. (The signature is designed to allow
  easy identification of files that have been munged by a
  non-8-bit-clean transfer. This signature will be changed by
  end-of-line-translation filters, dropped zero bytes, dropped high
  bits, or parity changes.)
  ...
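The early-detection property described above can be sketched in Python (illustrative only): any end-of-line translation, in either direction, changes the 11-byte signature, so a newline-mangled binary dump would fail within the first few bytes rather than at line 619.

```python
SIGNATURE = b"PGCOPY\n\xff\r\n\x00"  # 11-byte COPY BINARY file signature

# Simulate the two common newline translations:
mangled_write = SIGNATURE.replace(b"\n", b"\r\n")  # LF -> CRLF on a text-mode write
mangled_read = SIGNATURE.replace(b"\r\n", b"\n")   # CRLF -> LF on a text-mode read

# Either direction breaks the signature, so COPY FROM would abort
# immediately with "COPY file signature not recognized":
assert mangled_write != SIGNATURE
assert mangled_read != SIGNATURE
print(len(SIGNATURE), len(mangled_write), len(mangled_read))  # 11 13 10
```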


Best regards,
--
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite


Re: [GENERAL] Invalid field size

From
Tom Lane
Date:
"Daniel Verite" <daniel@manitou-mail.org> writes:
>     Tom Lane wrote:
>> If it is a newline-conversion problem, compressed pg_dump archives would
>> be just as subject to corruption as your binary COPY file is.

> It's mentioned in [1] that the signature at the beginning of these files
> embed a CRLF to detect this newline-conversion problem early on,

Oh, I'd forgotten about that.

> so I would expect COPY IN to stumble on a corrupted signature
> and abort earlier in the process, if that conversion occurred.

Right.  I'm probably barking up the wrong tree, then.

            regards, tom lane


Re: [SPAM] Re: [SPAM] Re: Invalid field size

From
Adrian Klaver
Date:
On 07/04/2017 10:57 AM, Moreno Andreo wrote:
> On 04/07/2017 19:28, Tom Lane wrote:
>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>> So the hint is to abandon manual COPY and let pg_dump do the hard work?
>> If it is a newline-conversion problem, compressed pg_dump archives would
>> be just as subject to corruption as your binary COPY file is.  I'd say
>> the hint is to be more careful about how you do the cross-machine file
>> transfers.  I suspect what is really happening is you're not always
>> doing that the same way, and that some of the methods allow a newline
>> conversion to happen to the file while others don't.
>>
>>             regards, tom lane
>>
>>
> Well, I have no control on how the user transfers back and forth among
> machines.
> Imagine you have a zip file where you backup your daily work. After
> you've done your backup, you put it on a pendrive and go home. When
> you're at home you copy this file to your computer and decompress it.
> Our application works exactly the same way, except that it does not work
> with raw files, but with PostgreSQL data.
> So I don't know how a user handles its backup files once he has made his
> backup...

Well that leads to four observations:

1) How the user handles their backup files is something that might need
to be known.

2) By using your own backup procedure you have taken ownership of any
resultant bugs :( The list might still be able to help with those,
if you can share the code you use to create the backups.

3) 1) and 2) could be moot if Daniel's hardware corruption theory is
correct.

4) This is probably not going to be solved until you are able to access
the actual file(s) in question.



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 04/07/2017 20:51, Daniel Verite wrote:
>     Tom Lane wrote:
>
>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>> So the hint is to abandon manual COPY and let pg_dump do the hard work?
>> If it is a newline-conversion problem, compressed pg_dump archives would
>> be just as subject to corruption as your binary COPY file is.
> It's mentioned in [1] that the signature at the beginning of these files
> embeds a CRLF to detect this newline-conversion problem early on,
> so I would expect COPY IN to stumble on a corrupted signature
> and abort earlier in the process, if that conversion occurred.
> Instead the report says it fails after a number of tuples:
Given what you said, can I assume it's a file-transfer or a
hardware-related (pendrive) problem?



Re: [GENERAL] Invalid field size

From
Adrian Klaver
Date:
On 07/05/2017 01:05 AM, Moreno Andreo wrote:
> On 04/07/2017 20:51, Daniel Verite wrote:
>>     Tom Lane wrote:
>>
>>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>>> So the hint is to abandon manual COPY and let pg_dump do the hard work?
>>> If it is a newline-conversion problem, compressed pg_dump archives would
>>> be just as subject to corruption as your binary COPY file is.
>> It's mentioned in [1] that the signature at the beginning of these files
>> embeds a CRLF to detect this newline-conversion problem early on,
>> so I would expect COPY IN to stumble on a corrupted signature
>> and abort earlier in the process, if that conversion occurred.
>> Instead the report says it fails after a number of tuples:
> Given what you said, can I assume it's a file-transfer or a
> hardware-related (pendrive) problem?

Daniel also mentioned the hard drive as a possible source of error. I
would say monitoring where and when the issues appear may help with
determining the source.



--
Adrian Klaver
adrian.klaver@aklaver.com


Re: [SPAM] Re: [GENERAL] Invalid field size

From
Moreno Andreo
Date:
On 05/07/2017 16:33, Adrian Klaver wrote:
> On 07/05/2017 01:05 AM, Moreno Andreo wrote:
>> On 04/07/2017 20:51, Daniel Verite wrote:
>>>     Tom Lane wrote:
>>>
>>>> Moreno Andreo <moreno.andreo@evolu-s.it> writes:
>>>>> So the hint is to abandon manual COPY and let pg_dump do the hard
>>>>> work?
>>>> If it is a newline-conversion problem, compressed pg_dump archives
>>>> would
>>>> be just as subject to corruption as your binary COPY file is.
>>> It's mentioned in [1] that the signature at the beginning of these
>>> files
>>> embeds a CRLF to detect this newline-conversion problem early on,
>>> so I would expect COPY IN to stumble on a corrupted signature
>>> and abort earlier in the process, if that conversion occurred.
>>> Instead the report says it fails after a number of tuples:
>> Given what you said, can I assume it's a file-transfer or a
>> hardware-related (pendrive) problem?
>
> Daniel also mentioned the harddrive as a possible source of error. I
> would say monitoring where and when the issues appear may help with
> determining the source.
Yeah, trying to restore the same file on another machine should help
determine the possible culprit.