Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling - Mailing list pgsql-hackers

From Stas Kelvich
Subject Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Date
Msg-id 962E8012-78DB-421C-AFF3-A85DE39E469C@postgrespro.ru
In response to Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
List pgsql-hackers
> On 12 Apr 2017, at 20:23, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Apr 12, 2017 at 1:18 PM, Nicolas Barbier
> <nicolas.barbier@gmail.com> wrote:
>> 2017-04-11 Robert Haas <robertmhaas@gmail.com>:
>>> If the data quality is poor (say, 50% of lines have errors) it's
>>> almost impossible to avoid runaway XID consumption.
>>
>> Yup, that seems difficult to work around with anything similar to the
>> proposed. So the docs might need to suggest not to insert a 300 GB
>> file with 50% erroneous lines :-).
>
> Yep.  But it does seem reasonably likely that someone might shoot
> themselves in the foot anyway.  Maybe we just live with that.
>

Moreover, if that file consists of one-byte lines (plus one byte for the newline
character), then XID wraparound will happen 18 times during its import =)
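
As a rough check of that figure, the arithmetic can be sketched as below. This
assumes the 300 GB file with 50% erroneous lines from the quoted discussion,
reads "GB" as GiB, treats the XID space as 2^32 values, and assumes every
failed line consumes exactly one XID. The function name is made up for
illustration.

```python
# Back-of-the-envelope estimate, not PostgreSQL code.
XID_SPACE = 2 ** 32  # assumption: full 32-bit transaction ID space


def xid_wraparounds(file_bytes, bytes_per_line, error_fraction):
    """Estimate how many times the XID counter wraps during an import,
    assuming each erroneous line consumes exactly one XID."""
    total_lines = file_bytes // bytes_per_line
    failed_lines = int(total_lines * error_fraction)
    return failed_lines / XID_SPACE


# 300 GiB file, 2 bytes per line (1 data byte + newline), 50% bad lines.
wraparounds = xid_wraparounds(300 * 2 ** 30, 2, 0.5)
print(int(wraparounds))  # prints 18 (18.75 before truncation)
```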

I think it’s reasonable to at least have something like a max_errors parameter
for COPY, set by default to 1000, for example. If the user hits that limit,
that is a good moment to emit a warning about possible XID consumption
should they choose a bigger limit.
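
To make the intended behaviour concrete, here is a minimal sketch of the
control flow in Python. It is only a model of the idea, not PostgreSQL code:
max_errors is the proposed (not existing) parameter, and
copy_with_error_limit, parse_row and TooManyErrors are hypothetical names.

```python
class TooManyErrors(Exception):
    """Raised when the number of bad rows exceeds max_errors (hypothetical)."""


def copy_with_error_limit(lines, parse_row, max_errors=1000):
    """Load rows, skipping bad ones, until more than max_errors rows fail.

    parse_row is assumed to raise ValueError for a malformed line.
    Returns (loaded_rows, error_count).
    """
    loaded = []
    errors = 0
    for lineno, line in enumerate(lines, start=1):
        try:
            loaded.append(parse_row(line))
        except ValueError:
            errors += 1
            if errors > max_errors:
                # This is the point where the user would be warned about
                # XID consumption before retrying with a bigger limit.
                raise TooManyErrors(
                    f"more than {max_errors} bad rows by line {lineno}; "
                    "raising the limit may consume many XIDs"
                )
    return loaded, errors


rows, bad = copy_with_error_limit(["1", "x", "2"], int)
```

With the default limit the call above skips the one malformed line and keeps
the other two; with max_errors=0 the same input would stop at the second line.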

However, I think it is worth a quick investigation into whether it is possible
to create a special code path for COPY in which errors don’t cancel the
transaction, at least when COPY is called outside of a transaction block.


Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




