Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling - Mailing list pgsql-hackers

From Craig Ringer
Subject Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling
Date
Msg-id CAMsr+YH5mYTM_C-WrT10=G1HEVE9Xsgig=WeFKqrNJ8+-ChoHg@mail.gmail.com
In response to Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: [HACKERS] GSOC'17 project introduction: Parallel COPY execution with errors handling  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 3 March 2018 at 13:08, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
On 1/22/18 21:33, Craig Ringer wrote:
> We don't have much in the way of rules about what input functions can or
> cannot do, so you can't assume much about their behaviour and what must
> / must not be cleaned up. Nor can you just reset the state in a heavy
> handed manner like (say) plpgsql does.

I think one thing to try would be to define a special kind of exception
that can safely be caught and ignored.  Then, input functions can
communicate benign parse errors by doing their own cleanup first, then
throwing this exception, and then the COPY subsystem can deal with it.

That makes sense. We'd only use the error code range in question when it was safe to catch without re-throwing, and we'd have to enforce rules around using a specific memory context. Of course no LWLocks could be held, but IIRC that's true when throwing anyway, unless you plan to proc_exit() in your handler.
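
To make the shape of that idea concrete, here is a rough, standalone toy model in plain C. It is not PostgreSQL code: it uses setjmp/longjmp instead of PG_TRY/PG_CATCH and ereport(), and every name in it (ErrClass, ERR_BENIGN_INPUT, throw_error, toy_int_in, try_convert, toy_copy) is made up for illustration. It only shows the contract being proposed: the input function cleans up after itself, then throws an error in a "benign" class, and the COPY-like loop swallows that class and re-throws everything else.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <setjmp.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical error classes; these are NOT real PostgreSQL error codes. */
typedef enum { ERR_NONE = 0, ERR_BENIGN_INPUT = 1, ERR_HARD = 2 } ErrClass;

/* Innermost active handler, standing in for the backend's exception stack. */
static jmp_buf *handler_stack = NULL;

void throw_error(ErrClass cls)
{
    if (handler_stack == NULL)
        abort();                    /* nobody is listening: give up */
    longjmp(*handler_stack, (int) cls);
}

/*
 * Toy input function. The contract under discussion: release everything
 * you allocated FIRST, and only then throw the benign error class.
 */
int toy_int_in(const char *text)
{
    char *scratch = malloc(strlen(text) + 1);
    char *end;
    long  v;
    int   bad;

    if (scratch == NULL)
        throw_error(ERR_HARD);      /* not benign, must not be ignored */
    strcpy(scratch, text);

    errno = 0;
    v = strtol(scratch, &end, 10);
    bad = (end == scratch || *end != '\0' || errno == ERANGE ||
           v < INT_MIN || v > INT_MAX);

    free(scratch);                  /* own cleanup happens before throwing */
    if (bad)
        throw_error(ERR_BENIGN_INPUT);
    return (int) v;
}

/* Convert one field; swallow only the benign class, re-throw the rest. */
ErrClass try_convert(const char *text, int *out)
{
    jmp_buf  env;
    jmp_buf *saved = handler_stack;
    ErrClass result = ERR_HARD;

    handler_stack = &env;
    switch (setjmp(env))
    {
        case 0:
            *out = toy_int_in(text);
            result = ERR_NONE;
            break;
        case ERR_BENIGN_INPUT:
            result = ERR_BENIGN_INPUT;   /* safe to catch without re-throw */
            break;
        default:
            handler_stack = saved;
            throw_error(ERR_HARD);       /* anything else propagates */
    }
    handler_stack = saved;
    return result;
}

/* COPY-like loop: load the rows that parse, count the ones that do not. */
size_t toy_copy(const char *const *rows, size_t nrows, int *out, size_t *skipped)
{
    size_t loaded = 0;

    assert(rows != NULL && out != NULL && skipped != NULL);
    *skipped = 0;
    for (size_t i = 0; i < nrows; i++)
    {
        int value;

        if (try_convert(rows[i], &value) == ERR_NONE)
            out[loaded++] = value;
        else
            (*skipped)++;
    }
    return loaded;
}
```

Calling toy_copy() on the five rows "1", "oops", "42", "7x", "-3" loads 1, 42 and -3 and reports two skipped rows. The part that is hard in the real backend is exactly what this toy sidesteps: here the cleanup rule is trivially checkable, whereas real input functions would need an enforced memory context discipline before the catch-and-ignore path could be trusted.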

People will immediately ask for it to handle RI errors too, so something similar would be needed there. But frankly, Pg's RI handling for bulk loading desperately needs a major change in how it works to make it efficient anyway; the current model of individual row triggers is horrible for bulk load performance.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
