Re: Bulkloading using COPY - ignore duplicates? - Mailing list pgsql-hackers

From Zeugswetter Andreas SB SD
Subject Re: Bulkloading using COPY - ignore duplicates?
Date
Msg-id 46C15C39FEB2C44BA555E356FBCD6FA41EB39F@m0114.s-mxs.net
In response to Bulkloading using COPY - ignore duplicates?  (Lee Kindness <lkindness@csl.co.uk>)
List pgsql-hackers
> > Would this seem a reasonable thing to do? Does anyone rely on COPY
> > FROM causing an ERROR on duplicate input?
> 
> Yes.  This change will not be acceptable unless it's made an optional
> (and not default, IMHO, though perhaps that's negotiable) feature of
> COPY.
> 
> The implementation might be rather messy too.  I don't much care for the
> notion of a routine as low-level as bt_check_unique knowing that the
> context is or is not COPY.  We might have to do some restructuring.
> 
> > Would:
> >  WITH ON_DUPLICATE = CONTINUE|TERMINATE (or similar)
> > need to be added to the COPY command (I hope not)?
> 
> It occurs to me that skip-the-insert might be a useful option for
> INSERTs that detect a unique-key conflict, not only for COPY.  (Cf.
> the regular discussions we see on whether to do INSERT first or
> UPDATE first when the key might already exist.)  Maybe a SET variable
> that applies to all forms of insertion would be appropriate.

IMHO yes, but:
I thought the problem was that you cannot simply skip the insert,
because by that time the tuple (pointer) might already have been
successfully inserted into another index/heap, and thus this was
only sanely possible with savepoints/undo.
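
To illustrate what "savepoints/undo" would buy, here is a minimal sketch of
a bulk load that wraps each row in its own savepoint and rolls back only the
row that hits a unique-key conflict. It uses SQLite through Python's sqlite3
module purely because that ships with Python; it is not PostgreSQL, and the
table t, its columns, and the function name load_skipping_duplicates are
made up for the example.

```python
import sqlite3


def load_skipping_duplicates(conn, rows):
    """Insert rows one by one; undo only the row that violates a unique key."""
    skipped = 0
    for row in rows:
        conn.execute("SAVEPOINT one_row")
        try:
            conn.execute("INSERT INTO t (id, email) VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            # Undo whatever the failed insert touched, keep earlier rows.
            conn.execute("ROLLBACK TO SAVEPOINT one_row")
            skipped += 1
        conn.execute("RELEASE SAVEPOINT one_row")
    return skipped


# isolation_level=None: the module issues no implicit BEGIN/COMMIT,
# so the savepoints above are the only transaction control in play.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

rows = [
    (1, "a@example.org"),
    (2, "b@example.org"),
    (3, "a@example.org"),  # duplicate email
    (1, "c@example.org"),  # duplicate id
]
skipped = load_skipping_duplicates(conn, rows)
loaded = conn.execute("SELECT id, email FROM t ORDER BY id").fetchall()
print(skipped, loaded)
```

The point of the sketch is only the shape: the conflicting row is undone as
a unit, so it does not matter which index or heap structure it had already
reached when the conflict was detected.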

One idea would be to immediately mark the new tuple dead and proceed
normally?
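
A toy model of that idea, assuming a heap plus one dictionary per unique
index. Nothing here reflects the real backend structures; ToyTable and
insert_or_mark_dead are hypothetical names. The tuple goes into the heap
first, every index still gets its entry, and a conflict only flips the
tuple's dead flag instead of undoing anything.

```python
class ToyTable:
    """Toy heap with unique indexes; an illustration, not PostgreSQL internals."""

    def __init__(self, unique_columns):
        self.heap = []  # each entry: {"values": dict, "dead": bool}
        # One index per unique column: key -> list of heap positions.
        self.indexes = {col: {} for col in unique_columns}

    def insert_or_mark_dead(self, values):
        """Insert a tuple; on a unique conflict mark it dead and carry on."""
        tid = len(self.heap)
        tup = {"values": values, "dead": False}
        self.heap.append(tup)  # the heap insert has already happened

        for col, index in self.indexes.items():
            entries = index.setdefault(values[col], [])
            # Only live tuples count as duplicates; dead ones are ignored.
            conflict = any(not self.heap[t]["dead"] for t in entries)
            entries.append(tid)  # proceed normally: the entry is still added
            if conflict:
                tup["dead"] = True  # no undo, just mark the new tuple dead
        return not tup["dead"]

    def live_rows(self):
        return [t["values"] for t in self.heap if not t["dead"]]


table = ToyTable(["id", "email"])
first = table.insert_or_mark_dead({"id": 1, "email": "a"})
second = table.insert_or_mark_dead({"id": 2, "email": "a"})  # email conflict
third = table.insert_or_mark_dead({"id": 2, "email": "b"})   # id 2 is only held by a dead tuple
print(first, second, third, table.live_rows())
```

In this model the dead tuple and its index entries stay behind as garbage
that something would have to clean up later; the sketch leaves that part out.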

Andreas

