Lee Kindness wrote:
> Tom Lane writes:
> > Lee Kindness <lkindness@csl.co.uk> writes:
> > > In an ideal world 'COPY FROM' would only be used with data output by
> > > 'COPY TO' and it would be nice and sanitised. However in some fields
> > > this often is not a possibility due to performance constraints!
> > Of course, the more bells and whistles we add to COPY, the slower it
> > will get, which rather defeats the purpose no?
>
> Indeed, but as I've mentioned in this thread in the past, the code
> path for COPY FROM already does a check against the unique index (if
> there is one) but bombs-out rather than handling it...
>
> It wouldn't add any execution time if there were no duplicates in the
> input!

I know many purists object to allowing COPY to discard invalid rows in
COPY input, but it seems we have lots of requests for this feature, with
few workarounds except pre-processing the flat file. Of course, if they
use INSERT, they will get errors that they can just ignore. I don't see
how allowing errors in COPY is any more illegal, except that COPY is one
command while multiple INSERTs are separate commands.

Seems we need to allow such a capability, if only crudely. I don't
think we can create a discard file because of the problem with remote
COPY.

I think we can allow something like:

    COPY FROM '/tmp/x' WITH ERRORS 2

meaning we will allow at most two errors and will report the error line
numbers to the user. I think this syntax clearly indicates that errors
are being accepted in the input. An alternate syntax would allow an
unlimited number of errors:

    COPY FROM '/tmp/x' WITH ERRORS

The errors can be duplicate-key (non-unique) errors, or even CHECK
constraint errors.
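
To make the proposed behaviour concrete, here is a rough sketch in Python
of how a bounded error count might work. It is only an illustration, not
backend code: the names copy_with_errors and TooManyErrors, the
tab-separated "key<TAB>value" input, and the choice to abort as soon as the
limit is exceeded are all assumptions made for the example.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Row = Tuple[int, str]


class TooManyErrors(Exception):
    """Raised when more rows are rejected than the caller allowed."""


def copy_with_errors(
    lines: Iterable[str],
    max_errors: Optional[int] = None,
    check: Callable[[Row], bool] = lambda row: True,
) -> Tuple[List[Row], List[int]]:
    """Load 'key<TAB>value' lines, skipping rows that violate a constraint.

    max_errors=None models a bare WITH ERRORS (no limit); an integer
    models WITH ERRORS n. Returns the accepted rows and the line numbers
    of the rejected rows. Malformed keys are not handled in this sketch.
    """
    seen_keys = set()          # stands in for the unique index
    accepted: List[Row] = []
    rejected: List[int] = []   # line numbers to report to the user
    for lineno, line in enumerate(lines, start=1):
        key_text, _, value = line.rstrip("\n").partition("\t")
        row = (int(key_text), value)
        # Reject on a duplicate key or a failed CHECK-style predicate.
        if row[0] in seen_keys or not check(row):
            rejected.append(lineno)
            if max_errors is not None and len(rejected) > max_errors:
                raise TooManyErrors(
                    f"more than {max_errors} errors; rejected lines: {rejected}"
                )
            continue
        seen_keys.add(row[0])
        accepted.append(row)
    return accepted, rejected


# Hypothetical input: line 3 repeats key 1, line 4 has an empty value.
data = ["1\tapple", "2\tpear", "1\tplum", "3\t", "4\tfig"]
rows, bad_lines = copy_with_errors(
    data, max_errors=2, check=lambda r: r[1] != ""
)
print(rows)
print(bad_lines)
```

With a limit of 2 the load succeeds and lines 3 and 4 are reported; with a
limit of 1 the same input raises TooManyErrors. When the input has no bad
rows, the only extra work is the uniqueness lookup that has to happen
anyway, which is the point made earlier in the thread.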

Unless I hear complaints, I will add it to TODO.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026