Re: Bulkloading using COPY - ignore duplicates? - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Bulkloading using COPY - ignore duplicates?
Date
Msg-id 200201022340.g02NeKd12500@candle.pha.pa.us
Whole thread Raw
In response to Re: Bulkloading using COPY - ignore duplicates?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Bulkloading using COPY - ignore duplicates?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I think we can allow something like:
> > COPY FROM '/tmp/x' WITH ERRORS 2
> 
> > Yes, I realize we need subtransactions or something, but we should add
> > it to the TODO list if it is a valid request, right?
> 
> Well, I don't like that particular API in any case.  Why would I think
> that 2 errors are okay and 3 are not, if I'm loading a
> many-thousand-line COPY file?  Wouldn't it matter *what* the errors

I threw the count idea in as a possible compromise.  :-)

> are, at least as much as how many there are?  "Discard duplicate rows"
> is one thing, but "ignore bogus data" (eg, unrecognizable timestamps)
> is not the same animal at all.

Yes, when we have error codes, it would be nice to specify certain
errors to ignore.

> As someone already remarked, the correct, useful form of such a feature
> is to echo the rejected lines to some sort of output file that I can
> look at afterwards.  How many errors there are is not the issue.

How about for TODO:
* Allow COPY to report error lines and continue; requires nested
  transactions; optionally allow error codes to be specified
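
The reject-file behavior Tom describes can be emulated client-side today by
filtering the COPY input before it reaches the server. A minimal sketch in
Python (the key column, delimiter, and function name are assumptions for
illustration, not anything COPY provides):

```python
# Partition COPY-format input lines into unique rows and rejected
# duplicates, so the rejects can be written to a file for later review.
# Assumes tab-delimited lines with the unique key in one column.

def split_copy_input(lines, key_index=0, delimiter="\t"):
    """Return (unique_rows, duplicate_rows) from COPY-format lines."""
    seen = set()
    good, rejects = [], []
    for line in lines:
        key = line.rstrip("\n").split(delimiter)[key_index]
        if key in seen:
            rejects.append(line)   # echo to a reject file afterwards
        else:
            seen.add(key)
            good.append(line)
    return good, rejects

good, rejects = split_copy_input(["1\ta\n", "2\tb\n", "1\tc\n"])
# feed `good` to COPY; write `rejects` where the user can inspect them
```

This only handles the "discard duplicate rows" case, of course; bogus data
(unparseable timestamps and the like) would still abort the load.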


--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

