Re: Bulkloading using COPY - ignore duplicates? - Mailing list pgsql-hackers

From Daniel Kalchev
Subject Re: Bulkloading using COPY - ignore duplicates?
Date
Msg-id 200201040807.KAA29714@dcave.digsys.bg
In response to Re: Bulkloading using COPY - ignore duplicates?  ("Vadim Mikheev" <vmikheev@sectorbase.com>)
List pgsql-hackers
>>>"Vadim Mikheev" said:> 1. I prefer Oracle' (and others, I believe) way - put statement(s) in PL> block and define>
forwhat exceptions (errors) what actions should be taken (ie IGNORE for> NON_UNIQ_KEY> error, etc).
 

Some people prefer 'pure' SQL. Anyway, it can be argued which is worse - the
use of a non-SQL language, or the use of an extended SQL syntax. I guess the
SQL standard does not provide for such functionality?
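
Just to illustrate the kind of PL block Vadim means, something like this in
Oracle PL/SQL (the table and column names here are made up):

    BEGIN
       INSERT INTO target_table (id, val) VALUES (:id, :val);
    EXCEPTION
       WHEN DUP_VAL_ON_INDEX THEN
          NULL;  -- ignore the duplicate-key error and go on
    END;
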
> 2. For INSERT ... SELECT statement one can put DISTINCT in select' target
> list.

With this construct, you are effectively copying rows from one table to 
another - or constructing rows from various sources (constants, other tables, 
etc.) and inserting them into the table. If the target table has unique indexes 
(or constraints), and some of the rows returned by the SELECT violate those 
constraints - you are supposed to get errors - and unfortunately the entire 
INSERT is aborted. I fail to see how DISTINCT can help here... Perhaps it is 
possible to include a check for already existing tuples in the destination 
table in the select - but this will significantly increase the runtime, 
especially when the destination table is huge.
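
Such a check could look roughly like this (table and column names are
hypothetical) - the extra anti-join against the destination table is what
makes it expensive on a huge table:

    INSERT INTO target (id, val)
    SELECT s.id, s.val
      FROM source s
     WHERE NOT EXISTS (SELECT 1
                         FROM target t
                        WHERE t.id = s.id);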

My idea is to let this INSERT statement insert as many of its rows as 
possible, perhaps returning NOTICEs or ignoring the errors (with an IGNORE 
ERRORS syntax for example :)
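
Purely as an illustration - such a clause does not exist, this is only what it
might look like:

    INSERT INTO target (id, val)
    SELECT s.id, s.val FROM source s
    IGNORE ERRORS;  -- hypothetical clause: skip rows that violate constraints,
                    -- reporting them as NOTICEs instead of aborting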

I believe all this functionality will have to consider the syntax first.

Daniel


