Re: Bulkloading using COPY - ignore duplicates? - Mailing list pgsql-hackers

From Lee Kindness
Subject Re: Bulkloading using COPY - ignore duplicates?
Date
Msg-id 15384.52447.737930.610518@elsick.csl.co.uk
In response to Re: Bulkloading using COPY - ignore duplicates?  (Patrick Welche <prlw1@newn.cam.ac.uk>)
Responses Re: Bulkloading using COPY - ignore duplicates?  ("Ross J. Reedstrom" <reedstrm@rice.edu>)
List pgsql-hackers
Patrick Welche writes:
> On Thu, Dec 13, 2001 at 01:25:11PM +0000, Lee Kindness wrote:
> > That's what I'm currently doing as a workaround - a SELECT DISTINCT
> > from a temporary table into the real table with the unique index on
> > it. However this takes absolute ages - say 5 seconds for the copy
> > (which is the ballpark figure I'm aiming toward and can achieve with
> > Ingres) plus another 30ish seconds for the SELECT DISTINCT.
> Then your column really isn't unique,
 

That's another discussion entirely ;) - it's spat out by a real-time
system which doesn't have the time or resources to check this. Further
precision loss later in the data's life adds more duplicates...
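
For reference, the temp-table route I'm currently using looks roughly
like this (the table, column and file names are only illustrative,
matching the example further down):

CREATE TEMP TABLE tab_load (p1 INT, p2 INT, other1 INT, other2 INT);
COPY tab_load FROM '/tmp/data.file';
-- the ~30 second step; plain DISTINCT only folds rows that are
-- identical in every column - key-only duplicates would need
-- DISTINCT ON (p1, p2)
INSERT INTO tab SELECT DISTINCT * FROM tab_load;
DROP TABLE tab_load;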
> so how about dropping the unique index, import the data, fix the
> duplicates, recreate the unique index - just as another possible
> work around ;)
 

This is just going to be the same(ish) time, no?
CREATE TABLE tab (p1 INT, p2 INT, other1 INT, other2 INT);
COPY tab FROM 'file';
DELETE FROM tab WHERE (p1, p2) NOT IN (SELECT DISTINCT p1, p2 FROM tab);
CREATE UNIQUE INDEX tab_idx ON tab USING BTREE (p1, p2);

or am I missing something?
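
(As an aside, I realise that DELETE as written removes nothing, since
every (p1, p2) pair is in its own DISTINCT set; an actual "fix the
duplicates" step would be something like the old OID trick - untested
sketch, assuming the table keeps its default OIDs:

-- keep only the first-inserted row for each (p1, p2) pair
DELETE FROM tab
 WHERE oid NOT IN (SELECT min(oid) FROM tab GROUP BY p1, p2);

which I'd expect to be no quicker than the SELECT DISTINCT route.)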

Thanks, Lee.

-- 
Lee Kindness, Senior Software Engineer, Concept Systems Limited.
http://services.csl.co.uk/ http://www.csl.co.uk/ +44 131 557 5595
 

