Re: Skipping duplicate records? - Mailing list pgsql-general

From	Marc SCHAEFER
Subject	Re: Skipping duplicate records?
Date	June 8, 2001 12:30:36
Msg-id	Pine.LNX.3.96.1010607094151.988E-100000@defian.alphanet.ch
In response to	Skipping duplicate records? (Steve Micallef <stevenm@ot.com.au>)
List	pgsql-general

On Thu, 7 Jun 2001, Steve Micallef wrote:

> 'mysqlimport' has the ability to skip duplicate records when doing bulk
> imports from non-binary files. PostgreSQL doesn't seem to have this
> feature, and it causes a problem for me as I import extremely large
> amounts of data into Postgres using 'copy' and it rejects the whole file
> if one record breaches the primary key.

As a quick comment, I personally find the above to be a *feature*. If
something goes wrong during the COPY, I really want this to be handled in
a transactional manner, so that nothing is inserted at all. Otherwise it's
a pain to find out WHAT was really inserted, etc.

Your problem is really that your input data is incorrect: it doesn't
respect the constraints you want on the data.

You could:

   - import the data into a non-constrained table (no UNIQUE nor
     PRIMARY KEY); when the import is complete, remove the duplicates

     assuming id is your to-be-primary-key:

        -- list every id that appears in more than one row
        SELECT t1.id
        FROM temp_table t1, temp_table t2
        WHERE (t1.id = t2.id) AND (t1.oid != t2.oid);

     And now it's up to you to think and decide which of the records
     sharing a duplicate ID is the one to keep.
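
To make the staging approach above concrete, here is a rough sketch of the
whole sequence. It is only an illustration under assumptions: the column
"payload", the target table "final_table" and the file path are made-up
names, and DISTINCT ON keeps an arbitrary row per id, which may not be the
row you actually want.

```sql
-- Staging table with no UNIQUE or PRIMARY KEY constraint
-- (column names here are hypothetical).
CREATE TABLE temp_table (
    id      integer,
    payload text
);

-- Bulk load; with no constraints, duplicate ids cannot abort the COPY.
-- The path is a placeholder for your own data file.
COPY temp_table FROM '/path/to/data.txt';

-- Inspect which ids are duplicated, and how often.
SELECT id, count(*) AS copies
FROM temp_table
GROUP BY id
HAVING count(*) > 1;

-- Copy one row per id into the constrained table (assumed to exist
-- already with a PRIMARY KEY on id). DISTINCT ON is a PostgreSQL
-- extension; without a further ORDER BY term the surviving row
-- for each id is arbitrary.
INSERT INTO final_table (id, payload)
SELECT DISTINCT ON (id) id, payload
FROM temp_table
ORDER BY id;
```

If a particular duplicate should win, add a second ORDER BY term after id so
that the preferred row sorts first within each id.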
