
Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY" - Mailing list pgsql-general

From	Oliver Elphick
Subject	Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"
Date	May 24, 2015 13:16:23
Msg-id	1432473376.21861.80.camel@linda
In response to	Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY" (Arup Rakshit <aruprakshit@rocketmail.com>)
List	pgsql-general


On Sun, 2015-05-24 at 18:25 +0630, Arup Rakshit wrote:
> >
> > Assuming you are using Unix, or can install Unix tools, run the input
> > files through
> >
> >   sort -u
> >
> > before passing them to COPY.
> >
> > Oliver Elphick
> >
>
> I think I need to ask in a more specific way. I have a table, say
> `table1`, into which I load data from different CSV files. Suppose I
> have inserted N records into `table1` from CSV file `c1`. That is
> fine, but the next time, when I import from a different CSV file, say
> `c2`, into `table1`, I don't want to reinsert any record from the new
> CSV file that the table already has.
>
> How to do this?

Unix tools are still the easiest way to deal with it, I think.

Make the whole of the new input unique with sort -u, as above, and
store the result in file1.

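Use COPY to output the existing table to another text file (file2) in
exactly the same format as file1, since the comparison is line by line.
If the table may contain duplicate rows, run file2 through sort -u as
well.  Then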

  cat file1 file2 | sort | uniq -d >file3

This outputs only the lines that exist in both file1 and file2.  Then

  cat file1 file3 | sort | uniq -u >newinputfile

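Every line of file3 also occurs in file1, so it appears twice in the
combined stream and uniq -u drops it.  This removes from file1 the lines
that are already in file2, leaving in newinputfile only the rows to
pass to COPY.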
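
The steps above can be tried end to end on a small sample.  This is
only a sketch: the file names c2.csv and existing.csv and the sample
rows are made up for illustration, and existing.csv stands in for
whatever COPY would write out from the table.

```shell
#!/bin/sh
# Sketch only: file names and sample rows are hypothetical.
set -eu
export LC_ALL=C            # use one fixed collation for every sort
workdir=$(mktemp -d)
cd "$workdir"

# New CSV input, with one line repeated inside the file itself.
printf '%s\n' '1,alice' '2,bob' '2,bob' '3,carol' > c2.csv
# Stand-in for the rows COPY would export from the existing table.
printf '%s\n' '2,bob' '4,dave' > existing.csv

# Make each file unique on its own, so that any line seen twice
# later on must have come from two different files.
sort -u c2.csv > file1
sort -u existing.csv > file2

# Lines present in both file1 and file2.
cat file1 file2 | sort | uniq -d > file3

# Lines of file1 that are not in file2.
cat file1 file3 | sort | uniq -u > newinputfile

cat newinputfile
```

With this sample, newinputfile ends up holding 1,alice and 3,carol:
the row 2,bob is dropped because the table already has it, and 4,dave
never appears because it was not in the new input.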

This method only eliminates lines that are identical in their
entirety; it won't catch rows that share a primary key but differ in
some other column.
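
A small sample shows the limitation.  This is a sketch under stated
assumptions: the rows are made up, the first comma-separated field is
assumed to be the primary key, and cut is used as a naive field
splitter that would not cope with quoted commas in real CSV.  The key
check at the end is an extra illustration, not part of the steps above.

```shell
#!/bin/sh
# Sketch only: sample rows are hypothetical; field 1 is assumed to be the key.
set -eu
export LC_ALL=C
workdir=$(mktemp -d)
cd "$workdir"

printf '%s\n' '1,alice' '2,bob' > file1      # new input, already unique
printf '%s\n' '1,alicia' '2,bob' > file2     # rows exported from the table

cat file1 file2 | sort | uniq -d > file3
cat file1 file3 | sort | uniq -u > newinputfile

# 1,alice survives: the whole line differs from 1,alicia, although the
# key 1 is already in the table.
cat newinputfile

# List keys that occur both in the new input and in the table.
# comm needs sorted input, hence the sort -u on each key list.
cut -d, -f1 newinputfile | sort -u > newkeys
cut -d, -f1 file2 | sort -u > oldkeys
comm -12 newkeys oldkeys > clashing_keys
cat clashing_keys
```

Here newinputfile contains 1,alice and clashing_keys contains 1, so
loading newinputfile with COPY would still hit the existing key.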

Oliver Elphick
