Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY" - Mailing list pgsql-general

From Arup Rakshit
Subject Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"
Date
Msg-id 1720349.nDRETasvJp@linux-wzza.aruprakshit
In response to Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"  (Adrian Klaver <adrian.klaver@aklaver.com>)
Responses Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"
List pgsql-general
On Sunday, May 24, 2015 07:24:41 AM you wrote:
> On 05/24/2015 04:55 AM, Arup Rakshit wrote:
> > On Sunday, May 24, 2015 02:52:47 PM you wrote:
> >> On Sun, 2015-05-24 at 16:56 +0630, Arup Rakshit wrote:
> >>> Hi,
> >>>
> >>> I am copying data from a CSV file into a table using the "COPY" command.
> >>> The one thing I am stuck on is how to skip duplicate records while
> >>> copying from CSV to tables. From the documentation, it seems that
> >>> PostgreSQL doesn't have any built-in tool to handle this with the "COPY"
> >>> command. Searching Google, I found the idea below of using a temp table.
> >>>
> >>> http://stackoverflow.com/questions/13947327/to-ignore-duplicate-keys-during-copy-from-in-postgresql
> >>>
> >>> I am also considering letting the records get inserted and then
> >>> deleting the duplicate records from the table, as this post suggests:
> >>> http://www.postgresql.org/message-id/37013500.DFF0A64A@manhattanproject.com.
> >>>
> >>> Both solutions look like doing double work, and I am not sure
> >>> which is the best one here. Can anybody suggest which approach
> >>> I should adopt? Or if you have any better ideas for this task,
> >>> please share.
> >>
> >> Assuming you are using Unix, or can install Unix tools, run the input
> >> files through
> >>
> >>    sort -u
> >>
> >> before passing them to COPY.
> >>
> >> Oliver Elphick
> >>
> >
> > I think I need to ask in a more specific way. I have a table, say `table1`, into which I feed data from different
> > CSV files. Now suppose I have inserted N records into `table1` from CSV file `c1`. That is fine. The next time, when I
> > import from a different CSV file, say `c2`, into `table1`, I don't want to reinsert any record from this new CSV
> > file if the table already has it.
> >
> > How to do this?
>
> As others have pointed out this depends on what you are considering a
> duplicate.
>
> Is it if the entire row is duplicated?

It is the entire row.
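
For the whole-row case, the `sort -u` idea quoted above could be extended to compare the new file against the one already loaded, so that only unseen rows reach COPY. What follows is only a minimal sketch under several assumptions: the function name `new_rows` is hypothetical, the CSV files have no header line, every record sits on exactly one line (no embedded newlines in quoted fields), and a duplicate means a byte-for-byte identical line.

```sh
#!/bin/sh
# Sketch: print the rows of a new CSV that are not already present in a
# previously loaded CSV. A duplicate is an identical whole line.
# Assumptions: no header line, one record per line.
# new_rows is a hypothetical helper name, not part of any tool.
new_rows() {
    old_csv=$1
    new_csv=$2
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || { rm -f "$old_sorted"; return 1; }

    # comm needs both inputs sorted in the same collation, hence LC_ALL=C.
    # sort -u also drops duplicates inside each file.
    LC_ALL=C sort -u "$old_csv" > "$old_sorted"
    LC_ALL=C sort -u "$new_csv" > "$new_sorted"

    # -1 drops lines found only in the old file, -3 drops lines found in
    # both, so only lines unique to the new file are printed.
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted"

    rm -f "$old_sorted" "$new_sorted"
}
```

It might be used as `new_rows c1.csv c2.csv > c2_new.csv`, with `c2_new.csv` then passed to COPY in place of `c2`. This only works while the earlier CSV files are still around; if they are not, the comparison would have to happen inside the database instead, for example through the temp table idea from the SO link.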

> Or if some portion of the row(a 'primary key') is duplicated?
>
> >
> >   I see now that my SO link is not a solution to my problem.
> >
>
>
>

--
================
Regards,
Arup Rakshit
================
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as
possible, you are, by definition, not smart enough to debug it.

--Brian Kernighan

