Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY" - Mailing list pgsql-general

From Adrian Klaver
Subject Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"
Msg-id 5561DF29.70108@aklaver.com
In response to Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"  (Arup Rakshit <aruprakshit@rocketmail.com>)
Responses Re: How to skip duplicate records while copying from CSV to table in Postgresql using "COPY"
List pgsql-general
On 05/24/2015 04:55 AM, Arup Rakshit wrote:
> On Sunday, May 24, 2015 02:52:47 PM you wrote:
>> On Sun, 2015-05-24 at 16:56 +0630, Arup Rakshit wrote:
>>> Hi,
>>>
>>> I am copying data from a CSV file into a table using the "COPY" command.
>>> One thing I got stuck on is how to skip duplicate records while
>>> copying from CSV to tables. From the documentation, it seems
>>> PostgreSQL doesn't have any built-in tool to handle this with the "COPY"
>>> command. Searching Google, I found the idea below of using a temp table.
>>>
>>> http://stackoverflow.com/questions/13947327/to-ignore-duplicate-keys-during-copy-from-in-postgresql
>>>
>>> I am also considering letting the records get inserted and then
>>> deleting the duplicate records from the table, as this post suggests:
>>> http://www.postgresql.org/message-id/37013500.DFF0A64A@manhattanproject.com.
>>>
>>> Both solutions look like doing double work, and I am not sure
>>> which is best here. Can anybody suggest which approach
>>> I should adopt? Or if you have any better ideas for this task,
>>> please share.
>>
>> Assuming you are using Unix, or can install Unix tools, run the input
>> files through
>>
>>    sort -u
>>
>> before passing them to COPY.
>>
>> Oliver Elphick
>>
>
> I think I need to ask in a more specific way. I have a table, say
> `table1`, that I feed with data from different CSV files. Suppose I
> have inserted N records into `table1` from CSV file `c1`. That is fine.
> The next time, when I import from a different CSV file, say `c2`, into
> `table1`, I don't want to reinsert any record from the new CSV file
> that the table already has.
>
> How to do this?

As others have pointed out, this depends on what you consider a
duplicate.

Is it if the entire row is duplicated?

Or if some portion of the row (a 'primary key') is duplicated?
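
To make the two cases concrete, here is a minimal sketch of the
staging-table approach. It is only an illustration under assumptions:
`table1` is assumed to have an `id` column as its primary key, and the
staging table name and the file path are hypothetical. It relies only on
statements available in current releases (COPY, EXCEPT, NOT EXISTS).

```sql
-- Assumed for illustration: table1 has a primary key column named id.
-- The staging table name and the file path are hypothetical.
BEGIN;

-- Same columns as table1, but no primary key or unique constraints,
-- so COPY cannot fail on a duplicate.
CREATE TEMP TABLE staging (LIKE table1 INCLUDING DEFAULTS) ON COMMIT DROP;

-- Server-side load; from psql, \copy does the same with a client-side file.
COPY staging FROM '/path/to/c2.csv' WITH (FORMAT csv, HEADER true);

-- Choose ONE of the two INSERT statements below.

-- Case 1: a duplicate means the entire row already exists.
-- EXCEPT drops rows present in table1 and collapses repeats within the file.
INSERT INTO table1
SELECT * FROM staging
EXCEPT
SELECT * FROM table1;

-- Case 2: a duplicate means the key (id) already exists.
-- DISTINCT ON keeps one arbitrary row per id if the file repeats a key.
INSERT INTO table1
SELECT DISTINCT ON (s.id) s.*
FROM staging AS s
WHERE NOT EXISTS (SELECT 1 FROM table1 AS t WHERE t.id = s.id)
ORDER BY s.id;

COMMIT;
```

Note that neither INSERT protects against another session inserting the
same rows concurrently; if that can happen, the load would need to be
serialized in some way, for example by locking `table1` first.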

>
>   My SO link is not a solution to my problem I see now.
>


--
Adrian Klaver
adrian.klaver@aklaver.com

