Re: how to make duplicate finding query faster? - Mailing list pgsql-admin

From Sachin Kumar
Subject Re: how to make duplicate finding query faster?
Date
Msg-id CALg-PKAwst1C2fF9ZBY+Dog0D2CZ7P-xppG41b3DnV62R=SxwQ@mail.gmail.com
In response to Re: how to make duplicate finding query faster?  (Holger Jakobs <holger@jakobs.com>)
Responses Re: how to make duplicate finding query faster?  (Holger Jakobs <holger@jakobs.com>)
Re: how to make duplicate finding query faster?  ("Gavan Schneider" <list.pg.gavan@pendari.org>)
List pgsql-admin
Hi Mr. Holger,

This will not meet our requirement. We have to validate that there are no duplicate values in the DB. We did that earlier by leaving it to the DB to check for duplicates, and we found duplicate values in the DB.

We have a table "card" with a single column "number", which we update from a CSV file containing 600k numbers, and we require that no duplicate number exists in the table.

If there is a faster query, please share it; it would help us meet this requirement.
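One way to avoid a query per CSV row is to fetch the existing numbers once and compare in memory. The following is only a minimal sketch under assumptions: the CSV has one column and no header, the function name `split_duplicates` is hypothetical, and `existing_numbers` is assumed to have been loaded beforehand in a single query (in Django, for example, with `Card_Bank.objects.values_list("ACCOUNT_NUMBER", flat=True)`).

```python
import csv
import io
from collections import Counter


def split_duplicates(csv_text, existing_numbers):
    """Return (new_numbers, duplicates) for a one-column CSV of card numbers.

    A number counts as a duplicate if it appears more than once in the CSV
    or is already present in existing_numbers.
    """
    rows = csv.reader(io.StringIO(csv_text))
    # Skip empty rows and strip surrounding whitespace.
    numbers = [row[0].strip() for row in rows if row and row[0].strip()]

    counts = Counter(numbers)          # occurrences of each number in the CSV
    existing = set(existing_numbers)   # set membership tests are O(1) on average

    duplicates = sorted(n for n, c in counts.items() if c > 1 or n in existing)
    new_numbers = [n for n, c in counts.items() if c == 1 and n not in existing]
    return new_numbers, duplicates


# Hypothetical sample data: "111" repeats in the CSV, "333" is already stored.
new_numbers, duplicates = split_duplicates("111\n222\n111\n333\n", {"333"})
```

With 600k numbers this keeps everything in memory, which is an assumption about available RAM; the check in the DB itself is still needed if other writers can insert at the same time.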



On Wed, Dec 30, 2020 at 1:54 PM Holger Jakobs <holger@jakobs.com> wrote:
On 30.12.20 at 08:36, Sachin Kumar wrote:
Hi All,

I am uploading data into PostgreSQL from a CSV file and checking whether each value already exists in the DB; if it does, a duplicate error should be returned. I am using the query below.

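from django.db.models import Q

# Runs one query per CSV row: flag=2 if the number already exists, else flag=1.
if Card_Bank.objects.filter(Q(ACCOUNT_NUMBER=card_number)).exists():
    flag = 2
else:
    flag = 1
It is taking too much time; the CSV contains 600k cards.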

Kindly help me make the query faster.

I am using Python, Django & PostgreSQL.
--

Best Regards,
Sachin Kumar

I think it would be easier not to check for duplicates beforehand, but to let the DB complain about duplicates.

That would cut the round trips to the DB roughly in half. Instead of a check plus an insert, there would be only an insert, which might fail every now and then.
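A rough sketch of the insert-only approach, under stated assumptions: it uses Python's built-in `sqlite3` as a stand-in so it runs without a PostgreSQL server, it assumes the column carries a UNIQUE constraint (without one the DB has nothing to complain about), and the helper name `insert_cards` is hypothetical. With Django and PostgreSQL the exception to catch would be `django.db.IntegrityError` instead.

```python
import sqlite3


def insert_cards(conn, numbers):
    """Insert each number; return the ones the database rejected as duplicates."""
    rejected = []
    for number in numbers:
        try:
            # "with conn" commits on success and rolls back on an exception.
            with conn:
                conn.execute("INSERT INTO card (number) VALUES (?)", (number,))
        except sqlite3.IntegrityError:
            # The UNIQUE constraint fired: this number is already in the table.
            rejected.append(number)
    return rejected


conn = sqlite3.connect(":memory:")
# The UNIQUE constraint is what makes the database do the duplicate check.
conn.execute("CREATE TABLE card (number TEXT NOT NULL UNIQUE)")

# Hypothetical sample data: the second "111" should be rejected.
rejected = insert_cards(conn, ["111", "222", "111"])
```

Each row still costs one statement here; the saving is only the separate existence check that no longer runs before every insert.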

Regards,

Holger

-- 
Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012


--

Best Regards,
Sachin Kumar
