Re: Insert performance (OT?) - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Insert performance (OT?)
Date
Msg-id 42DCDB47.3060009@archonet.com
Whole thread Raw
In response to Insert performance (OT?)  (Yves Vindevogel <yves.vindevogel@implements.be>)
Responses Re: Insert performance (OT?)  (Yves Vindevogel <yves.vindevogel@implements.be>)
Re: Insert performance (OT?)  (Manfred Koizar <mkoi-pg@aon.at>)
List pgsql-performance
Yves Vindevogel wrote:
  >>> So, I must use a function that will check against u1 and u2, and then
>>> insert if it is ok.
>>> I know that such a function is way slower that my insert query.
>>
>> So - you have a table, called something like "upload" with 20,000 rows
>> and you'd like to know whether it is safe to insert them. Well, it's
>> easy enough to identify which ones are duplicates.
>>
>> SELECT * FROM upload JOIN main_table ON u1=f1 AND u2=f2 AND u3=f3;
>> SELECT * FROM upload JOIN main_table ON u1=f1 AND u2=f2 AND u3=f4;
>>
> That is a good idea.  I can delete the ones that would fail my first
> unique index this way, and then delete the ones that would fail my
> second unique index and then upload them.
> Hmm, why did I not think of that myself.

I've spent a lot of time moving data from one system to another, usually
having to clean it in the process. At 9pm on a Friday, you decide that
on the next job you'll find an efficient way to do it :-)

>> Are you saying that deleting these rows and then inserting takes too
>> long?
>>
> This goes very fast, but not with a function that checks each record one
> by one.

You could get away with one query if you converted them to left-joins:
INSERT INTO ...
SELECT * FROM upload LEFT JOIN ... WHERE f3 IS NULL
UNION
SELECT * FROM upload LEFT JOIN ... WHERE f4 IS NULL

The UNION will remove duplicates for you, but this might turn out to be
slower than two separate queries.

--
   Richard Huxton
   Archonet Ltd

pgsql-performance by date:

Previous
From: Yves Vindevogel
Date:
Subject: Fwd: Insert performance (OT?)
Next
From: Yves Vindevogel
Date:
Subject: Re: Insert performance (OT?)