Re: Most efficient way to insert without duplicates - Mailing list pgsql-general

From Moshe Jacobson
Subject Re: Most efficient way to insert without duplicates
Date
Msg-id CAJ4CxLmFHQmpavxj+LwpodeL4tAMwT=D6DG5yA=nxW0wx7y6pA@mail.gmail.com
Whole thread Raw
In response to Most efficient way to insert without duplicates  (François Beausoleil <francois@teksol.info>)
List pgsql-general
On Tue, Apr 16, 2013 at 10:51 PM, François Beausoleil <francois@teksol.info> wrote:
INSERT INTO persona_followers(service_id, follower_id, valid_at)
  SELECT service_id, follower_id, NOW()
  FROM (SELECT DISTINCT service_id, follower_id FROM import) AS import
  WHERE NOT EXISTS(SELECT * FROM persona_followers WHERE import.service_id = persona_followers.service_id AND import.follower_id = persona_followers.follower_id);

Try this for your insert query instead:

insert into persona_followers( service_id, follower_id, valid_at )
select i.service_id, i.follower_id, now()
from import i
left join persona_followers pf on i.service_id = pf.service_id and i.follower_id = pf.follower_id
where pf.service_id is null
order by 1,2;

This will insert only those rows that are not already present, and involves no subqueries and only one join.

-- 
Moshe Jacobson
Nead Werx, Inc. | Manager of Systems Engineering
2323 Cumberland Parkway, Suite 201 | Atlanta, GA 30339
moshe@neadwerx.com | 
www.neadwerx.com

"Quality is not an act, it is a habit." -- Aristotle

pgsql-general by date:

Previous
From: Yuriy Rusinov
Date:
Subject: Re: GSL onto postgresql server 9.2
Next
From: Alfonso Afonso
Date:
Subject: Re: SQLNestedException: Cannot get a connection, pool error Timeout waiting for idle object