Re: COPY from .csv File and Remove Duplicates - Mailing list pgsql-general

From Craig Ringer
Subject Re: COPY from .csv File and Remove Duplicates
Date
Msg-id 4E449834.1020405@ringerc.id.au
Whole thread Raw
In response to Re: COPY from .csv File and Remove Duplicates  ("David Johnston" <polobo@yahoo.com>)
List pgsql-general
On 12/08/2011 10:32 AM, David Johnston wrote:

> The general structure for the insert would be:
>
> INSERT INTO maintable (cols)
> SELECT cols FROM staging WHERE staging.idcols NOT IN (SELECT
> maintable.idcols FROM maintable);
>
> There may be more efficient ways to write the query but the idea is the
> same.

Yeah... I'd favour an EXISTS test or a join.

INSERT INTO maintable (cols)
SELECT cols FROM staging WHERE NOT EXISTS (SELECT
1 FROM maintable WHERE maintable.idcol = staging.idcol);

... as the NOT IN(...) test can have less than lovely behavior for large
key sets.

--
Craig Ringer

pgsql-general by date:

Previous
From: "David Johnston"
Date:
Subject: Re: COPY from .csv File and Remove Duplicates
Next
From: Greg Smith
Date:
Subject: Re: Postgres on SSD