Re: COPY from .csv File and Remove Duplicates - Mailing list pgsql-general

From	Craig Ringer
Subject	Re: COPY from .csv File and Remove Duplicates
Date	August 12, 2011 00:04:36
Msg-id	4E449834.1020405@ringerc.id.au
In response to	Re: COPY from .csv File and Remove Duplicates ("David Johnston" <polobo@yahoo.com>)
List	pgsql-general

On 12/08/2011 10:32 AM, David Johnston wrote:

> The general structure for the insert would be:
>
> INSERT INTO maintable (cols)
> SELECT cols FROM staging WHERE staging.idcols NOT IN (SELECT
> maintable.idcols FROM maintable);
>
> There may be more efficient ways to write the query but the idea is the
> same.

Yeah... I'd favour an EXISTS test or a join.

INSERT INTO maintable (cols)
SELECT cols FROM staging
WHERE NOT EXISTS (
    SELECT 1 FROM maintable WHERE maintable.idcol = staging.idcol
);

... as the NOT IN (...) test can perform poorly for large key sets.
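
For anyone who wants to try the two alternatives side by side, here is a minimal sketch of both the NOT EXISTS form and the join form (written as a LEFT JOIN that keeps only unmatched rows). It is an illustration under assumptions, not something taken from the thread: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so it runs without a server, and the `payload` column, the sample rows, and the `load_new_rows` helper are all hypothetical. The two INSERT statements themselves are standard SQL and should carry over to PostgreSQL, but performance characteristics will differ.

```python
import sqlite3

# Hypothetical schema: "idcol" is the key used to detect duplicates and
# "payload" stands in for the "cols" placeholder used in the thread.
SCHEMA = """
CREATE TABLE maintable (idcol INTEGER PRIMARY KEY, payload TEXT);
CREATE TABLE staging   (idcol INTEGER, payload TEXT);
"""

# Variant 1: correlated NOT EXISTS test.
INSERT_NOT_EXISTS = """
INSERT INTO maintable (idcol, payload)
SELECT s.idcol, s.payload
FROM staging AS s
WHERE NOT EXISTS (SELECT 1 FROM maintable AS m WHERE m.idcol = s.idcol);
"""

# Variant 2: anti-join. Staging rows with no match in maintable come back
# from the LEFT JOIN with m.idcol set to NULL, and only those are kept.
INSERT_ANTI_JOIN = """
INSERT INTO maintable (idcol, payload)
SELECT s.idcol, s.payload
FROM staging AS s
LEFT JOIN maintable AS m ON m.idcol = s.idcol
WHERE m.idcol IS NULL;
"""


def load_new_rows(insert_sql):
    """Build a throwaway in-memory database, run one INSERT variant,
    and return the resulting contents of maintable ordered by key."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Sample data (made up): key 2 already exists in maintable.
    conn.executemany("INSERT INTO maintable VALUES (?, ?)",
                     [(1, "old-1"), (2, "old-2")])
    conn.executemany("INSERT INTO staging VALUES (?, ?)",
                     [(2, "dup-2"), (3, "new-3"), (4, "new-4")])
    conn.execute(insert_sql)
    conn.commit()
    rows = conn.execute(
        "SELECT idcol, payload FROM maintable ORDER BY idcol").fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(load_new_rows(INSERT_NOT_EXISTS))
    print(load_new_rows(INSERT_ANTI_JOIN))
```

With this sample data both variants should leave the existing row for key 2 untouched and add only keys 3 and 4. Which form plans better on a real table is something to check with EXPLAIN on your own data.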

--
Craig Ringer
