Thread: Inserting many rows using "with"

Inserting many rows using "with"

From
Harmen
Date:
Hi all,

I got a table with subscriptions to some kind of campaigns, and every contact can
have max 1 running subscription per campaign:

    CREATE TABLE subscriptions
    ( id SERIAL NOT NULL PRIMARY KEY
    , campaign_id INT NOT NULL
    , contact_id INT NOT NULL
    , done BOOL NOT NULL
    );
    CREATE UNIQUE INDEX subscriptions_running ON subscriptions
    ( campaign_id
    , contact_id
    ) WHERE NOT done;

Now I want to add, say, 100_000 contacts to a subscription. The contacts may or
may not already have a subscription, I want to simply ignore the ones which
already have a not-done subscription.

I begin a transaction, and loop over the contact IDs with:

    INSERT INTO subscriptions
        (campaign_id, contact_id, done)
    VALUES ($1, $2, FALSE)
    ON CONFLICT (campaign_id, contact_id) WHERE NOT done DO NOTHING
    RETURNING id;

This does what it should do, so in that sense it's fine :)


But it's still a network roundtrip per statement, and it takes a while to run.
How about doing it in a single query, sending all contacts_ids at once:

    WITH ids AS (SELECT unnest($2::int[]) AS contact_id)
    INSERT INTO subscriptions
        (campaign_id, contact_id, done)
        SELECT $1, ids.contact_id, FALSE
        FROM ids
    ON CONFLICT (campaign_id, contact_id) WHERE NOT done DO NOTHING
    RETURNING id;

Where $2 is an array of integers. This is a single query, and it behaves the
same as the loop above. And it's indeed faster.


Are there any known problems with this strategy? Are they any other methods of
inserting lots of records in a nicer way?

Alternatives I know of, and the only options I could find documented:
- create a huge custom insert statement
- use COPY, but that doesn't work with 'ON CONFLICT' as far as I can see

Any thoughts?
Thanks!
Harmen



Re: Inserting many rows using "with"

From
Brian Dunavant
Date:


On Fri, Sep 11, 2020 at 1:31 PM Harmen <harmen@lijzij.de> wrote:
Hi all,

Are there any known problems with this strategy? Are they any other methods of
inserting lots of records in a nicer way?

I do this all the time with insert and it's wonderful.  It can get tricky if you need to do UPDATEs.   

You can avoid the unnest and just use value rows directly if you want (which can be easier to read)l

 with ids(id) as 
( values (1), 
              (2),
              (3) 
) select id from ids;

Further, you can use this technique with multiple CTEs to insert into completely different tables all in the same query saving lots of round trips.


Alternatives I know of, and the only options I could find documented:
- create a huge custom insert statement
- use COPY, but that doesn't work with 'ON CONFLICT' as far as I can see

COPY is great for large amounts of data.  If it's a huge amount of data, you can load it into a temp table with COPY, and then use your ON CONFLICT insert to implement your logic.