Thread: Inserting many rows using "with"
Hi all, I got a table with subscriptions to some kind of campaigns, and every contact can have max 1 running subscription per campaign: CREATE TABLE subscriptions ( id SERIAL NOT NULL PRIMARY KEY , campaign_id INT NOT NULL , contact_id INT NOT NULL , done BOOL NOT NULL ); CREATE UNIQUE INDEX subscriptions_running ON subscriptions ( campaign_id , contact_id ) WHERE NOT done; Now I want to add, say, 100_000 contacts to a subscription. The contacts may or may not already have a subscription, I want to simply ignore the ones which already have a not-done subscription. I begin a transaction, and loop over the contact IDs with: INSERT INTO subscriptions (campaign_id, contact_id, done) VALUES ($1, $2, FALSE) ON CONFLICT (campaign_id, contact_id) WHERE NOT done DO NOTHING RETURNING id; This does what it should do, so in that sense it's fine :) But it's still a network roundtrip per statement, and it takes a while to run. How about doing it in a single query, sending all contacts_ids at once: WITH ids AS (SELECT unnest($2::int[]) AS contact_id) INSERT INTO subscriptions (campaign_id, contact_id, done) SELECT $1, ids.contact_id, FALSE FROM ids ON CONFLICT (campaign_id, contact_id) WHERE NOT done DO NOTHING RETURNING id; Where $2 is an array of integers. This is a single query, and it behaves the same as the loop above. And it's indeed faster. Are there any known problems with this strategy? Are they any other methods of inserting lots of records in a nicer way? Alternatives I know of, and the only options I could find documented: - create a huge custom insert statement - use COPY, but that doesn't work with 'ON CONFLICT' as far as I can see Any thoughts? Thanks! Harmen
On Fri, Sep 11, 2020 at 1:31 PM Harmen <harmen@lijzij.de> wrote:
Hi all,
Are there any known problems with this strategy? Are they any other methods of
inserting lots of records in a nicer way?
I do this all the time with insert and it's wonderful. It can get tricky if you need to do UPDATEs.
You can avoid the unnest and just use value rows directly if you want (which can be easier to read)l
with ids(id) as
( values (1),
(2),
(3)
) select id from ids;
Further, you can use this technique with multiple CTEs to insert into completely different tables all in the same query saving lots of round trips.
Alternatives I know of, and the only options I could find documented:
- create a huge custom insert statement
- use COPY, but that doesn't work with 'ON CONFLICT' as far as I can see
COPY is great for large amounts of data. If it's a huge amount of data, you can load it into a temp table with COPY, and then use your ON CONFLICT insert to implement your logic.