Re: Getting sequence-generated IDs from multiple row insert - Mailing list pgsql-general

From Andrew Sullivan
Subject Re: Getting sequence-generated IDs from multiple row insert
Date
Msg-id 20140331194838.GE54796@crankycanuck.ca
Whole thread Raw
In response to Getting sequence-generated IDs from multiple row insert  (Ben Hoyt <benhoyt@gmail.com>)
Responses Re: Getting sequence-generated IDs from multiple row insert  (David Johnston <polobo@yahoo.com>)
List pgsql-general
On Mon, Mar 31, 2014 at 03:28:14PM -0400, Ben Hoyt wrote:

> , but I've just hit a case where two sessions each doing a
> multiple insert don't use sequential IDs. For example, the range code above
> for the first insert gave 2117552...2117829. And the second insert gave
> 2117625...2117818. Which are obviously overlapping and is a nasty bug
> waiting to happen. Thankfully it caused an IntegrityError further down in
> my code so I didn't screw things up.

Good thing you caught it.  But yes, just from the description it
looked like an obvious race to me.  Concurrency is hard.

> Second, what's the right thing to do here? The first thing I found was
> PostgreSQL's "RETURNING" clause, but somewhat frustratingly for this use
> case, even that's not guaranteed to return the results in the order you
> specified.

In SQL, _nothing_ is guaranteed to return in the order you specified.
This isn't really a Postgres thing; unless you use ORDER BY, SQL's
sets are not ordered.

> I need the IDs in insertion order so I can do further processing.

This sets off alarm bells for me.  What further processing are you
doing?  Is it possible that you could move that into a single step in
the database (maybe with a function or even a trigger) so that the
result of your RETURNING really would provide you with what you need?

> So currently I've changed my code to use RETURNING and then I'm ordering
> the results based on a secondary column that I know the order of. This
> works, but seems clunky, so I'm wondering if there's a nicer way.

This is probably what I'd do, assuming that "further processing" isn't
more data transformation.  If it _is_, then I'd do the whole thing in
a single step (in the database, once I inserted).

A

--
Andrew Sullivan
ajs@crankycanuck.ca


pgsql-general by date:

Previous
From: Edson Richter
Date:
Subject: Re: Why does "checkpointer" is consumig ~1.2Gb of RAM?
Next
From: David Johnston
Date:
Subject: Re: getting the current query from pg_stat_activity