> The chunk to be allocated is not the same size, so to set the increment
> value will not help.
I'm sometimes not that subtle, so I'd just use a BIGINT sequence. Think
about the largest chunk you'll ever get (probably less than 2^30 rows, yes
?), set this sequence increment to this very high value (this will still
let you have 2^32 chunks at least), and make each chunk be
(big_sequence_value + N).
> I'm not sure how the nextval function to handle this internally, if it
> has to read and update the sequence object. Does it use some kind of
> lock ? Otherwise the problem mentioned here should happen to nextval
> function also.
Yes it takes a lock, but the lock is held for a very short time (just
increment the value and release the lock).
In InnoDB, the AUTO_INCREMENT lock is held for the entire duration of the
SQL statement, including any triggers that might fire. This means if you
have ON INSERT triggers which take some time to process, INSERT
concurrency is horrendous. Not so with Postgres sequences.
> The lock command does not work for the sequence, so in my example, I
> have to use a dummy table for lock to work. Another thought is to have
> the nextval function takes an extra argument for the increment value
> (instead of the default increment value).
If you really really want your ids to be contiguous, you can't use the
bigint sequence above. So :
If you do not want to spoil Postgres' good concurrency on inserts, you'll
want to avoid doing I/O while holding a lock. Burning CPU while holding a
lock is OK unless you want to use all your cores, but waiting for IO while
locked is BAD.
So, here's the deal :
- the hard way is to patch nextval() for an extra argument
- the easy way is to write a plpgsql function doing this :
- you have a chunk of N rows to insert.
- get lock
- X := nextval()
- call nextval() (N-1) times in a plpgsql FOR loop
- release lock
- do your INSERTs, using X ... (X+N-1) as ids (generate_series() is your
friend)
Thus you only lock while burning CPU calling nextval(). Not that bad. The
IO-intensive INSERT isn't under the lock so it can be concurrent.