Thread: allocate chunk of sequence
hello,

I try to allocate a chunk of ids from a sequence with the following proc.
However, if I don't use the 'lock lock_table', the proc may not work
correctly when it is run at the same time by different psql sessions. Is
there a better way without using the 'lock lock_table' ?

Thanks,
Gary

create or replace function proc_allocate_seq(int) returns int as $$
declare
    nNumberOfFiles alias for $1;
    aFileId int;
    aNewFileId int;
begin
    lock lock_table;
    aFileId = nextval('aa_seq');
    -- sleep(3);  -- if you have the proc
    aNewFileId = setval('aa_seq', aFileId + nNumberOfFiles - 1);
    return aFileId;
end;
$$ language plpgsql;
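[Editor's note: the proc above refers to a dummy table and a sequence that aren't shown in the post. A minimal sketch of what the setup and a call might look like, assuming the names from Gary's function (the dummy column on lock_table is an assumption):

create sequence aa_seq;
create table lock_table (dummy int);   -- exists only as a target for LOCK

select proc_allocate_seq(5);   -- returns the first id of a 5-id chunk]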
Gary Fu wrote:
> hello,
>
> I try to allocate a chunk of ids from a sequence with the following
> proc. However, if I don't use the 'lock lock_table', the proc may not
> work correctly when it is run at the same time by different psql
> sessions. Is there a better way without using the 'lock lock_table' ?
>
>
> aNewFileId = setval('aa_seq', aFileId + nNumberOfFiles - 1);

This is NOT the best way to use sequences.

Much better would be to use a loop to allocate them one after the other,
and put them into a record type or something.

Do they HAVE to be contiguous?

If they're always the same size, then set the increment value of the
sequence on creation to reflect that.

i.e.:

create sequence abc increment by 20

then just select nextval, and you have that value and the 19 after it (a
block of 20) all to yourself. Lots of ways to handle this, but setval is
generally the worst way to handle anything in a highly parallel
environment.
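[Editor's note: a minimal illustration of the increment-by-20 idea Scott describes; the sequence name abc comes from his example, and the returned values are illustrative:

create sequence abc increment by 20;

select nextval('abc');   -- returns 1: this session owns ids 1..20
-- a concurrent session calling nextval('abc') gets 21 and owns 21..40]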
Scott Marlowe wrote:
> Gary Fu wrote:
>> hello,
>>
>> I try to allocate a chunk of ids from a sequence with the following
>> proc. However, if I don't use the 'lock lock_table', the proc may not
>> work correctly when it is run at the same time by different psql
>> sessions. Is there a better way without using the 'lock lock_table' ?
>>
>>
>> aNewFileId = setval('aa_seq', aFileId + nNumberOfFiles - 1);
>
> This is NOT the best way to use sequences.
>
> Much better would be to use a loop to allocate them one after the
> other, and put them into a record type or something.
>
> Do they HAVE to be contiguous?
>
> If they're always the same size, then set the increment value of the
> sequence on creation to reflect that.
>
> i.e.:
>
> create sequence abc increment by 20
>
> then just select nextval, and you have that value and the 19 after it
> all to yourself. Lots of ways to handle this, but setval is generally
> the worst way to handle anything in a highly parallel env.

Thanks for the reply.

The chunks to be allocated are not all the same size, so setting the
increment value will not help.

I'm not sure how the nextval function handles this internally, since it
has to read and update the sequence object. Does it use some kind of
lock ? Otherwise the problem mentioned here should happen with the
nextval function also.

The lock command does not work on a sequence, so in my example I have to
use a dummy table for the lock to work. Another thought is to have the
nextval function take an extra argument for the increment value (instead
of the default increment value).

Gary
> The chunks to be allocated are not all the same size, so setting the
> increment value will not help.

I'm sometimes not that subtle, so I'd just use a BIGINT sequence. Think
about the largest chunk you'll ever get (probably less than 2^30 rows,
yes ?), set this sequence's increment to this very high value (this will
still let you have 2^32 chunks at least), and make each chunk be
(big_sequence_value + N).

> I'm not sure how the nextval function handles this internally, since it
> has to read and update the sequence object. Does it use some kind of
> lock ? Otherwise the problem mentioned here should happen with the
> nextval function also.

Yes, it takes a lock, but the lock is held for a very short time (just
increment the value and release the lock). In InnoDB, the AUTO_INCREMENT
lock is held for the entire duration of the SQL statement, including any
triggers that might fire. This means if you have ON INSERT triggers which
take some time to process, INSERT concurrency is horrendous. Not so with
Postgres sequences.

> The lock command does not work on a sequence, so in my example I have to
> use a dummy table for the lock to work. Another thought is to have the
> nextval function take an extra argument for the increment value (instead
> of the default increment value).

If you really, really want your ids to be contiguous, you can't use the
bigint sequence above. So:

If you do not want to spoil Postgres' good concurrency on inserts, you'll
want to avoid doing I/O while holding a lock. Burning CPU while holding a
lock is OK unless you want to use all your cores, but waiting for I/O
while locked is BAD.

So, here's the deal :

- the hard way is to patch nextval() to take an extra argument
- the easy way is to write a plpgsql function that does this :
  - you have a chunk of N rows to insert
  - get the lock
  - X := nextval()
  - call nextval() (N-1) more times in a plpgsql FOR loop
  - release the lock
  - do your INSERTs, using X ... (X+N-1) as ids (generate_series() is
    your friend)

Thus you only hold the lock while burning CPU calling nextval(). Not that
bad. The IO-intensive INSERT isn't under the lock, so it can be
concurrent.
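[Editor's note: a minimal plpgsql sketch of the "easy way" PFC outlines. The function name alloc_id_chunk is made up; the dummy table lock_table and sequence aa_seq are borrowed from Gary's original post:

create or replace function alloc_id_chunk(n int) returns bigint as $$
declare
    first_id bigint;
begin
    -- serialize chunk allocation; only cheap nextval() calls happen here
    lock lock_table;
    first_id := nextval('aa_seq');
    for i in 1 .. n - 1 loop
        perform nextval('aa_seq');
    end loop;
    return first_id;   -- caller owns first_id .. first_id + n - 1
end;
$$ language plpgsql;

-- usage sketch (the table name "files" is hypothetical):
-- select alloc_id_chunk(5);                          -- say it returns 101
-- insert into files (id) select generate_series(101, 105);

One caveat: a table lock taken with LOCK in PostgreSQL is held until the end of the transaction, not until the function returns, so the allocation call is best kept in its own short transaction if the subsequent INSERT is slow.]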
PFC wrote:
>> The chunks to be allocated are not all the same size, so setting the
>> increment value will not help.
>
> I'm sometimes not that subtle, so I'd just use a BIGINT sequence.
> Think about the largest chunk you'll ever get (probably less than 2^30
> rows, yes ?), set this sequence's increment to this very high value
> (this will still let you have 2^32 chunks at least), and make each
> chunk be (big_sequence_value + N).

And that's only if they need to be contiguous. If they don't, then just
grab however many you need one after the other.

>> I'm not sure how the nextval function handles this internally, since
>> it has to read and update the sequence object. Does it use some kind
>> of lock ? Otherwise the problem mentioned here should happen with the
>> nextval function also.
>
> Yes, it takes a lock, but the lock is held for a very short time
> (just increment the value and release the lock).

And that lock will be WAY shorter and less bothersome than the lock
you'll need if you do

select setval('seqname', (select max(id) + 1 from sometable), false);

in the middle of your UDF.

> In InnoDB, the AUTO_INCREMENT lock is held for the entire duration
> of the SQL statement, including any triggers that might fire. This
> means if you have ON INSERT triggers which take some time to process,
> INSERT concurrency is horrendous. Not so with Postgres sequences.

Note that that hasn't been true for some time now, according to Heikki
Tuuri (sp?). He told me they changed the way that worked about halfway
through the 5.0 dev cycle so that they use a kind of internal sequence,
much like postgresql's. Except in postgresql the sequence is exposed.

Again, needing contiguous ids is gonna cause a problem. It is going to
serialize all inserts to your database, and slow it down in any kind of
large parallel environment. If you can get away without them, then do so.
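[Editor's note: if the ids don't need to be contiguous, Scott's "grab however many you need one after the other" suggestion needs no explicit lock at all. A minimal sketch, reusing the sequence name aa_seq from the original post:

select array(select nextval('aa_seq') from generate_series(1, 5)) as ids;
-- returns 5 unique ids; under concurrent load they may not be contiguous]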