Thread: Sequence question
Hi, I have a question about sequences. I need a field to have values with no holes in the sequence. However, the values do not need to be in order. My users will draw a number or numbers from the sequence and write to the field. Sometimes, however, these sequence numbers will be discarded (after a transaction is complete), and thus available for use. During the transaction, however, any drawn numbers need to be unavailable. I would like the next user who draws a number to draw the lowest number she can, starting with the holes in the sequence. This continuous sequence is absolutely required by our company, as the fact that the sequence has no holes is used to check for much more serious problems. So my question is: what's the most effective way to get the next available number? My present method is to do a query that finds the first and last number in each of the holes, step through those holes, and then start generating new numbers. Unfortunately, this involves doing a table scan each time - before I generate the number, and does not produce the transaction-safety I want. Does anyone have any better ideas? Places I should look? Thanks, Eric
Far from being a perfect idea but a faster solution than stepping through all holes: 1) Create a second table containing only one field of type of your key. 2) When you delete an entry place the delete key value in your second table 3) If you insert a new entry into your old table and your new table contains a value, take the minimum value in the new table as your new key and delete that entry from the new table. If the new table is empty just use the sequence to get the new key value. Hope that helps David Ecker Eric E wrote: > Hi, > I have a question about sequences. I need a field to have values > with no holes in the sequence. However, the values do not need to be in > order. > > My users will draw a number or numbers from the sequence and write to > the field. Sometimes, however, these sequence numbers will be discarded > (after a transaction is complete), and thus available for use. During > the transaction, however, any drawn numbers need to be unavailable. > I would like the next user who draws a number to draw the lowest number > she can, starting with the holes in the sequence. > > This continuous sequence is absolutely required by our company, as the > fact that the sequence has no holes is used to check for much more > serious problems. > > So my question is: > what's the most effective way to get the next available number? > > My present method is to do a query that finds the first and last number > in each of the holes, step through those holes, and then start > generating new numbers. Unfortunately, this involves doing a table scan > each time - before I generate the number, and does not produce the > transaction-safety I want. > > Does anyone have any better ideas? Places I should look? > > Thanks, > > Eric
On Tue, Oct 19, 2004 at 11:19:05AM -0400, Eric E wrote: > My users will draw a number or numbers from the sequence and write to > the field. Sometimes, however, these sequence numbers will be discarded > (after a transaction is complete), and thus available for use. During > the transaction, however, any drawn numbers need to be unavailable. > I would like the next user who draws a number to draw the lowest number > she can, starting with the holes in the sequence. There are two ways I've seen to do this. One is the low-concurrency way. Another is a sort of clever approach that isn't theoretically perfect, but which provides slightly better concurrency. The low-concurrency approach is pretty much what you'd expect: keep the value in a table which is locked by each transaction which is incrementing it, and complete the incrementing in the transaction scope. That way, if it rolls back, the value hasn't been incremented, and is ready for the next user. The problem, of course, is that this forces every transaction to stand in line. An alternative approach I've heard is to pre-allocate numbers from a sequence into a table: create table seq_allocation ( serialno int8 not null unique, grant_status int constraint status_limiter check (grant_status in (1,2,3)) ); The idea is that a grant_status of 1 means the serial number is unallocated, a grant_status of 2 means it's pending, and 3 means it's granted. When you start, in one transaction you pick the next available serialno with a status of 1. Then you update that row to set it to 2 (make sure you use "where grant_status = 1" to avoid a race condition), and then commit. Now you have your serial number. Use it, and then at the end of your transaction where you are committing, set the grant_status to 3, so you know it's really used. Now, how do you handle the cases where either the transaction fails so you can't set it to 3? Simple: your client captures errors and then sets the value back to 1 later. For client errors, you need yet another process which will go around periodically and check for grant_status = 2, and make sure nobody's actually in the middle of trying to use them. (You could refine the seq_allocation table by storing the pid of the allocating back end. Then your maintenance script could look for such a back end while cleaning up.) The savepoints features of 8.0 will make some of this even easier for you. Note that this second method is not completely bulletproof, but it might be good enough for the cases you want. I have a feeling, however, that you're creating a new problem for yourself by not being able to skip sequence values. My bet is that you actually need to find a better way to solve the "other serious problems" you have rather than banging on sequences to get them to fit your intended use. A -- Andrew Sullivan | ajs@crankycanuck.ca I remember when computers were frustrating because they *did* exactly what you told them to. That actually seems sort of quaint now. --J.D. Baldwin
On Wed, Oct 20, 2004 at 11:57:42AM -0400, Andrew Sullivan wrote: > Now, how do you handle the cases where either the transaction fails > so you can't set it to 3? Simple: your client captures errors and > then sets the value back to 1 later. Has anyone read "the Sagas paper" by Garcia-Molina? They present a way to handle "extended transaction models", trying to cope sort-of automatically with this kind of situations. More generally, AFAIU the idea is to have multi-transaction recoverability and rollback-ability. It seems interesting. http://portal.acm.org/citation.cfm?doid=38713.38742 I have only skimmed through it, but it sounds somewhat interesting. I'd love to know what do people think of this. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados" (Luis Wu, Mundo Anillo)
Hi Andrew, I had basically started working on an idea like the second approach, but had not been able to put the status element so clearly. I really like the statuses of available, pending, and granted. There's one more twist I think I can use to optimize this: once a number is assigned, it cannot be reassigned. So I think I can add: - have the sequence preallocation table hold only numbers with status being available or pending, i.e., delete numbers once they have been allocated. This leaves on two possible statuses: available and pending. - push off getting new numbers into the preallocation table with a full-table search until convenient times I also liked your point about the atomicity of : get number, change status to pending, commit After that one can proceed with writing the number into my data table. My thought was that the you could set the status ussing sessionID. That way a server-side job could look for expired sessions and remark those numbers available. Any thoughts? This is point is definitely important > I have a feeling, > however, that you're creating a new problem for yourself by not being > able to skip sequence values. My bet is that you actually need to > find a better way to solve the "other serious problems" you have > rather than banging on sequences to get them to fit your intended > use. I just haven't really seen anyway around the need to use all of our storage rows that doens't involve a complicated mapping to boxes. Thanks, Eric Andrew Sullivan wrote: > On Tue, Oct 19, 2004 at 11:19:05AM -0400, Eric E wrote: > > >>My users will draw a number or numbers from the sequence and write to >>the field. Sometimes, however, these sequence numbers will be discarded >>(after a transaction is complete), and thus available for use. During >>the transaction, however, any drawn numbers need to be unavailable. >>I would like the next user who draws a number to draw the lowest number >>she can, starting with the holes in the sequence. > > > There are two ways I've seen to do this. One is the low-concurrency > way. Another is a sort of clever approach that isn't theoretically > perfect, but which provides slightly better concurrency. > > The low-concurrency approach is pretty much what you'd expect: keep > the value in a table which is locked by each transaction which is > incrementing it, and complete the incrementing in the transaction > scope. That way, if it rolls back, the value hasn't been > incremented, and is ready for the next user. The problem, of course, > is that this forces every transaction to stand in line. > > An alternative approach I've heard is to pre-allocate numbers from a > sequence into a table: > > create table seq_allocation ( > serialno int8 not null unique, > grant_status int > constraint status_limiter check (grant_status in > (1,2,3)) ); > > The idea is that a grant_status of 1 means the serial number is > unallocated, a grant_status of 2 means it's pending, and 3 means it's > granted. When you start, in one transaction you pick the next > available serialno with a status of 1. Then you update that row to > set it to 2 (make sure you use "where grant_status = 1" to avoid a > race condition), and then commit. Now you have your serial number. > Use it, and then at the end of your transaction where you are > committing, set the grant_status to 3, so you know it's really used. > > Now, how do you handle the cases where either the transaction fails > so you can't set it to 3? Simple: your client captures errors and > then sets the value back to 1 later. For client errors, you need yet > another process which will go around periodically and check for > grant_status = 2, and make sure nobody's actually in the middle of > trying to use them. (You could refine the seq_allocation table by > storing the pid of the allocating back end. Then your maintenance > script could look for such a back end while cleaning up.) > > The savepoints features of 8.0 will make some of this even easier for > you. > > Note that this second method is not completely bulletproof, but it > might be good enough for the cases you want. I have a feeling, > however, that you're creating a new problem for yourself by not being > able to skip sequence values. My bet is that you actually need to > find a better way to solve the "other serious problems" you have > rather than banging on sequences to get them to fit your intended > use. > > A >
On Wed, Oct 20, 2004 at 02:59:27PM -0400, Eric E wrote: > - have the sequence preallocation table hold only numbers with status > being available or pending, i.e., delete numbers once they have been > allocated. This leaves on two possible statuses: available and pending. I would argue that you're best to record everything. That is, your proposal moves from a table which covers all the possible states -- granted, pending, and available -- to a table which entails ambiguous answers to some questions. Since you're not going to preallocate every logically possible id, then if an id isn't in the table, you can't tell if it simply has never been allocated, or if it has already been used. If you have that stored elsewhere, though, you can get it from a join, so perhaps there's no need for this. (I note that a real normalisation freak, like the one I occasionally play on TV, would require you to use a REFERENCES constraint on the status value, and use the referenced table as a control for what status values you can use. This has the not inconsiderable benefit that if you have a new status -- say, "storage burned down" or "impounded by SCO for copyright violation" or something else -- you have a completely trivial way to add it. It's certainly the way I'd actually do this.) > I also liked your point about the atomicity of : > get number, change status to pending, commit The real problem with it is that you do have the possibility of orphaned pending actions. > My thought was that the you could set the status ussing sessionID. That > way a server-side job could look for expired sessions and remark those > numbers available. That's something like what I'd do, yes. It mostly depends on what's available to your application. I tend to be very belt-and-suspenders about this sort of thing. Probably I'd put a wall-clock timestamp on the field, too, to give me clues about when things might be going wrong, &c. > > find a better way to solve the "other serious problems" you have > > rather than banging on sequences to get them to fit your intended > > use. > > I just haven't really seen anyway around the need to use all of our > storage rows that doens't involve a complicated mapping to boxes. I was more concerned that you were trying to do this for invoice numbers, another place where people often require serial numbers. In that case, I usually think they're wrong, because I can think of plenty of better ways to solve that one (unless it's a legal requirement, which is sometimes is). But mapping data points to places in space is one of those cases where you probably _do_ need this sort of preallocation mechanism. It's what hotels do, after all. A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun