Thread: Performance implications of creating many, many sequences
Consider the following table: CREATE TABLE invoice ( account_id integer NOT NULL REFERENCES account, invoice_number integer NOT NULL, UNIQUE (account_id,invoice_number)); I would like to do the equivalent of making invoice_number a serial type, but on a per-account basis. Would it be a reasonableapproach to create a separate sequence for each individual account? Are there performance implications I shouldknow about, given that there will be hundreds of thousands of accounts? Is there another approach I should be lookingat instead?
Is this "invoice_number" just an id or what might appear an a bill (in some pretty form etc)? If the former, just get a unique id over all invoices. At the very least it will save time i) in writing where clauses ii) re-creating the correct id once some one assigns an invoice to the wrong customer. On 10/22/2010 01:18 PM, Michael Gardner wrote: > Consider the following table: > > CREATE TABLE invoice ( > account_id integer NOT NULL REFERENCES account, > invoice_number integer NOT NULL, > UNIQUE (account_id,invoice_number)); > > I would like to do the equivalent of making invoice_number a serial type, but on a per-account basis. Would it be a reasonableapproach to create a separate sequence for each individual account? Are there performance implications I shouldknow about, given that there will be hundreds of thousands of accounts? Is there another approach I should be lookingat instead?
On Oct 22, 2010, at 2:50 PM, Rob Sargent wrote: > Is this "invoice_number" just an id or what might appear an a bill (in > some pretty form etc)? It will appear on actual invoices, as part of a compound invoice identifier (like ABCD-0042, where ABCD is an identifierfor the account in question and 42 is the invoice number). > If the former, just get a unique id over all invoices. At the very > least it will save time i) in writing where clauses ii) re-creating the > correct id once some one assigns an invoice to the wrong customer. There will be such an ID, but I do not want to show it directly to users. I want an invoice number independent of whateversurrogate key the database happens to use to uniquely identify rows. (Sorry if you receive this twice, Rob. I accidentally replied to you instead of the list the first time.)
On 23/10/2010 3:18 AM, Michael Gardner wrote: > Consider the following table: > > CREATE TABLE invoice ( > account_id integer NOT NULL REFERENCES account, > invoice_number integer NOT NULL, > UNIQUE (account_id,invoice_number)); > > I would like to do the equivalent of making invoice_number a serial type, but on a per-account basis. Would it be a reasonableapproach to create a separate sequence for each individual account? Are there performance implications I shouldknow about, given that there will be hundreds of thousands of accounts? Is there another approach I should be lookingat instead? I'd favour "another approach". The purpose of sequences is to improve concurrency in the face of rapid inserts. You're probably not going to have a bunch of transactions all wanting to grab new invoice numbers for the same customer at the same time, so this isn't going to be a problem; you're already partitioning the concurrency limitation out to be per-customer, which should be good enough. Also, sequences are not gapless. If a transaction grabs an entry then rolls back instead of committing, that ID is never used. Your customers would probably not like that for invoice numbers. Instead, maintain a counter, either in the main customer record or in an associated (customer_id, counter) side table if you want to reduce potential lock contention. Write a simple SQL function that uses an UPDATE ... RETURNING statement to grab a new ID from the counter and increment it. Use that function instead of 'nextval(seqname)' when you want an ID. The UPDATE will take a lock out on the customer row (or side-table row if you did it that way) that'll prevent anyone else updating it until the transaction commits or rolls back. -- Craig Ringer Tech-related writing at http://soapyfrogs.blogspot.com/
On Oct 22, 2010, at 11:03 PM, Craig Ringer wrote: > Instead, maintain a counter, either in the main customer record or in an associated (customer_id, counter) side table ifyou want to reduce potential lock contention. Write a simple SQL function that uses an UPDATE ... RETURNING statement tograb a new ID from the counter and increment it. Use that function instead of 'nextval(seqname)' when you want an ID. TheUPDATE will take a lock out on the customer row (or side-table row if you did it that way) that'll prevent anyone elseupdating it until the transaction commits or rolls back. Thanks for the suggestion. It seems like there should be a safe way to use max() instead of a separate counter though, aslong as I can guarantee that invoice numbers never change and invoices are never deleted. Right?
On 10/24/2010 12:42 AM, Michael Gardner wrote: > On Oct 22, 2010, at 11:03 PM, Craig Ringer wrote: > >> Instead, maintain a counter, either in the main customer record or in an associated (customer_id, counter) side tableif you want to reduce potential lock contention. Write a simple SQL function that uses an UPDATE ... RETURNING statementto grab a new ID from the counter and increment it. Use that function instead of 'nextval(seqname)' when you wantan ID. The UPDATE will take a lock out on the customer row (or side-table row if you did it that way) that'll preventanyone else updating it until the transaction commits or rolls back. > > Thanks for the suggestion. It seems like there should be a safe way to use max() instead of a separate counter though,as long as I can guarantee that invoice numbers never change and invoices are never deleted. Right? True. You'll then have to provide your own locking (say, SELECT ... FOR UPDATE on the customer record) to ensure that no two invoices are allocated the same number, though. If you use UPDATE ... RETURNING on a counter field the locking is done for you. You'll have a UNIQUE(customer_id,order_id) constraint in place anyway, of course, so you won't have the risk of genuinely duplicate IDs, just the need to retry a transaction that fails due to a duplicate key error if two invoice creations on a customer happen concurrently. Perhaps it's too unlikely to care about, but I just dislike using max(x)+1 on principle, as it's just a generally unsafe sql programming idiom. -- Craig Ringer