Thread: Column with recycled sequence value
Hi, suppose I have a let's say heavy used table. There's a column containing UNIQUE in4 values. The data type musn't exceed 32-Bit. Since however the table is heavy used 2^32 will be reached soon and then? There are far less than 4G-records saved thus these values may be reused. How can this be accomplished? I already defined a sequence but then? I appreciate your help.
On Thu, Jan 13, 2005 at 18:08:20 +0100, KÖPFERL Robert <robert.koepferl@sonorys.at> wrote: > Hi, > > suppose I have a let's say heavy used table. There's a column containing > UNIQUE in4 > values. The data type musn't exceed 32-Bit. Since however the table is heavy > used 2^32 will be reached soon and then? There are far less than 4G-records > saved thus these values may be reused. How can this be accomplished? > > I already defined a sequence but then? It is possible to let sequences roll over. One possible solution is to just try to do inserts with the sequence id and have the application retry when this fails. If the records being kept are almost all recent ones, this should work pretty well.
On Thu, Jan 13, 2005 at 06:08:20PM +0100, KÖPFERL Robert wrote: > Hi, > > suppose I have a let's say heavy used table. There's a column containing > UNIQUE in4 > values. The data type musn't exceed 32-Bit. Since however the table is heavy > used 2^32 will be reached soon and then? There are far less than 4G-records > saved thus these values may be reused. How can this be accomplished? You can set the sequence up to cycle (so once it gets to the end, it wraps around to the beginning again). The keyword is CYCLE at CREATE SEQUENCE time. It defaults to NO CYCLE. One potential problem, of course, are collisions on the table, because some value wasn't cleared out. It sounds like you don't have that problem though. A -- Andrew Sullivan | ajs@crankycanuck.ca The plural of anecdote is not data. --Roger Brinner
On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote: > Hi, > > suppose I have a let's say heavy used table. There's a column containing > UNIQUE in4 > values. The data type musn't exceed 32-Bit. Since however the table is heavy > used 2^32 will be reached soon and then? There are far less than 4G-records > saved thus these values may be reused. How can this be accomplished? > > I already defined a sequence but then? Redefine it as a bigserial?
On Thu, Jan 13, 2005 at 02:48:47PM -0600, Scott Marlowe wrote: > On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote: > > suppose I have a let's say heavy used table. There's a column containing > > UNIQUE in4 > > values. The data type musn't exceed 32-Bit. Since however the table is heavy > > used 2^32 will be reached soon and then? There are far less than 4G-records > > saved thus these values may be reused. How can this be accomplished? > > > > I already defined a sequence but then? > > Redefine it as a bigserial? That would violate the 32-bit requirement since bigserial would make the field a bigint (64 bits). Or am I missing something? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Thu, 2005-01-13 at 15:19, Michael Fuhr wrote: > On Thu, Jan 13, 2005 at 02:48:47PM -0600, Scott Marlowe wrote: > > On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote: > > > > suppose I have a let's say heavy used table. There's a column containing > > > UNIQUE in4 > > > values. The data type musn't exceed 32-Bit. Since however the table is heavy > > > used 2^32 will be reached soon and then? There are far less than 4G-records > > > saved thus these values may be reused. How can this be accomplished? > > > > > > I already defined a sequence but then? > > > > Redefine it as a bigserial? > > That would violate the 32-bit requirement since bigserial would > make the field a bigint (64 bits). Or am I missing something? I wasn't sure if that was a limitation he was facing due to business rules or if he was referring to the limit in postgresql. Any method that tries to reuse sequence numbers is a bad idea (TM) and to be avoided, so my point in asking was if the user didn't know about bigserial / bigint.
On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote: > Any method that tries to reuse sequence numbers is a bad idea (TM) and Why? I can think of a dozen cases where it can be useful. It just depends on the application. A -- Andrew Sullivan | ajs@crankycanuck.ca In the future this spectacle of the middle classes shocking the avant- garde will probably become the textbook definition of Postmodernism. --Brad Holland
On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote: > > I wasn't sure if that was a limitation he was facing due to business > rules or if he was referring to the limit in postgresql. Gotcha -- I should have asked about the nature of the requirement. -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On Thu, 2005-01-13 at 15:43, Andrew Sullivan wrote: > On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote: > > Any method that tries to reuse sequence numbers is a bad idea (TM) and > > Why? I can think of a dozen cases where it can be useful. It just > depends on the application. The usual, if it's a PK of a FK relationship it's possible to have an orphaned set at some point pointing back to it, race conditions on trying to find a reusable sequence number, and the poor performance needed to lock it to reuse it. What cases are you thinking of? I've seen some very limited ones, like using a short range for managing a queue, but that avoids a lot of the performance issues since it's a small set you're trundling through to find the next one available. But they're not that common, and most of the time someone is thinking of doing so it's because a boss who doesn't get it wants a pretty list with no holes in the sequence or something equally silly.
On Thu, Jan 13, 2005 at 03:57:04PM -0600, Scott Marlowe wrote: > What cases are you thinking of? I've seen some very limited ones, like Off the top of my head - legacy application, closed, where you can't fix the source and can't have larger than 32bit datatype, but you have another way to ensure no dups. - the queue-management item you mentioned. - optimistic cases where a short search range is more important than that a transaction doesn't fail on insert - circular number spaces (xid uses this, after all) > the time someone is thinking of doing so it's because a boss who doesn't > get it wants a pretty list with no holes in the sequence or something > equally silly. Like they have some business problem they need solved, and doing it this way is ugly but relatively cheap, and doing it the other way means replacing 4 software systems and retraining 100 people. Is it a pretty design? Probably not. Is it something that is, of all the compromises available, the best one under the circumstances? I dunno; I'd have to look at the circumstances. I think it's probably usually a good idea to avoid this, sure, but I'm not willing to make it a blanket statement. A -- Andrew Sullivan | ajs@crankycanuck.ca The fact that technology doesn't work is no bar to success in the marketplace. --Philip Greenspun
You could update all the fields which use this sequence number. You say you have a lot of activity so you must have mahy holes in your sequence, probably of the possible 2^32 values, only a fes millions are used. You can do the following : - Take down the database, back it up, and restart it with a single user, so only you can connect, using psql.- Create a table : CREATE TABLE translate ( new_id SERIAL PRIMARY KEY, old_id INTEGER, UNIQUE(old_id) ) WITHOUT OIDS; - Insert into this table all the used sequence values you have in your database. If you have all the proper constraints, these should come from only one table, so it should be straightformard : INSERT INTO translate (old_id) SELECT id FROM your_table; Thus the "translate" table maps old id's to a new sequence that you just started, and that means your new id's will be compactly arranged, starting at 1. - Update your existing table, joining it to the translate table, to replace the old id by the new id. > On Thu, Jan 13, 2005 at 06:08:20PM +0100, KÖPFERL Robert wrote: >> Hi, >> >> suppose I have a let's say heavy used table. There's a column containing >> UNIQUE in4 >> values. The data type musn't exceed 32-Bit. Since however the table is >> heavy >> used 2^32 will be reached soon and then? There are far less than >> 4G-records >> saved thus these values may be reused. How can this be accomplished? > > You can set the sequence up to cycle (so once it gets to the end, it > wraps around to the beginning again). The keyword is CYCLE at CREATE > SEQUENCE time. It defaults to NO CYCLE. > > One potential problem, of course, are collisions on the table, > because some value wasn't cleared out. It sounds like you don't have > that problem though. > > A >
Andrew Sullivan <ajs@crankycanuck.ca> writes: > You can set the sequence up to cycle (so once it gets to the end, it > wraps around to the beginning again). The keyword is CYCLE at CREATE > SEQUENCE time. It defaults to NO CYCLE. > > One potential problem, of course, are collisions on the table, > because some value wasn't cleared out. It sounds like you don't have > that problem though. Alternatively you can go through the database and make sure all the foreign keys are declared and marked ON UPDATE CASCADE. Then go through and renumber all your entries sequentially starting at 1 and reset your sequence. I'm not sure this is such a hot idea really. But I don't really like the idea of letting the sequence wrap around much either. You'll have to consider the pros and cons of each approach (and of just moving to bigserial too). If you're going to do this you'll want an index on all the foreign key columns. That is, the columns referring to this value from other tables. Otherwise the automatic updates would be very slow. And will probably want to schedule down-time for this. Otherwise application code that holds values in local state might get very confused. I think I would do it with a program that connects and updates each record individually and commits periodically rather than with a single big update. Just because I like having control and having things that give me progress information and can be interrupted without losing work. Oh, and this won't work if you have any external references to these values from outside your database. Say if the value is something like a customer account number that you've previously sent to customers... -- greg
> -----Original Message----- > From: Andrew Sullivan [mailto:ajs@crankycanuck.ca] > Sent: Donnerstag, 13. Jänner 2005 20:49 > To: 'pgsql-sql@postgresql.org' > Subject: Re: [SQL] Column with recycled sequence value > ... > > used 2^32 will be reached soon and then? There are far less > than 4G-records > > saved thus these values may be reused. How can this be accomplished? > > You can set the sequence up to cycle (so once it gets to the end, it > wraps around to the beginning again). The keyword is CYCLE at CREATE > SEQUENCE time. It defaults to NO CYCLE. > > One potential problem, of course, are collisions on the table, > because some value wasn't cleared out. It sounds like you don't have > that problem though. actually I am asking just because of that. The thing is, that I am able to synthetisize wonderful SELECTs but I have no unterstanding of how to first find out wether a record doesn't exist in order to create it in the same transaction. As user of procedural languages I'm thinking of variables and so on. But AFAIK SQL has a way to also find a way without them. OR I'd like to read that I can't use SQL for this purpose and at minimum PL/pgSQL would be appropriate for me. PS. 32-Bit is my bussiness limitation - but also that is thinkable. thanks