Thread: Column with recycled sequence value

Column with recycled sequence value

From

KÖPFERL Robert

Date:

13 January 2005, 18:39:02

Hi,

suppose I have a let's say heavy used table. There's a column containing
UNIQUE in4
values. The data type musn't exceed 32-Bit. Since however the table is heavy
used 2^32 will be reached soon and then? There are far less than 4G-records
saved thus these values may be reused. How can this be accomplished?

I already defined a sequence but then?


I appreciate your help.

Re: Column with recycled sequence value

From

Bruno Wolff III

Date:

13 January 2005, 20:20:13

On Thu, Jan 13, 2005 at 18:08:20 +0100, KÖPFERL Robert <robert.koepferl@sonorys.at> wrote:
> Hi,
> 
> suppose I have a let's say heavy used table. There's a column containing
> UNIQUE in4
> values. The data type musn't exceed 32-Bit. Since however the table is heavy
> used 2^32 will be reached soon and then? There are far less than 4G-records
> saved thus these values may be reused. How can this be accomplished?
> 
> I already defined a sequence but then?

It is possible to let sequences roll over. One possible solution is to just
try to do inserts with the sequence id and have the application retry when
this fails. If the records being kept are almost all recent ones, this should
work pretty well.

Re: Column with recycled sequence value

From

Andrew Sullivan

Date:

13 January 2005, 20:24:36

On Thu, Jan 13, 2005 at 06:08:20PM +0100, KÖPFERL Robert wrote:
> Hi,
> 
> suppose I have a let's say heavy used table. There's a column containing
> UNIQUE in4
> values. The data type musn't exceed 32-Bit. Since however the table is heavy
> used 2^32 will be reached soon and then? There are far less than 4G-records
> saved thus these values may be reused. How can this be accomplished?

You can set the sequence up to cycle (so once it gets to the end, it
wraps around to the beginning again).  The keyword is CYCLE at CREATE
SEQUENCE time.  It defaults to NO CYCLE.

One potential problem, of course, are collisions on the table,
because some value wasn't cleared out.  It sounds like you don't have
that problem though.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
The plural of anecdote is not data.    --Roger Brinner

Re: Column with recycled sequence value

From

Scott Marlowe

Date:

13 January 2005, 21:01:25

On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote:
> Hi,
>
> suppose I have a let's say heavy used table. There's a column containing
> UNIQUE in4
> values. The data type musn't exceed 32-Bit. Since however the table is heavy
> used 2^32 will be reached soon and then? There are far less than 4G-records
> saved thus these values may be reused. How can this be accomplished?
>
> I already defined a sequence but then?

Redefine it as a bigserial?

Re: Column with recycled sequence value

From

Michael Fuhr

Date:

13 January 2005, 21:27:12

On Thu, Jan 13, 2005 at 02:48:47PM -0600, Scott Marlowe wrote:
> On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote:

> > suppose I have a let's say heavy used table. There's a column containing
> > UNIQUE in4
> > values. The data type musn't exceed 32-Bit. Since however the table is heavy
> > used 2^32 will be reached soon and then? There are far less than 4G-records
> > saved thus these values may be reused. How can this be accomplished?
> > 
> > I already defined a sequence but then?
> 
> Redefine it as a bigserial?

That would violate the 32-bit requirement since bigserial would
make the field a bigint (64 bits).  Or am I missing something?

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Column with recycled sequence value

From

Scott Marlowe

Date:

13 January 2005, 21:37:02

On Thu, 2005-01-13 at 15:19, Michael Fuhr wrote:
> On Thu, Jan 13, 2005 at 02:48:47PM -0600, Scott Marlowe wrote:
> > On Thu, 2005-01-13 at 11:08, KÖPFERL Robert wrote:
>
> > > suppose I have a let's say heavy used table. There's a column containing
> > > UNIQUE in4
> > > values. The data type musn't exceed 32-Bit. Since however the table is heavy
> > > used 2^32 will be reached soon and then? There are far less than 4G-records
> > > saved thus these values may be reused. How can this be accomplished?
> > >
> > > I already defined a sequence but then?
> >
> > Redefine it as a bigserial?
>
> That would violate the 32-bit requirement since bigserial would
> make the field a bigint (64 bits).  Or am I missing something?

I wasn't sure if that was a limitation he was facing due to business
rules or if he was referring to the limit in postgresql.

Any method that tries to reuse sequence numbers is a bad idea (TM) and
to be avoided, so my point in asking was if the user didn't know about
bigserial / bigint.

Re: Column with recycled sequence value

From

Andrew Sullivan

Date:

13 January 2005, 21:47:11

On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote:
> Any method that tries to reuse sequence numbers is a bad idea (TM) and

Why?  I can think of a dozen cases where it can be useful.  It just
depends on the application.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
In the future this spectacle of the middle classes shocking the avant-
garde will probably become the textbook definition of Postmodernism.                --Brad Holland

Re: Column with recycled sequence value

From

Michael Fuhr

Date:

13 January 2005, 21:50:08

On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote:
> 
> I wasn't sure if that was a limitation he was facing due to business
> rules or if he was referring to the limit in postgresql.

Gotcha -- I should have asked about the nature of the requirement.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/

Re: Column with recycled sequence value

From

Scott Marlowe

Date:

13 January 2005, 21:57:37

On Thu, 2005-01-13 at 15:43, Andrew Sullivan wrote:
> On Thu, Jan 13, 2005 at 03:31:54PM -0600, Scott Marlowe wrote:
> > Any method that tries to reuse sequence numbers is a bad idea (TM) and
> 
> Why?  I can think of a dozen cases where it can be useful.  It just
> depends on the application.

The usual, if it's a PK of a FK relationship it's possible to have an
orphaned set at some point pointing back to it, race conditions on
trying to find a reusable sequence number, and the poor performance
needed to lock it to reuse it.

What cases are you thinking of?  I've seen some very limited ones, like
using a short range for managing a queue, but that avoids a lot of the
performance issues since it's a small set you're trundling through to
find the next one available.  But they're not that common, and most of
the time someone is thinking of doing so it's because a boss who doesn't
get it wants a pretty list with no holes in the sequence or something
equally silly.

Re: Column with recycled sequence value

From

Andrew Sullivan

Date:

13 January 2005, 22:15:34

On Thu, Jan 13, 2005 at 03:57:04PM -0600, Scott Marlowe wrote:
> What cases are you thinking of?  I've seen some very limited ones, like

Off the top of my head

- legacy application, closed, where you can't fix the source and can't
have larger than 32bit datatype, but you have another way to ensure
no dups.
- the queue-management item you mentioned.
- optimistic cases where a short search range is more important than
that a transaction doesn't fail on insert
- circular number spaces (xid uses this, after all)

> the time someone is thinking of doing so it's because a boss who doesn't
> get it wants a pretty list with no holes in the sequence or something
> equally silly.

Like they have some business problem they need solved, and doing it
this way is ugly but relatively cheap, and doing it the other way
means replacing 4 software systems and retraining 100 people.  Is it
a pretty design?  Probably not.  Is it something that is, of all the
compromises available, the best one under the circumstances?  I
dunno; I'd have to look at the circumstances.  I think it's probably
usually a good idea to avoid this, sure, but I'm not willing to make
it a blanket statement.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
The fact that technology doesn't work is no bar to success in the marketplace.    --Philip Greenspun

Re: Column with recycled sequence value

From

PFC

Date:

14 January 2005, 01:28:37

You could update all the fields which use this sequence number. You say  
you have a lot of activity so you must have mahy holes in your sequence,  
probably of the possible 2^32 values, only a fes millions are used.
You can do the following :
- Take down the database, back it up, and restart it with a single user,  
so only you can connect, using psql.- Create a table :
CREATE TABLE translate ( new_id SERIAL PRIMARY KEY, old_id INTEGER,  
UNIQUE(old_id) ) WITHOUT OIDS;
- Insert into this table all the used sequence values you have in your  
database. If you have all the proper constraints, these should come from  
only one table, so it should be straightformard :

INSERT INTO translate (old_id) SELECT id FROM your_table;
Thus the "translate" table maps old id's to a new sequence that you just  
started, and that means your new id's will be compactly arranged, starting  
at 1.
- Update your existing table, joining it to the translate table, to  
replace the old id by the new id.


> On Thu, Jan 13, 2005 at 06:08:20PM +0100, KÖPFERL Robert wrote:
>> Hi,
>>
>> suppose I have a let's say heavy used table. There's a column containing
>> UNIQUE in4
>> values. The data type musn't exceed 32-Bit. Since however the table is  
>> heavy
>> used 2^32 will be reached soon and then? There are far less than  
>> 4G-records
>> saved thus these values may be reused. How can this be accomplished?
>
> You can set the sequence up to cycle (so once it gets to the end, it
> wraps around to the beginning again).  The keyword is CYCLE at CREATE
> SEQUENCE time.  It defaults to NO CYCLE.
>
> One potential problem, of course, are collisions on the table,
> because some value wasn't cleared out.  It sounds like you don't have
> that problem though.
>
> A
>

Re: Column with recycled sequence value

From

Greg Stark

Date:

14 January 2005, 05:12:24

Andrew Sullivan <ajs@crankycanuck.ca> writes:

> You can set the sequence up to cycle (so once it gets to the end, it
> wraps around to the beginning again).  The keyword is CYCLE at CREATE
> SEQUENCE time.  It defaults to NO CYCLE.
> 
> One potential problem, of course, are collisions on the table,
> because some value wasn't cleared out.  It sounds like you don't have
> that problem though.

Alternatively you can go through the database and make sure all the foreign
keys are declared and marked ON UPDATE CASCADE. Then go through and renumber
all your entries sequentially starting at 1 and reset your sequence. 

I'm not sure this is such a hot idea really. But I don't really like the idea
of letting the sequence wrap around much either. You'll have to consider the
pros and cons of each approach (and of just moving to bigserial too).

If you're going to do this you'll want an index on all the foreign key
columns. That is, the columns referring to this value from other tables.
Otherwise the automatic updates would be very slow.

And will probably want to schedule down-time for this. Otherwise application
code that holds values in local state might get very confused.

I think I would do it with a program that connects and updates each record
individually and commits periodically rather than with a single big update.
Just because I like having control and having things that give me progress
information and can be interrupted without losing work.

Oh, and this won't work if you have any external references to these values
from outside your database. Say if the value is something like a customer
account number that you've previously sent to customers...

-- 
greg

Re: Column with recycled sequence value

From

KÖPFERL Robert

Date:

14 January 2005, 10:14:05


> -----Original Message-----
> From: Andrew Sullivan [mailto:ajs@crankycanuck.ca]
> Sent: Donnerstag, 13. Jänner 2005 20:49
> To: 'pgsql-sql@postgresql.org'
> Subject: Re: [SQL] Column with recycled sequence value
>
...
> > used 2^32 will be reached soon and then? There are far less
> than 4G-records
> > saved thus these values may be reused. How can this be accomplished?
>
> You can set the sequence up to cycle (so once it gets to the end, it
> wraps around to the beginning again).  The keyword is CYCLE at CREATE
> SEQUENCE time.  It defaults to NO CYCLE.
>
> One potential problem, of course, are collisions on the table,
> because some value wasn't cleared out.  It sounds like you don't have
> that problem though.

actually I am asking just because of that.
The thing is, that I am able to synthetisize wonderful SELECTs but I have no
unterstanding of how to first find out wether a record doesn't exist in
order to create it in the same transaction. As user of procedural languages
I'm thinking of variables and so on. But AFAIK SQL has a way to also find a
way without them.
OR I'd like to read that I can't use SQL for this purpose and at minimum
PL/pgSQL would be appropriate for me.


PS. 32-Bit is my bussiness limitation - but also that is thinkable.




thanks