Thread: Best practice for long-lived journal tables: bigint or recycling IDs?

Best practice for long-lived journal tables: bigint or recycling IDs?

From
Mark Stosberg
Date:
Hello,

I have some tables that continually collect statistics, and then over time are
pruned as the stats are aggregated into more useful formats. 

For some of these tables, it it is fore-seeable that the associated sequences
would be incremented past the max value of the "int" type in the normal course
of things. 

I see two options to prepare for that:

1. Convert the primary keys to "bigint", which should be good enough "forever".
I suppose there would some minor storage and performance penalty. 

2. Reset the sequence at some point. There would no "collisions", because the
older rows would have long been pruned-out. I suppose there is an improbable
edge case in which we restore some old data from tape and then are confused
because some new data has the same IDs, but as I said, these tables are used as
temporary holding locations, not permanent storage. 

Both options have some appeal for me. What have others done?
   Mark

-- . . . . . . . . . . . . . . . . . . . . . . . . . . .   Mark Stosberg            Principal Developer
mark@summersault.com    Summersault, LLC       765-939-9301 ext 202     database driven websites. . . . .
http://www.summersault.com/. . . . . . . .
 




Re: Best practice for long-lived journal tables: bigint or recycling IDs?

From
Alvaro Herrera
Date:
Mark Stosberg wrote:
> 
> Hello,
> 
> I have some tables that continually collect statistics, and then over time are
> pruned as the stats are aggregated into more useful formats. 
> 
> For some of these tables, it it is fore-seeable that the associated sequences
> would be incremented past the max value of the "int" type in the normal course
> of things. 
> 
> I see two options to prepare for that:

3. Deal with wraparound by ensuring that the applications behave sanely

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: Best practice for long-lived journal tables: bigint or recycling IDs?

From
Mark Stosberg
Date:
On Tue, 8 Jul 2008 17:20:13 -0400
Alvaro Herrera <alvherre@commandprompt.com> wrote:

> Mark Stosberg wrote:
> > 
> > Hello,
> > 
> > I have some tables that continually collect statistics, and then over time are
> > pruned as the stats are aggregated into more useful formats. 
> > 
> > For some of these tables, it it is fore-seeable that the associated sequences
> > would be incremented past the max value of the "int" type in the normal course
> > of things. 
> > 
> > I see two options to prepare for that:
> 
> 3. Deal with wraparound by ensuring that the applications behave sanely

Wrap-around?  

Exceeding the max size of "int" looks more like a brick wall than wrap-around to me:
insert into t values (2147483648);ERROR:  integer out of range
   Mark

-- . . . . . . . . . . . . . . . . . . . . . . . . . . .   Mark Stosberg            Principal Developer
mark@summersault.com    Summersault, LLC       765-939-9301 ext 202     database driven websites. . . . .
http://www.summersault.com/. . . . . . . .
 

-- . . . . . . . . . . . . . . . . . . . . . . . . . . .   Mark Stosberg            Principal Developer
mark@summersault.com    Summersault, LLC       765-939-9301 ext 202     database driven websites. . . . .
http://www.summersault.com/. . . . . . . .
 




Re: Re: Best practice for long-lived journal tables: bigint or recycling IDs?

From
Alvaro Herrera
Date:
Mark Stosberg wrote:
> On Tue, 8 Jul 2008 17:20:13 -0400
> Alvaro Herrera <alvherre@commandprompt.com> wrote:
> 
> > 3. Deal with wraparound by ensuring that the applications behave sanely
> 
> Wrap-around?  
> 
> Exceeding the max size of "int" looks more like a brick wall than wrap-around to me:
> 
>  insert into t values (2147483648);
>  ERROR:  integer out of range

Hmm, you can alter the sequence so that it wraps around at the point it
reaches INT_MAX.  So inserting this number would never actually happen.

alvherre=# create table t (a serial);
NOTICE:  CREATE TABLE créera des séquences implicites « t_a_seq » pour la colonne serial « t.a »
CREATE TABLE
alvherre=# alter sequence t_a_seq maxvalue 2147483647;
ALTER SEQUENCE
alvherre=# alter sequence t_a_seq cycle;
ALTER SEQUENCE
alvherre=# select setval('t_a_seq', 2147483645);  setval   
------------2147483645
(1 ligne)

alvherre=# insert into t default values;
INSERT 0 1
alvherre=# insert into t default values;
INSERT 0 1
alvherre=# insert into t default values;
INSERT 0 1
alvherre=# insert into t default values;
INSERT 0 1
alvherre=# insert into t default values;
INSERT 0 1
alvherre=# select * from t;    a      
------------21474836462147483647         1         2         3
(5 lignes)


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: Best practice for long-lived journal tables: bigint or recycling IDs?

From
Mark Stosberg
Date:
> > > 3. Deal with wraparound by ensuring that the applications behave sanely
> >
> > Wrap-around?
> >
> > Exceeding the max size of "int" looks more like a brick wall than wrap-around to me:
> >
> >  insert into t values (2147483648);
> >  ERROR:  integer out of range
>
> Hmm, you can alter the sequence so that it wraps around at the point it
> reaches INT_MAX.  So inserting this number would never actually happen.

Ah, that does look like the best solution. I'll confirm that will work for our cases.

Thanks!
   Mark


> alvherre=# create table t (a serial);
> NOTICE:  CREATE TABLE créera des séquences implicites « t_a_seq » pour la colonne serial « t.a »
> CREATE TABLE
> alvherre=# alter sequence t_a_seq maxvalue 2147483647;
> ALTER SEQUENCE
> alvherre=# alter sequence t_a_seq cycle;
> ALTER SEQUENCE
> alvherre=# select setval('t_a_seq', 2147483645);
>    setval
> ------------
>  2147483645
> (1 ligne)
>
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# insert into t default values;
> INSERT 0 1
> alvherre=# select * from t;
>      a
> ------------
>  2147483646
>  2147483647
>           1
>           2
>           3
> (5 lignes)