Thread: Questions about SERIAL type

Questions about SERIAL type

From
reina@nsi.edu (Tony Reina)
Date:
I was thinking of re-designing my database schema to use a SERIAL
value as an indentification across tables (i.e. as a foreign key).
I've been playing with some example tables and have found the
following behavior from SERIAL:

(1) I think SERIAL is defined as an int4. However, the upper bound
seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is
because a generic int4 should have one bit for the sign
(negative/positive). However, shouldn't SERIAL always be a positive
number? Would it be correct to make it some kind of unsigned int4
instead?

(2) The SERIAL number increases even if the transaction was aborted
(e.g. if a repeated tuple were trying to be inserted into a unique
table, the transaction fails, but the SERIAL gets incremented).I was hoping that VACUUM VERBOSE ANALYZE would somehow
reclaimthe
 
lost SERIAL indicies. So, for example, if I had the table:

db02=# select * from center_out order by id;subject | arm | target | rep |     id     
---------+-----+--------+-----+------------F       | L   |      1 |   1 |          1F       | L   |      1 |   2 |
   3F       | L   |     10 |   2 |          4F       | L   |    100 |   2 |     100001F       | L   |    100 |   3 |
10000002F      | L   |    500 |   3 | 2110000001F       | L   |    501 |   3 | 2147483646F       | L   |    502 |   3 |
2147483647
(8 rows)

then a VACUUM VERBOSE ANALYZE would do the following:

db02=# select * from center_out order by id;subject | arm | target | rep |     id     
---------+-----+--------+-----+------------F       | L   |      1 |   1 |          1F       | L   |      1 |   2 |
   2F       | L   |     10 |   2 |          3F       | L   |    100 |   2 |          4F       | L   |    100 |   3 |
     5F       | L   |    500 |   3 |          6F       | L   |    501 |   3 |          7F       | L   |    502 |   3 |
       8
 
(8 rows)

I figure that I should never reach 2^31 - 1 transaction per table even
with many aborted ones; however, I think these would be nice changes.

Comments?

-Tony


Re: Questions about SERIAL type

From
Doug McNaught
Date:
reina@nsi.edu (Tony Reina) writes:

> I was thinking of re-designing my database schema to use a SERIAL
> value as an indentification across tables (i.e. as a foreign key).
> I've been playing with some example tables and have found the
> following behavior from SERIAL:
> 
> (1) I think SERIAL is defined as an int4. However, the upper bound
> seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is
> because a generic int4 should have one bit for the sign
> (negative/positive). However, shouldn't SERIAL always be a positive
> number? Would it be correct to make it some kind of unsigned int4
> instead?

I don't think PG (or the SQL standard) has any concept of unsigned
numbers.  Besides, you can have sequences that have negative values at
some points, and even decrement rather than increment.  Some folks may 
rely on this behavior.

> (2) The SERIAL number increases even if the transaction was aborted
> (e.g. if a repeated tuple were trying to be inserted into a unique
> table, the transaction fails, but the SERIAL gets incremented).
>     I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the
> lost SERIAL indicies. So, for example, if I had the table:

How would this work?  Would the DB have to go through all tables
looking for REFERENCES constraints and update those rows referring to
a renumbered key?  What if you had a referencing column without a
REFERENCES constraint?  What if you had some kind of data external to
the database that relied on those primary keys staying the same?  Not
practical IMHO.

> I figure that I should never reach 2^31 - 1 transaction per table even
> with many aborted ones; however, I think these would be nice changes.

What's going to happen AFAIK is that 64-bit sequences will be
available.  It's unlikely that overflow will be an issue with
those...  ;)

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.  --T. J. Jackson, 1863


Re: Questions about SERIAL type

From
"G. Anthony Reina"
Date:
Doug McNaught wrote:

> I don't think PG (or the SQL standard) has any concept of unsigned
> numbers.  Besides, you can have sequences that have negative values at
> some points, and even decrement rather than increment.  Some folks may
> rely on this behavior.

When I tried setting the current value to -200  I got an error that the
number was outside of the proper range.

db02=# create table test (id SERIAL);
NOTICE:  CREATE TABLE will create implicit sequence 'test_id_seq' for SERIAL
column 'test.id'
NOTICE:  CREATE TABLE/UNIQUE will create implicit index 'test_id_key' for
table 'test'
CREATE

db02=# select setval('test_id_seq', -200);
ERROR:  test_id_seq.setval: value -200 is out of bounds (1,2147483647)

So I'm not sure how people would be using negative values. It looks like from
the documentation that the SERIAL type always increments by 1 so I'm not sure
how they could use decrementing values. Unless, of course, they've changed
the source code to do this. Perhaps I'm missing something here in the
documentation (using PG 7.1.3, maybe 7.2beta has changed this?).


> How would this work?  Would the DB have to go through all tables
> looking for REFERENCES constraints and update those rows referring to
> a renumbered key?  What if you had a referencing column without a
> REFERENCES constraint?  What if you had some kind of data external to
> the database that relied on those primary keys staying the same?  Not
> practical IMHO.
>

Yes, it would have to do this which may be time consuming and possibly
impractical. However, the VACUUM ANALYZE is doing an aweful lot of processing
on the tables and the indicies already.

However, perhaps the other thing to do is to not increment the SERIAL value
on an aborted transaction. I'm not sure why serial has to be incremented if
the transaction fails. Of course, this won't take care of unused SERIAL
numbers when DELETEs occur.

I'm not sure about other database schemas which depend on the SERIAL values
remaining the same for external consistency. You could still use an OID in
that case I should think instead of SERIAL (?)

>
> > I figure that I should never reach 2^31 - 1 transaction per table even
> > with many aborted ones; however, I think these would be nice changes.
>
> What's going to happen AFAIK is that 64-bit sequences will be
> available.  It's unlikely that overflow will be an issue with
> those...  ;)
>

That will definitely make overflow unlikely. Perhaps I'm just being too
paranoid that somehow I'll get to the point where my SERIAL value is maxed
out but I have large gaps from DELETED/UPDATED/ABORTED transactions.

-Tony

db02=# select version();                          version
-------------------------------------------------------------PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC
2.96
(1 row)





Re: Questions about SERIAL type

From
Ned Wolpert
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I'm not 100% sure that you actually want this.  The main reason I say this is
that in most cases I use sequence numbers is to do forign-key relationships. 

If you change sequence numbers on rows in a table, unless all tables that
use that sequence number are also modified, then the relationship between
tables that rely on the sequence number is lost.  If for any reason the
sequence number is used externally, (not usually a good idea, but sometimes
it is) then that relationship is also lost.

And for argument sake, lets assume that we know each location a sequence
number is referenced, so you can make the changes everywhere. (And that these
numbers aren't used for other things like order-numbers that need to appear
in a string format and printed/referenced later) That means that the database
needs to be off-line during this access.  So the modifications to Vacuum to
make it less intrusive to users while its occuring is now lost. 

I don't think this is a good idea...  (Also, does 7.2 have an 8 byte sequence
number (serial8) anyways? So isn't this problem moot?)


On 28-Nov-2001 Tony Reina wrote:
> I was thinking of re-designing my database schema to use a SERIAL
> value as an indentification across tables (i.e. as a foreign key).
> I've been playing with some example tables and have found the
> following behavior from SERIAL:
> 
> (1) I think SERIAL is defined as an int4. However, the upper bound
> seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is
> because a generic int4 should have one bit for the sign
> (negative/positive). However, shouldn't SERIAL always be a positive
> number? Would it be correct to make it some kind of unsigned int4
> instead?
> 
> (2) The SERIAL number increases even if the transaction was aborted
> (e.g. if a repeated tuple were trying to be inserted into a unique
> table, the transaction fails, but the SERIAL gets incremented).
>       I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the
> lost SERIAL indicies. So, for example, if I had the table:
> 
> db02=# select * from center_out order by id;
>  subject | arm | target | rep |     id     
> ---------+-----+--------+-----+------------
>  F       | L   |      1 |   1 |          1
>  F       | L   |      1 |   2 |          3
>  F       | L   |     10 |   2 |          4
>  F       | L   |    100 |   2 |     100001
>  F       | L   |    100 |   3 |   10000002
>  F       | L   |    500 |   3 | 2110000001
>  F       | L   |    501 |   3 | 2147483646
>  F       | L   |    502 |   3 | 2147483647
> (8 rows)
> 
> then a VACUUM VERBOSE ANALYZE would do the following:
> 
> db02=# select * from center_out order by id;
>  subject | arm | target | rep |     id     
> ---------+-----+--------+-----+------------
>  F       | L   |      1 |   1 |          1
>  F       | L   |      1 |   2 |          2
>  F       | L   |     10 |   2 |          3
>  F       | L   |    100 |   2 |          4
>  F       | L   |    100 |   3 |          5
>  F       | L   |    500 |   3 |          6
>  F       | L   |    501 |   3 |          7
>  F       | L   |    502 |   3 |          8
> (8 rows)
> 
> I figure that I should never reach 2^31 - 1 transaction per table even
> with many aborted ones; however, I think these would be nice changes.
> 
> Comments?
> 
> -Tony
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo@postgresql.org so that your
> message can get through to the mailing list cleanly


Virtually, 
Ned Wolpert <ned.wolpert@knowledgenet.com>

D08C2F45:  28E7 56CB 58AC C622 5A51  3C42 8B2B 2739 D08C 2F45 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8BW1siysnOdCML0URAjFqAJ9RJk25zXl/mjhJmjC5tsf4bkj7EQCeNpph
PcrtIXqceZLqdkDOyfAcq84=
=MqDe
-----END PGP SIGNATURE-----


Re: Questions about SERIAL type

From
Doug McNaught
Date:
"G. Anthony Reina" <reina@nsi.edu> writes:

> Doug McNaught wrote:
> 
> > I don't think PG (or the SQL standard) has any concept of unsigned
> > numbers.  Besides, you can have sequences that have negative values at
> > some points, and even decrement rather than increment.  Some folks may
> > rely on this behavior.
> 
> When I tried setting the current value to -200  I got an error that the
> number was outside of the proper range.

You need to specify MINVALUE, MAXVALUE and INCREMENT explicitly to
play tricks like that.  See the docs for CREATE SEQUENCE.  

If you need it, you should be able to create a sequence that uses the
whole range from -2^31 to 2^31-1 with proper arguments to CREATE
SEQUENCE.

> So I'm not sure how people would be using negative values. It looks like from
> the documentation that the SERIAL type always increments by 1 so I'm not sure
> how they could use decrementing values. Unless, of course, they've changed
> the source code to do this. Perhaps I'm missing something here in the
> documentation (using PG 7.1.3, maybe 7.2beta has changed this?).

You didn't read the right part of the docs.  ;)  See CREATE SEQUENCE
in the SQL reference. 

> > How would this work?  Would the DB have to go through all tables
> > looking for REFERENCES constraints and update those rows referring to
> > a renumbered key?  What if you had a referencing column without a
> > REFERENCES constraint?  What if you had some kind of data external to
> > the database that relied on those primary keys staying the same?  Not
> > practical IMHO.
> >
> 
> Yes, it would have to do this which may be time consuming and possibly
> impractical. However, the VACUUM ANALYZE is doing an aweful lot of processing
> on the tables and the indicies already.

I'd be more concerned about the hairiness and maintainability of the
resulting code, actually.  ;)

> However, perhaps the other thing to do is to not increment the SERIAL value
> on an aborted transaction. I'm not sure why serial has to be incremented if
> the transaction fails. Of course, this won't take care of unused SERIAL
> numbers when DELETEs occur.

The reason we don't do it this way is that the sequence object would
have to be locked for the duration of every transaction that used it.
You'd get a lot of contention on that lock and a big slowdown of the
whole system.  And as you say it wouldn't address the DELETE issue. 

> I'm not sure about other database schemas which depend on the SERIAL values
> remaining the same for external consistency. You could still use an OID in
> that case I should think instead of SERIAL (?)

That's worse if anything.  ;)  

> > > I figure that I should never reach 2^31 - 1 transaction per table even
> > > with many aborted ones; however, I think these would be nice changes.
> >
> > What's going to happen AFAIK is that 64-bit sequences will be
> > available.  It's unlikely that overflow will be an issue with
> > those...  ;)
> >
> 
> That will definitely make overflow unlikely. Perhaps I'm just being too
> paranoid that somehow I'll get to the point where my SERIAL value is maxed
> out but I have large gaps from DELETED/UPDATED/ABORTED transactions.

Seriously, I wouldn't worry about it, unless you're incrementing
thousands of times a second, in which case you're in trouble for a lot 
of other reasons...

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.  --T. J. Jackson, 1863


Re: Questions about SERIAL type

From
"G. Anthony Reina"
Date:
Doug McNaught wrote:

> "G. Anthony Reina" <reina@nsi.edu> writes:
>
>
> You need to specify MINVALUE, MAXVALUE and INCREMENT explicitly to
> play tricks like that.  See the docs for CREATE SEQUENCE.
>
>

Ahhh. I found it now. I was looking at the documentation from an older on-line
version of Bruce's book and didn't see the MAXVALUE, MINVALUE stuff. I guess the
default for a serial column is MINVALUE 1, MAX VALUE 2^31-1, INCREMENT +1.


> The reason we don't do it this way is that the sequence object would
> have to be locked for the duration of every transaction that used it.
> You'd get a lot of contention on that lock and a big slowdown of the
> whole system.  And as you say it wouldn't address the DELETE issue.
>

Okay, yes I can see the lock problem now. That makes sense.

>
>
> >
> > That will definitely make overflow unlikely. Perhaps I'm just being too
> > paranoid that somehow I'll get to the point where my SERIAL value is maxed
> > out but I have large gaps from DELETED/UPDATED/ABORTED transactions.
>
> Seriously, I wouldn't worry about it, unless you're incrementing
> thousands of times a second, in which case you're in trouble for a lot
> of other reasons...
>
>

I figured that I was just being overly cautious. 2^31 transactions is quite a lot.
With the move to int8 the point should be moot.

p.s.

Now that I look at the CREATE SEQUENCE documentation, it appears to have a CYCLE
flag which wraps the sequence around if it were to reach the MAXVALUE. Does anyone
know if it wraps around to the next unused value? Or, if an index already exists
at SERIAL value = MINVALUE, then will the INSERT get an error about duplicate
insertions?

-Tony




Re: Questions about SERIAL type

From
Stephan Szabo
Date:
On 28 Nov 2001, Tony Reina wrote:

> I was thinking of re-designing my database schema to use a SERIAL
> value as an indentification across tables (i.e. as a foreign key).
> I've been playing with some example tables and have found the
> following behavior from SERIAL:
>
> (1) I think SERIAL is defined as an int4. However, the upper bound

IIRC in 7.2, there's 8 byte sequences and a serial8 pseudotype that
probably uses a signed int8.

> (2) The SERIAL number increases even if the transaction was aborted
> (e.g. if a repeated tuple were trying to be inserted into a unique
> table, the transaction fails, but the SERIAL gets incremented).

Yeah, the tradeoff was made to go for the concurrency advantage.  If
you need to rollback the sequence value if rollback is performed, you'd
need to wait until it's happened before the next insert would be able
to get the sequence value.

>     I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the
> lost SERIAL indicies. So, for example, if I had the table:

Ick. That sounds really ugly to me. That seems to be outside what
the system can reasonably be expected to handle.  It'd be difficult
to determine the full set of in-database dependencies (say triggers
that do their own sort of integrity checks, views, functions, etc
that may join this field to another table) and probably impossible
to determine out of database ones (printed material, etc...).



Re: Questions about SERIAL type

From
Doug McNaught
Date:
"G. Anthony Reina" <reina@nsi.edu> writes:

> Now that I look at the CREATE SEQUENCE documentation, it appears to
> have a CYCLE flag which wraps the sequence around if it were to
> reach the MAXVALUE. Does anyone know if it wraps around to the next
> unused value? Or, if an index already exists at SERIAL value =
> MINVALUE, then will the INSERT get an error about duplicate
> insertions?

SERIAL columns get a unique index defined, so you'd get an error. 

-Doug
-- 
Let us cross over the river, and rest under the shade of the trees.  --T. J. Jackson, 1863


Re: Questions about SERIAL type

From
Ned Wolpert
Date:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 28-Nov-2001 G. Anthony Reina wrote:
> However, perhaps the other thing to do is to not increment the SERIAL value
> on an aborted transaction. I'm not sure why serial has to be incremented if
> the transaction fails. Of course, this won't take care of unused SERIAL
> numbers when DELETEs occur.

I thought its incremented since the sequence is outside of the transaction. 
That way, if multiple clients are doing inserts using the sequence, one
doesn't have to wait for the other transactions to end before they get a lock
on the sequence.


Virtually, 
Ned Wolpert <ned.wolpert@knowledgenet.com>

D08C2F45:  28E7 56CB 58AC C622 5A51  3C42 8B2B 2739 D08C 2F45 
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.6 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE8BXVOiysnOdCML0URApJnAJ9Z43xFgJRevgoNIQEGYkwkxbAAJACbBopF
N3slqHoAxPq7HkcDaI7FMsY=
=r9mw
-----END PGP SIGNATURE-----


Re: Questions about SERIAL type

From
"Serguei Mokhov"
Date:
----- Original Message ----- 
From: Doug McNaught <doug@wireboard.com>
Sent: Wednesday, November 28, 2001 5:31 PM

> > I figure that I should never reach 2^31 - 1 transaction per table even
> > with many aborted ones; however, I think these would be nice changes.
> 
> What's going to happen AFAIK is that 64-bit sequences will be
> available.  It's unlikely that overflow will be an issue with
> those...  ;)

"640K ought to be enough for everyone!" Gill Bates.

No offense, just an association :)