Thread: Questions about SERIAL type
I was thinking of re-designing my database schema to use a SERIAL value as an indentification across tables (i.e. as a foreign key). I've been playing with some example tables and have found the following behavior from SERIAL: (1) I think SERIAL is defined as an int4. However, the upper bound seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is because a generic int4 should have one bit for the sign (negative/positive). However, shouldn't SERIAL always be a positive number? Would it be correct to make it some kind of unsigned int4 instead? (2) The SERIAL number increases even if the transaction was aborted (e.g. if a repeated tuple were trying to be inserted into a unique table, the transaction fails, but the SERIAL gets incremented).I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaimthe lost SERIAL indicies. So, for example, if I had the table: db02=# select * from center_out order by id;subject | arm | target | rep | id ---------+-----+--------+-----+------------F | L | 1 | 1 | 1F | L | 1 | 2 | 3F | L | 10 | 2 | 4F | L | 100 | 2 | 100001F | L | 100 | 3 | 10000002F | L | 500 | 3 | 2110000001F | L | 501 | 3 | 2147483646F | L | 502 | 3 | 2147483647 (8 rows) then a VACUUM VERBOSE ANALYZE would do the following: db02=# select * from center_out order by id;subject | arm | target | rep | id ---------+-----+--------+-----+------------F | L | 1 | 1 | 1F | L | 1 | 2 | 2F | L | 10 | 2 | 3F | L | 100 | 2 | 4F | L | 100 | 3 | 5F | L | 500 | 3 | 6F | L | 501 | 3 | 7F | L | 502 | 3 | 8 (8 rows) I figure that I should never reach 2^31 - 1 transaction per table even with many aborted ones; however, I think these would be nice changes. Comments? -Tony
reina@nsi.edu (Tony Reina) writes: > I was thinking of re-designing my database schema to use a SERIAL > value as an indentification across tables (i.e. as a foreign key). > I've been playing with some example tables and have found the > following behavior from SERIAL: > > (1) I think SERIAL is defined as an int4. However, the upper bound > seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is > because a generic int4 should have one bit for the sign > (negative/positive). However, shouldn't SERIAL always be a positive > number? Would it be correct to make it some kind of unsigned int4 > instead? I don't think PG (or the SQL standard) has any concept of unsigned numbers. Besides, you can have sequences that have negative values at some points, and even decrement rather than increment. Some folks may rely on this behavior. > (2) The SERIAL number increases even if the transaction was aborted > (e.g. if a repeated tuple were trying to be inserted into a unique > table, the transaction fails, but the SERIAL gets incremented). > I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the > lost SERIAL indicies. So, for example, if I had the table: How would this work? Would the DB have to go through all tables looking for REFERENCES constraints and update those rows referring to a renumbered key? What if you had a referencing column without a REFERENCES constraint? What if you had some kind of data external to the database that relied on those primary keys staying the same? Not practical IMHO. > I figure that I should never reach 2^31 - 1 transaction per table even > with many aborted ones; however, I think these would be nice changes. What's going to happen AFAIK is that 64-bit sequences will be available. It's unlikely that overflow will be an issue with those... ;) -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
Doug McNaught wrote: > I don't think PG (or the SQL standard) has any concept of unsigned > numbers. Besides, you can have sequences that have negative values at > some points, and even decrement rather than increment. Some folks may > rely on this behavior. When I tried setting the current value to -200 I got an error that the number was outside of the proper range. db02=# create table test (id SERIAL); NOTICE: CREATE TABLE will create implicit sequence 'test_id_seq' for SERIAL column 'test.id' NOTICE: CREATE TABLE/UNIQUE will create implicit index 'test_id_key' for table 'test' CREATE db02=# select setval('test_id_seq', -200); ERROR: test_id_seq.setval: value -200 is out of bounds (1,2147483647) So I'm not sure how people would be using negative values. It looks like from the documentation that the SERIAL type always increments by 1 so I'm not sure how they could use decrementing values. Unless, of course, they've changed the source code to do this. Perhaps I'm missing something here in the documentation (using PG 7.1.3, maybe 7.2beta has changed this?). > How would this work? Would the DB have to go through all tables > looking for REFERENCES constraints and update those rows referring to > a renumbered key? What if you had a referencing column without a > REFERENCES constraint? What if you had some kind of data external to > the database that relied on those primary keys staying the same? Not > practical IMHO. > Yes, it would have to do this which may be time consuming and possibly impractical. However, the VACUUM ANALYZE is doing an aweful lot of processing on the tables and the indicies already. However, perhaps the other thing to do is to not increment the SERIAL value on an aborted transaction. I'm not sure why serial has to be incremented if the transaction fails. Of course, this won't take care of unused SERIAL numbers when DELETEs occur. I'm not sure about other database schemas which depend on the SERIAL values remaining the same for external consistency. You could still use an OID in that case I should think instead of SERIAL (?) > > > I figure that I should never reach 2^31 - 1 transaction per table even > > with many aborted ones; however, I think these would be nice changes. > > What's going to happen AFAIK is that 64-bit sequences will be > available. It's unlikely that overflow will be an issue with > those... ;) > That will definitely make overflow unlikely. Perhaps I'm just being too paranoid that somehow I'll get to the point where my SERIAL value is maxed out but I have large gaps from DELETED/UPDATED/ABORTED transactions. -Tony db02=# select version(); version -------------------------------------------------------------PostgreSQL 7.1.3 on i686-pc-linux-gnu, compiled by GCC 2.96 (1 row)
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I'm not 100% sure that you actually want this. The main reason I say this is that in most cases I use sequence numbers is to do forign-key relationships. If you change sequence numbers on rows in a table, unless all tables that use that sequence number are also modified, then the relationship between tables that rely on the sequence number is lost. If for any reason the sequence number is used externally, (not usually a good idea, but sometimes it is) then that relationship is also lost. And for argument sake, lets assume that we know each location a sequence number is referenced, so you can make the changes everywhere. (And that these numbers aren't used for other things like order-numbers that need to appear in a string format and printed/referenced later) That means that the database needs to be off-line during this access. So the modifications to Vacuum to make it less intrusive to users while its occuring is now lost. I don't think this is a good idea... (Also, does 7.2 have an 8 byte sequence number (serial8) anyways? So isn't this problem moot?) On 28-Nov-2001 Tony Reina wrote: > I was thinking of re-designing my database schema to use a SERIAL > value as an indentification across tables (i.e. as a foreign key). > I've been playing with some example tables and have found the > following behavior from SERIAL: > > (1) I think SERIAL is defined as an int4. However, the upper bound > seems to be 2^31 - 1 (217483647) not 2^32 - 1. I suppose this is > because a generic int4 should have one bit for the sign > (negative/positive). However, shouldn't SERIAL always be a positive > number? Would it be correct to make it some kind of unsigned int4 > instead? > > (2) The SERIAL number increases even if the transaction was aborted > (e.g. if a repeated tuple were trying to be inserted into a unique > table, the transaction fails, but the SERIAL gets incremented). > I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the > lost SERIAL indicies. So, for example, if I had the table: > > db02=# select * from center_out order by id; > subject | arm | target | rep | id > ---------+-----+--------+-----+------------ > F | L | 1 | 1 | 1 > F | L | 1 | 2 | 3 > F | L | 10 | 2 | 4 > F | L | 100 | 2 | 100001 > F | L | 100 | 3 | 10000002 > F | L | 500 | 3 | 2110000001 > F | L | 501 | 3 | 2147483646 > F | L | 502 | 3 | 2147483647 > (8 rows) > > then a VACUUM VERBOSE ANALYZE would do the following: > > db02=# select * from center_out order by id; > subject | arm | target | rep | id > ---------+-----+--------+-----+------------ > F | L | 1 | 1 | 1 > F | L | 1 | 2 | 2 > F | L | 10 | 2 | 3 > F | L | 100 | 2 | 4 > F | L | 100 | 3 | 5 > F | L | 500 | 3 | 6 > F | L | 501 | 3 | 7 > F | L | 502 | 3 | 8 > (8 rows) > > I figure that I should never reach 2^31 - 1 transaction per table even > with many aborted ones; however, I think these would be nice changes. > > Comments? > > -Tony > > ---------------------------(end of broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly Virtually, Ned Wolpert <ned.wolpert@knowledgenet.com> D08C2F45: 28E7 56CB 58AC C622 5A51 3C42 8B2B 2739 D08C 2F45 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8BW1siysnOdCML0URAjFqAJ9RJk25zXl/mjhJmjC5tsf4bkj7EQCeNpph PcrtIXqceZLqdkDOyfAcq84= =MqDe -----END PGP SIGNATURE-----
"G. Anthony Reina" <reina@nsi.edu> writes: > Doug McNaught wrote: > > > I don't think PG (or the SQL standard) has any concept of unsigned > > numbers. Besides, you can have sequences that have negative values at > > some points, and even decrement rather than increment. Some folks may > > rely on this behavior. > > When I tried setting the current value to -200 I got an error that the > number was outside of the proper range. You need to specify MINVALUE, MAXVALUE and INCREMENT explicitly to play tricks like that. See the docs for CREATE SEQUENCE. If you need it, you should be able to create a sequence that uses the whole range from -2^31 to 2^31-1 with proper arguments to CREATE SEQUENCE. > So I'm not sure how people would be using negative values. It looks like from > the documentation that the SERIAL type always increments by 1 so I'm not sure > how they could use decrementing values. Unless, of course, they've changed > the source code to do this. Perhaps I'm missing something here in the > documentation (using PG 7.1.3, maybe 7.2beta has changed this?). You didn't read the right part of the docs. ;) See CREATE SEQUENCE in the SQL reference. > > How would this work? Would the DB have to go through all tables > > looking for REFERENCES constraints and update those rows referring to > > a renumbered key? What if you had a referencing column without a > > REFERENCES constraint? What if you had some kind of data external to > > the database that relied on those primary keys staying the same? Not > > practical IMHO. > > > > Yes, it would have to do this which may be time consuming and possibly > impractical. However, the VACUUM ANALYZE is doing an aweful lot of processing > on the tables and the indicies already. I'd be more concerned about the hairiness and maintainability of the resulting code, actually. ;) > However, perhaps the other thing to do is to not increment the SERIAL value > on an aborted transaction. I'm not sure why serial has to be incremented if > the transaction fails. Of course, this won't take care of unused SERIAL > numbers when DELETEs occur. The reason we don't do it this way is that the sequence object would have to be locked for the duration of every transaction that used it. You'd get a lot of contention on that lock and a big slowdown of the whole system. And as you say it wouldn't address the DELETE issue. > I'm not sure about other database schemas which depend on the SERIAL values > remaining the same for external consistency. You could still use an OID in > that case I should think instead of SERIAL (?) That's worse if anything. ;) > > > I figure that I should never reach 2^31 - 1 transaction per table even > > > with many aborted ones; however, I think these would be nice changes. > > > > What's going to happen AFAIK is that 64-bit sequences will be > > available. It's unlikely that overflow will be an issue with > > those... ;) > > > > That will definitely make overflow unlikely. Perhaps I'm just being too > paranoid that somehow I'll get to the point where my SERIAL value is maxed > out but I have large gaps from DELETED/UPDATED/ABORTED transactions. Seriously, I wouldn't worry about it, unless you're incrementing thousands of times a second, in which case you're in trouble for a lot of other reasons... -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
Doug McNaught wrote: > "G. Anthony Reina" <reina@nsi.edu> writes: > > > You need to specify MINVALUE, MAXVALUE and INCREMENT explicitly to > play tricks like that. See the docs for CREATE SEQUENCE. > > Ahhh. I found it now. I was looking at the documentation from an older on-line version of Bruce's book and didn't see the MAXVALUE, MINVALUE stuff. I guess the default for a serial column is MINVALUE 1, MAX VALUE 2^31-1, INCREMENT +1. > The reason we don't do it this way is that the sequence object would > have to be locked for the duration of every transaction that used it. > You'd get a lot of contention on that lock and a big slowdown of the > whole system. And as you say it wouldn't address the DELETE issue. > Okay, yes I can see the lock problem now. That makes sense. > > > > > > That will definitely make overflow unlikely. Perhaps I'm just being too > > paranoid that somehow I'll get to the point where my SERIAL value is maxed > > out but I have large gaps from DELETED/UPDATED/ABORTED transactions. > > Seriously, I wouldn't worry about it, unless you're incrementing > thousands of times a second, in which case you're in trouble for a lot > of other reasons... > > I figured that I was just being overly cautious. 2^31 transactions is quite a lot. With the move to int8 the point should be moot. p.s. Now that I look at the CREATE SEQUENCE documentation, it appears to have a CYCLE flag which wraps the sequence around if it were to reach the MAXVALUE. Does anyone know if it wraps around to the next unused value? Or, if an index already exists at SERIAL value = MINVALUE, then will the INSERT get an error about duplicate insertions? -Tony
On 28 Nov 2001, Tony Reina wrote: > I was thinking of re-designing my database schema to use a SERIAL > value as an indentification across tables (i.e. as a foreign key). > I've been playing with some example tables and have found the > following behavior from SERIAL: > > (1) I think SERIAL is defined as an int4. However, the upper bound IIRC in 7.2, there's 8 byte sequences and a serial8 pseudotype that probably uses a signed int8. > (2) The SERIAL number increases even if the transaction was aborted > (e.g. if a repeated tuple were trying to be inserted into a unique > table, the transaction fails, but the SERIAL gets incremented). Yeah, the tradeoff was made to go for the concurrency advantage. If you need to rollback the sequence value if rollback is performed, you'd need to wait until it's happened before the next insert would be able to get the sequence value. > I was hoping that VACUUM VERBOSE ANALYZE would somehow reclaim the > lost SERIAL indicies. So, for example, if I had the table: Ick. That sounds really ugly to me. That seems to be outside what the system can reasonably be expected to handle. It'd be difficult to determine the full set of in-database dependencies (say triggers that do their own sort of integrity checks, views, functions, etc that may join this field to another table) and probably impossible to determine out of database ones (printed material, etc...).
"G. Anthony Reina" <reina@nsi.edu> writes: > Now that I look at the CREATE SEQUENCE documentation, it appears to > have a CYCLE flag which wraps the sequence around if it were to > reach the MAXVALUE. Does anyone know if it wraps around to the next > unused value? Or, if an index already exists at SERIAL value = > MINVALUE, then will the INSERT get an error about duplicate > insertions? SERIAL columns get a unique index defined, so you'd get an error. -Doug -- Let us cross over the river, and rest under the shade of the trees. --T. J. Jackson, 1863
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On 28-Nov-2001 G. Anthony Reina wrote: > However, perhaps the other thing to do is to not increment the SERIAL value > on an aborted transaction. I'm not sure why serial has to be incremented if > the transaction fails. Of course, this won't take care of unused SERIAL > numbers when DELETEs occur. I thought its incremented since the sequence is outside of the transaction. That way, if multiple clients are doing inserts using the sequence, one doesn't have to wait for the other transactions to end before they get a lock on the sequence. Virtually, Ned Wolpert <ned.wolpert@knowledgenet.com> D08C2F45: 28E7 56CB 58AC C622 5A51 3C42 8B2B 2739 D08C 2F45 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQE8BXVOiysnOdCML0URApJnAJ9Z43xFgJRevgoNIQEGYkwkxbAAJACbBopF N3slqHoAxPq7HkcDaI7FMsY= =r9mw -----END PGP SIGNATURE-----
----- Original Message ----- From: Doug McNaught <doug@wireboard.com> Sent: Wednesday, November 28, 2001 5:31 PM > > I figure that I should never reach 2^31 - 1 transaction per table even > > with many aborted ones; however, I think these would be nice changes. > > What's going to happen AFAIK is that 64-bit sequences will be > available. It's unlikely that overflow will be an issue with > those... ;) "640K ought to be enough for everyone!" Gill Bates. No offense, just an association :)