Thread: unique row identifier data type exhausted . . .
It feels like there should be some *really* obvious answer to this question, and I'll find myself whacking my forehead in self-abasement and out of sheer relief to have found the answer to a problem that should not have bothered me in the first place since the answer is too self-evident . . . however, it is bothering me: what happens if the data type that you've chosen to uniquely identify a row is exhausted? If, for instance you use int4 and you've had your couple billion deletes and inserts on the table and the next nextval('seq') . . . well, what exactly happens and how do they do it? Admittedly, 10^9 is a big number but it is far from out of the question that you'd reach it on a really busy database (can't think of a real-world example but that ought to be a moot point), not to mention oids since they are unique across an entire database. Cheers Frank -- frank joerdens joerdens new media heinrich-roller str. 16/17 10405 berlin germany e: frank@joerdens.de t: +49 (0)30 44055471 f: +49 (0)30 44055475 h: http://www.joerdens.de pgp public key: http://www.joerdens.de/pgp/frank_joerdens.asc
> It feels like there should be some *really* obvious answer to this > question, and I'll find myself whacking my forehead in self-abasement > and out of sheer relief to have found the answer to a problem that > should not have bothered me in the first place since the answer is too > self-evident . . . however, it is bothering me: what happens if the data > type that you've chosen to uniquely identify a row is exhausted? If, for > instance you use int4 and you've had your couple billion deletes and > inserts on the table and the next nextval('seq') . . . well, what > exactly happens and how do they do it? Admittedly, 10^9 is a big number > but it is far from out of the question that you'd reach it on a really > busy database (can't think of a real-world example but that ought to be > a moot point), not to mention oids since they are unique across an > entire database. I am curious to know how difficult it would be (if at all) to change the type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s any more! - Andrew
Frank Joerdens wrote: > It feels like there should be some *really* obvious answer to this > question, and I'll find myself whacking my forehead in self-abasement > and out of sheer relief to have found the answer to a problem that > should not have bothered me in the first place since the answer is too > self-evident . . . however, it is bothering me: what happens if the data > type that you've chosen to uniquely identify a row is exhausted? If, for > instance you use int4 and you've had your couple billion deletes and > inserts on the table and the next nextval('seq') . . . well, what > exactly happens and how do they do it? Admittedly, 10^9 is a big number > but it is far from out of the question that you'd reach it on a really > busy database (can't think of a real-world example but that ought to be > a moot point), not to mention oids since they are unique across an > entire database. This is the command to create a sequence : CREATE SEQUENCE seqname [ INCREMENT increment ] [ MINVALUE minvalue ] [ MAXVALUE maxvalue ] [ START start ] [ CACHE cache ] [ CYCLE ] The CYCLE option restarts the sequence automatically upon overflow. Of course, your analysis of the database should point out that it is possible to do this. Jurgen Defurne defurnj@glo.be
> > > It feels like there should be some *really* obvious answer to this > > question, and I'll find myself whacking my forehead in self-abasement > > and out of sheer relief to have found the answer to a problem that > > should not have bothered me in the first place since the answer is too > > self-evident . . . however, it is bothering me: what happens if the data > > type that you've chosen to uniquely identify a row is exhausted? If, for > > instance you use int4 and you've had your couple billion deletes and > > inserts on the table and the next nextval('seq') . . . well, what > > exactly happens and how do they do it? Admittedly, 10^9 is a big number > > but it is far from out of the question that you'd reach it on a really > > busy database (can't think of a real-world example but that ought to be > > a moot point), not to mention oids since they are unique across an > > entire database. > > I am curious to know how difficult it would be (if at all) to change the > type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s > any more! When we are sure all platforms support 64-bit int's, we will move in that direction. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
> When we are sure all platforms support 64-bit int's, we will move in > that direction. Sorry if this is a stupid question, but couldn't you fairly easily make it an option at compile time? To use either 32 or 64 bit OID's. (And, less importantly, for sequences) - Andrew
> > > When we are sure all platforms support 64-bit int's, we will move in > > that direction. > > Sorry if this is a stupid question, but couldn't you fairly easily make it > an option at compile time? To use either 32 or 64 bit OID's. > (And, less importantly, for sequences) Well, we could but then all the API would change, and internally there would be changes. No one has really asked for it vs. the work in doing it. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Mon, 24 Apr 2000, Andrew Snow wrote: > > When we are sure all platforms support 64-bit int's, we will move in > > that direction. > > Sorry if this is a stupid question, but couldn't you fairly easily make it > an option at compile time? To use either 32 or 64 bit OID's. > (And, less importantly, for sequences) Is this necessarily a good solution? If you use 64-bit OIDs, some joker will just hook up a several-terra-byte disk array to his machine, try to store the location of every molecule in the universe and break it. Admittedly, ~2x10^20 is a very large number, but that's what they thought about 2000, also... What I'm saying is, is there a better way of doing this? Cheers -- Tom Cook - Software Engineer "Christ died for sin, the righteous for the unrighteous, to bring you to God." - 1 Peter 3:18 LISAcorp - www.lisa.com.au -------------------------------------------------- 38 Greenhill Rd. Level 3, 228 Pitt Street Wayville, SA, 5034 Sydney, NSW, 2000 Phone: +61 8 8272 1555 Phone: +61 2 9283 0877 Fax: +61 8 8271 1199 Fax: +61 2 9283 0866 --------------------------------------------------
For me it seems more a design problem than the length of internal number.... Why not create a table with 2 field containing int, setting the primary key on both of them and running a sequencing scheme on both as if it was a single number.... There are no more limitations anymore, as if you know you will need a lot of record you may decide to use 1, 2, or 3 numbers... Question, does postgress is able to run sequence on 2 combined numbers... Cheers... Franck Martin Network and Database Development Officer SOPAC South Pacific Applied Geoscience Commission Fiji E-mail: franck@sopac.org.fj <mailto:franck@sopac.org.fj> Web site: www.sopac.org.fj <http://www.sopac.org.fj> -----Original Message----- From: Tom Cook [mailto:tcook@lisa.com.au] Sent: Wednesday, April 26, 2000 11:49 AM To: Pgsql-General@Postgresql. Org Subject: RE: [GENERAL] unique row identifier data type exhausted . . . On Mon, 24 Apr 2000, Andrew Snow wrote: > > When we are sure all platforms support 64-bit int's, we will move in > > that direction. > > Sorry if this is a stupid question, but couldn't you fairly easily make it > an option at compile time? To use either 32 or 64 bit OID's. > (And, less importantly, for sequences) Is this necessarily a good solution? If you use 64-bit OIDs, some joker will just hook up a several-terra-byte disk array to his machine, try to store the location of every molecule in the universe and break it. Admittedly, ~2x10^20 is a very large number, but that's what they thought about 2000, also... What I'm saying is, is there a better way of doing this?
> Is this necessarily a good solution? If you use 64-bit OIDs, some joker > will just hook up a several-terra-byte disk array to his machine, try to > store the location of every molecule in the universe and break it. If you have to have OIDs at all, its a lot better than a 32 bit number. I think it would be easier to switch to 64 bit OIDs than ditch them completely. The "serial" type should definitely be 64 bit. To make matters worse I believe its really only a 31 bit number as the plus/minus symbol is discarded. But I think moving to 64 bit will take place soon enough, when it needs to, and it should shut everyone up. > "Christ died for sin, the righteous for the unrighteous, to bring > you to God." > - 1 Peter 3:18 Is it just me, or are there a lot of Christian PostgreSQL users? ;-) (I'm not knocking it, I'm one too..) - Andrew
> > > > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > > will just hook up a several-terra-byte disk array to his machine, try to > > store the location of every molecule in the universe and break it. > > If you have to have OIDs at all, its a lot better than a 32 bit number. I > think it would be easier to switch to 64 bit OIDs than ditch them > completely. > The "serial" type should definitely be 64 bit. To make matters worse I > believe its really only a 31 bit number as the plus/minus symbol is > discarded. But I think moving to 64 bit will take place soon enough, when > it needs to, and it should shut everyone up. If you look at that TODO list, oid's flowing over 32-bits is not something we are losing sleep over. In fact, the first fix would be to make sure oid's are truly treated as unsigned int's, thereby doubling their range. I have done some of those myself, but I am sure there are more areas that need fixing. Illustra's solution was to use two int32's, making the upper 32-bit value represent the site, so oid's remain unique as they move between sites. If we picked a random 32-bit oid on initdb startup, that would pretty much make them unique all the time. > > > "Christ died for sin, the righteous for the unrighteous, to bring > > you to God." > > - 1 Peter 3:18 > > Is it just me, or are there a lot of Christian PostgreSQL users? ;-) (I'm > not knocking it, I'm one too..) Good question. :-) -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On Wed, 26 Apr 2000, Tom Cook wrote: > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > will just hook up a several-terra-byte disk array to his machine, try to > store the location of every molecule in the universe and break it. That's not going to work anyway. To store information about a molecule you need at least one such molecule to hold that state, barring major revolutions in storage technology. :-) > Admittedly, ~2x10^20 is a very large number, but that's what they thought > about 2000, also... A while ago I said that in order to exhaust the oid space you need to add 1 million new records a day for more than 10 years. Then someone said, ok, what if I have an email service with 1 million users that each get 10 emails a day. Then you're talking about 1 year. But in order to exhaust 64 bits, you can have 10^9 users (i.e., everyone), getting two million emails a day for 1000 years. That seems pretty safe for as long as I care. Of course to store all molecules you really need more like 384 bits. > What I'm saying is, is there a better way of doing this? Transfinite numbers ;) -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you have waytoo much time to think about it. >>> e99re41@DoCS.UU.SE 04/26/00 08:53AM >>> On Wed, 26 Apr 2000, Tom Cook wrote: > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > will just hook up a several-terra-byte disk array to his machine, try to > store the location of every molecule in the universe and break it. That's not going to work anyway. To store information about a molecule you need at least one such molecule to hold that state, barring major revolutions in storage technology. :-) > Admittedly, ~2x10^20 is a very large number, but that's what they thought > about 2000, also... A while ago I said that in order to exhaust the oid space you need to add 1 million new records a day for more than 10 years. Then someone said, ok, what if I have an email service with 1 million users that each get 10 emails a day. Then you're talking about 1 year. But in order to exhaust 64 bits, you can have 10^9 users (i.e., everyone), getting two million emails a day for 1000 years. That seems pretty safe for as long as I care. Of course to store all molecules you really need more like 384 bits. > What I'm saying is, is there a better way of doing this? Transfinite numbers ;) -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Bruce Momjian wrote: > > > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > > > will just hook up a several-terra-byte disk array to his machine, try to > > > store the location of every molecule in the universe and break it. > > > > If you have to have OIDs at all, its a lot better than a 32 bit number. I > > think it would be easier to switch to 64 bit OIDs than ditch them > > completely. > > The "serial" type should definitely be 64 bit. To make matters worse I > > believe its really only a 31 bit number as the plus/minus symbol is > > discarded. But I think moving to 64 bit will take place soon enough, when > > it needs to, and it should shut everyone up. > > If you look at that TODO list, oid's flowing over 32-bits is not > something we are losing sleep over. In fact, the first fix would be to > make sure oid's are truly treated as unsigned int's, thereby doubling > their range. I have done some of those myself, but I am sure there are > more areas that need fixing. > > Illustra's solution was to use two int32's, making the upper 32-bit > value represent the site, so oid's remain unique as they move between > sites. If we picked a random 32-bit oid on initdb startup, that would > pretty much make them unique all the time. > > -- > Bruce Momjian | http://www.op.net/~candle > pgman@candle.pha.pa.us | (610) 853-3000 > + If your life is a hard drive, | 830 Blythe Avenue > + Christ can be your backup. | Drexel Hill, Pennsylvania 19026 While I am working on a system which could use 64-bit OIDs, and I think it is a good idea to move to them, I wonder if the developers should consider the people who are running older, "legacy" systems as well. Moving to a 64-bit OID would add considerably to the space required (ie the overhead) to run the database. Many Linux systems are "Linux" because Windows got too big and clunky to run there. If possible, I'd suggest leaving the OID size as a compile time switch so those who wish to run "light" can do so, and those who wish to tally the molecules of the universe can think about how to compress the data to fit within a 64-bit OID. Mark -- Mark Dalphin email: mdalphin@amgen.com Mail Stop: 29-2-A phone: +1-805-447-4951 (work) One Amgen Center Drive +1-805-375-0680 (home) Thousand Oaks, CA 91320 fax: +1-805-499-9955 (work)
> > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > > will just hook up a several-terra-byte disk array to his machine, try to > > store the location of every molecule in the universe and break it. > > That's not going to work anyway. To store information about a molecule you > need at least one such molecule to hold that state, barring major > revolutions in storage technology. :-) Maybe one or two quarks are enough to represent a single bit. Then you can break this barrier and store the data, because most molecules consists of more quarks. But that's an incomplete approach again, because if we could store the position of each quark and all other occurences of energy (along with it's actual direction and speed), we could add rules and/or triggers and end up with a complete UNIVERSE simulator in Postgres. Can someone ask IBM (Interstellar Business Machines Corp.) what database they used in our UNIVERSE? Must be running in our parent universe, so the real question is: "can we determine the universe nesting level we actually live in?" Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #========================================= wieck@debis.com (Jan Wieck) #
Alright, alright already! Maybe 64-bit OIDs are sufficient for reasonable people. My point was that, as soon as you set a limit on something, someone will find a use for it which pushes that limit. "640K should be enought for anyone." - William Gates On Wed, 26 Apr 2000, Dale Anderson wrote: > Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you haveway too much time to think about it. > > >>> e99re41@DoCS.UU.SE 04/26/00 08:53AM >>> > On Wed, 26 Apr 2000, Tom Cook wrote: > > > Is this necessarily a good solution? If you use 64-bit OIDs, some joker > > will just hook up a several-terra-byte disk array to his machine, try to > > store the location of every molecule in the universe and break it. > > That's not going to work anyway. To store information about a molecule you > need at least one such molecule to hold that state, barring major > revolutions in storage technology. :-) > > > Admittedly, ~2x10^20 is a very large number, but that's what they thought > > about 2000, also... > > A while ago I said that in order to exhaust the oid space you need to add > 1 million new records a day for more than 10 years. Then someone said, ok, > what if I have an email service with 1 million users that each get 10 > emails a day. Then you're talking about 1 year. But in order to exhaust 64 > bits, you can have 10^9 users (i.e., everyone), getting two million emails > a day for 1000 years. That seems pretty safe for as long as I care. > > Of course to store all molecules you really need more like 384 bits. > > > What I'm saying is, is there a better way of doing this? > > Transfinite numbers ;) > > > -- > Peter Eisentraut Sernanders v�g 10:115 > peter_e@gmx.net 75262 Uppsala > http://yi.org/peter-e/ Sweden > > > -- Tom Cook - Software Engineer "Never criticize a man until you've walked a mile in his shoes; that way, when you criticize him, you're a mile away and have his shoes." - Unknown LISAcorp - www.lisa.com.au -------------------------------------------------- 38 Greenhill Rd. Level 3, 228 Pitt Street Wayville, SA, 5034 Sydney, NSW, 2000 Phone: +61 8 8272 1555 Phone: +61 2 9283 0877 Fax: +61 8 8271 1199 Fax: +61 2 9283 0866 --------------------------------------------------