Thread: unique row identifier data type exhausted . . .

unique row identifier data type exhausted . . .

From
Frank Joerdens
Date:
It feels like there should be some *really* obvious answer to this
question, and I'll find myself whacking my forehead in self-abasement
and out of sheer relief to have found the answer to a problem that
should not have bothered me in the first place since the answer is too
self-evident . . . however, it is bothering me: what happens if the data
type that you've chosen to uniquely identify a row is exhausted? If, for
instance you use int4 and you've had your couple billion deletes and
inserts on the table and the next nextval('seq') . . . well, what
exactly happens and how do they do it? Admittedly, 10^9 is a big number
but it is far from out of the question that you'd reach it on a really
busy database (can't think of a real-world example but that ought to be
a moot point), not to mention oids since they are unique across an
entire database.

Cheers
Frank

--
frank joerdens

joerdens new media
heinrich-roller str. 16/17
10405 berlin
germany

e: frank@joerdens.de
t: +49 (0)30 44055471
f: +49 (0)30 44055475
h: http://www.joerdens.de

pgp public key: http://www.joerdens.de/pgp/frank_joerdens.asc

RE: unique row identifier data type exhausted . . .

From
"Andrew Snow"
Date:
> It feels like there should be some *really* obvious answer to this
> question, and I'll find myself whacking my forehead in self-abasement
> and out of sheer relief to have found the answer to a problem that
> should not have bothered me in the first place since the answer is too
> self-evident . . . however, it is bothering me: what happens if the data
> type that you've chosen to uniquely identify a row is exhausted? If, for
> instance you use int4 and you've had your couple billion deletes and
> inserts on the table and the next nextval('seq') . . . well, what
> exactly happens and how do they do it? Admittedly, 10^9 is a big number
> but it is far from out of the question that you'd reach it on a really
> busy database (can't think of a real-world example but that ought to be
> a moot point), not to mention oids since they are unique across an
> entire database.

I am curious to know how difficult it would be (if at all) to change the
type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s
any more!


- Andrew




Re: unique row identifier data type exhausted . . .

From
Jurgen Defurne
Date:
Frank Joerdens wrote:

> It feels like there should be some *really* obvious answer to this
> question, and I'll find myself whacking my forehead in self-abasement
> and out of sheer relief to have found the answer to a problem that
> should not have bothered me in the first place since the answer is too
> self-evident . . . however, it is bothering me: what happens if the data
> type that you've chosen to uniquely identify a row is exhausted? If, for
> instance you use int4 and you've had your couple billion deletes and
> inserts on the table and the next nextval('seq') . . . well, what
> exactly happens and how do they do it? Admittedly, 10^9 is a big number
> but it is far from out of the question that you'd reach it on a really
> busy database (can't think of a real-world example but that ought to be
> a moot point), not to mention oids since they are unique across an
> entire database.

This is the command to create a sequence :

CREATE SEQUENCE seqname
    [ INCREMENT increment ]
    [ MINVALUE  minvalue ]
    [ MAXVALUE  maxvalue ]
    [ START     start ]
    [ CACHE     cache ]
    [ CYCLE ]

The CYCLE option restarts the sequence automatically upon overflow. Of
course, your analysis of the database should point out that it is possible
to do this.

Jurgen Defurne
defurnj@glo.be


Re: unique row identifier data type exhausted . . .

From
Bruce Momjian
Date:
>
> > It feels like there should be some *really* obvious answer to this
> > question, and I'll find myself whacking my forehead in self-abasement
> > and out of sheer relief to have found the answer to a problem that
> > should not have bothered me in the first place since the answer is too
> > self-evident . . . however, it is bothering me: what happens if the data
> > type that you've chosen to uniquely identify a row is exhausted? If, for
> > instance you use int4 and you've had your couple billion deletes and
> > inserts on the table and the next nextval('seq') . . . well, what
> > exactly happens and how do they do it? Admittedly, 10^9 is a big number
> > but it is far from out of the question that you'd reach it on a really
> > busy database (can't think of a real-world example but that ought to be
> > a moot point), not to mention oids since they are unique across an
> > entire database.
>
> I am curious to know how difficult it would be (if at all) to change the
> type that oid represents, to a 64 bit number. C'mon guys, this isn't the 90s
> any more!

When we are sure all platforms support 64-bit int's, we will move in
that direction.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

RE: unique row identifier data type exhausted . . .

From
"Andrew Snow"
Date:
> When we are sure all platforms support 64-bit int's, we will move in
> that direction.

Sorry if this is a stupid question, but couldn't you fairly easily make it
an option at compile time? To use either 32 or 64 bit OID's.
(And, less importantly, for sequences)

- Andrew




Re: unique row identifier data type exhausted . . .

From
Bruce Momjian
Date:
>
> > When we are sure all platforms support 64-bit int's, we will move in
> > that direction.
>
> Sorry if this is a stupid question, but couldn't you fairly easily make it
> an option at compile time? To use either 32 or 64 bit OID's.
> (And, less importantly, for sequences)

Well, we could but then all the API would change, and internally there
would be changes.  No one has really asked for it vs. the work in doing
it.


--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

RE: unique row identifier data type exhausted . . .

From
Tom Cook
Date:
On Mon, 24 Apr 2000, Andrew Snow wrote:

> > When we are sure all platforms support 64-bit int's, we will move in
> > that direction.
>
> Sorry if this is a stupid question, but couldn't you fairly easily make it
> an option at compile time? To use either 32 or 64 bit OID's.
> (And, less importantly, for sequences)

Is this necessarily a good solution? If you use 64-bit OIDs, some joker
will just hook up a several-terra-byte disk array to his machine, try to
store the location of every molecule in the universe and break it.

Admittedly, ~2x10^20 is a very large number, but that's what they thought
about 2000, also...

What I'm saying is, is there a better way of doing this?

Cheers
--
Tom Cook - Software Engineer

"Christ died for sin, the righteous for the unrighteous, to bring you to God."
    - 1 Peter 3:18

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd.          Level 3, 228 Pitt Street
Wayville, SA, 5034        Sydney, NSW, 2000

Phone:   +61 8 8272 1555  Phone:   +61 2 9283 0877
Fax:     +61 8 8271 1199  Fax:     +61 2 9283 0866
--------------------------------------------------


RE: unique row identifier data type exhausted . . .

From
Franck Martin
Date:
For me it seems more a design problem than the length of internal number....

Why not create a table with 2 field containing int, setting the primary key
on both of them and running a sequencing scheme on both as if it was a
single number....

There are no more limitations anymore, as if you know you will need a lot of
record you may decide to use 1, 2, or 3 numbers...

Question, does postgress is able to run sequence on 2 combined numbers...

Cheers...

Franck Martin
Network and Database Development Officer
SOPAC South Pacific Applied Geoscience Commission
Fiji
E-mail: franck@sopac.org.fj <mailto:franck@sopac.org.fj>
Web site: www.sopac.org.fj <http://www.sopac.org.fj>

        -----Original Message-----
        From:    Tom Cook [mailto:tcook@lisa.com.au]
        Sent:    Wednesday, April 26, 2000 11:49 AM
        To:    Pgsql-General@Postgresql. Org
        Subject:    RE: [GENERAL] unique row identifier data
type exhausted . . .

        On Mon, 24 Apr 2000, Andrew Snow wrote:

        > > When we are sure all platforms support 64-bit int's, we
will move in
        > > that direction.
        >
        > Sorry if this is a stupid question, but couldn't you
fairly easily make it
        > an option at compile time? To use either 32 or 64 bit
OID's.
        > (And, less importantly, for sequences)

        Is this necessarily a good solution? If you use 64-bit OIDs,
some joker
        will just hook up a several-terra-byte disk array to his
machine, try to
        store the location of every molecule in the universe and
break it.

        Admittedly, ~2x10^20 is a very large number, but that's what
they thought
        about 2000, also...

        What I'm saying is, is there a better way of doing this?



RE: unique row identifier data type exhausted . . .

From
"Andrew Snow"
Date:

> Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> will just hook up a several-terra-byte disk array to his machine, try to
> store the location of every molecule in the universe and break it.

If you have to have OIDs at all, its a lot better than a 32 bit number.  I
think it would be easier to switch to 64 bit OIDs than ditch them
completely.
The "serial" type should definitely be 64 bit.  To make matters worse I
believe its really only a 31 bit number as the plus/minus symbol is
discarded.  But I think moving to 64 bit will take place soon enough, when
it needs to, and it should shut everyone up.

> "Christ died for sin, the righteous for the unrighteous, to bring
> you to God."
>     - 1 Peter 3:18

Is it just me, or are there a lot of Christian PostgreSQL users? ;-)  (I'm
not knocking it, I'm one too..)


- Andrew


Re: unique row identifier data type exhausted . . .

From
Bruce Momjian
Date:
>
>
> > Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> > will just hook up a several-terra-byte disk array to his machine, try to
> > store the location of every molecule in the universe and break it.
>
> If you have to have OIDs at all, its a lot better than a 32 bit number.  I
> think it would be easier to switch to 64 bit OIDs than ditch them
> completely.
> The "serial" type should definitely be 64 bit.  To make matters worse I
> believe its really only a 31 bit number as the plus/minus symbol is
> discarded.  But I think moving to 64 bit will take place soon enough, when
> it needs to, and it should shut everyone up.

If you look at that TODO list, oid's flowing over 32-bits is not
something we are losing sleep over.  In fact, the first fix would be to
make sure oid's are truly treated as unsigned int's, thereby doubling
their range.  I have done some of those myself, but I am sure there are
more areas that need fixing.

Illustra's solution was to use two int32's, making the upper 32-bit
value represent the site, so oid's remain unique as they move between
sites.  If we picked a random 32-bit oid on initdb startup, that would
pretty much make them unique all the time.

>
> > "Christ died for sin, the righteous for the unrighteous, to bring
> > you to God."
> >     - 1 Peter 3:18
>
> Is it just me, or are there a lot of Christian PostgreSQL users? ;-)  (I'm
> not knocking it, I'm one too..)

Good question.  :-)

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

RE: unique row identifier data type exhausted . . .

From
Peter Eisentraut
Date:
On Wed, 26 Apr 2000, Tom Cook wrote:

> Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> will just hook up a several-terra-byte disk array to his machine, try to
> store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

> Admittedly, ~2x10^20 is a very large number, but that's what they thought
> about 2000, also...

A while ago I said that in order to exhaust the oid space you need to add
1 million new records a day for more than 10 years. Then someone said, ok,
what if I have an email service with 1 million users that each get 10
emails a day. Then you're talking about 1 year. But in order to exhaust 64
bits, you can have 10^9 users (i.e., everyone), getting two million emails
a day for 1000 years. That seems pretty safe for as long as I care.

Of course to store all molecules you really need more like 384 bits.

> What I'm saying is, is there a better way of doing this?

Transfinite numbers ;)


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden


RE: unique row identifier data type exhausted . . .

From
"Dale Anderson"
Date:
Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you have
waytoo much time to think about it. 

>>> e99re41@DoCS.UU.SE 04/26/00 08:53AM >>>
On Wed, 26 Apr 2000, Tom Cook wrote:

> Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> will just hook up a several-terra-byte disk array to his machine, try to
> store the location of every molecule in the universe and break it.

That's not going to work anyway. To store information about a molecule you
need at least one such molecule to hold that state, barring major
revolutions in storage technology. :-)

> Admittedly, ~2x10^20 is a very large number, but that's what they thought
> about 2000, also...

A while ago I said that in order to exhaust the oid space you need to add
1 million new records a day for more than 10 years. Then someone said, ok,
what if I have an email service with 1 million users that each get 10
emails a day. Then you're talking about 1 year. But in order to exhaust 64
bits, you can have 10^9 users (i.e., everyone), getting two million emails
a day for 1000 years. That seems pretty safe for as long as I care.

Of course to store all molecules you really need more like 384 bits.

> What I'm saying is, is there a better way of doing this?

Transfinite numbers ;)


--
Peter Eisentraut                  Sernanders väg 10:115
peter_e@gmx.net                   75262 Uppsala
http://yi.org/peter-e/            Sweden



Re: unique row identifier data type exhausted . . .

From
Mark Dalphin
Date:
Bruce Momjian wrote:

> > > Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> > > will just hook up a several-terra-byte disk array to his machine, try to
> > > store the location of every molecule in the universe and break it.
> >
> > If you have to have OIDs at all, its a lot better than a 32 bit number.  I
> > think it would be easier to switch to 64 bit OIDs than ditch them
> > completely.
> > The "serial" type should definitely be 64 bit.  To make matters worse I
> > believe its really only a 31 bit number as the plus/minus symbol is
> > discarded.  But I think moving to 64 bit will take place soon enough, when
> > it needs to, and it should shut everyone up.
>
> If you look at that TODO list, oid's flowing over 32-bits is not
> something we are losing sleep over.  In fact, the first fix would be to
> make sure oid's are truly treated as unsigned int's, thereby doubling
> their range.  I have done some of those myself, but I am sure there are
> more areas that need fixing.
>
> Illustra's solution was to use two int32's, making the upper 32-bit
> value represent the site, so oid's remain unique as they move between
> sites.  If we picked a random 32-bit oid on initdb startup, that would
> pretty much make them unique all the time.
>
> --
>   Bruce Momjian                        |  http://www.op.net/~candle
>   pgman@candle.pha.pa.us               |  (610) 853-3000
>   +  If your life is a hard drive,     |  830 Blythe Avenue
>   +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

While I am working on a system which could use 64-bit OIDs, and I think it is a
good idea to move to them, I wonder if the developers should consider the people
who are running older, "legacy" systems as well.  Moving to a 64-bit OID would
add considerably to the space required (ie the overhead) to run the database.
Many Linux systems are "Linux" because Windows got too big and clunky to run
there.  If possible, I'd suggest leaving the OID size as a compile time switch
so those who wish to run "light" can do so, and those who wish to tally the
molecules of the universe can think about how to compress the data to fit within
a 64-bit OID.

Mark

--
Mark Dalphin                          email: mdalphin@amgen.com
Mail Stop: 29-2-A                     phone: +1-805-447-4951 (work)
One Amgen Center Drive                       +1-805-375-0680 (home)
Thousand Oaks, CA 91320                 fax: +1-805-499-9955 (work)




Re: unique row identifier data type exhausted . . .

From
wieck@debis.com (Jan Wieck)
Date:
> > Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> > will just hook up a several-terra-byte disk array to his machine, try to
> > store the location of every molecule in the universe and break it.
>
> That's not going to work anyway. To store information about a molecule you
> need at least one such molecule to hold that state, barring major
> revolutions in storage technology. :-)

    Maybe one or two quarks are enough to represent a single bit.
    Then you can break this barrier and store the  data,  because
    most molecules consists of more quarks.

    But  that's an incomplete approach again, because if we could
    store the position of each quark and all other occurences  of
    energy (along with it's actual direction and speed), we could
    add rules and/or triggers and end up with a complete UNIVERSE
    simulator in Postgres.

    Can  someone  ask  IBM (Interstellar Business Machines Corp.)
    what database they used in our UNIVERSE? Must be  running  in
    our  parent  universe,  so  the  real  question  is:  "can we
    determine the universe nesting level we actually live in?"


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#========================================= wieck@debis.com (Jan Wieck) #



RE: unique row identifier data type exhausted . . .

From
Tom Cook
Date:
Alright, alright already! Maybe 64-bit OIDs are sufficient for reasonable
people. My point was that, as soon as you set a limit on something,
someone will find a use for it which pushes that limit.

"640K should be enought for anyone." - William Gates

On Wed, 26 Apr 2000, Dale Anderson wrote:

> Seems to me that 64 bit OIDs is enough for any rational sensible person, and if it's not enough for you, then you
haveway too much time to think about it. 
>
> >>> e99re41@DoCS.UU.SE 04/26/00 08:53AM >>>
> On Wed, 26 Apr 2000, Tom Cook wrote:
>
> > Is this necessarily a good solution? If you use 64-bit OIDs, some joker
> > will just hook up a several-terra-byte disk array to his machine, try to
> > store the location of every molecule in the universe and break it.
>
> That's not going to work anyway. To store information about a molecule you
> need at least one such molecule to hold that state, barring major
> revolutions in storage technology. :-)
>
> > Admittedly, ~2x10^20 is a very large number, but that's what they thought
> > about 2000, also...
>
> A while ago I said that in order to exhaust the oid space you need to add
> 1 million new records a day for more than 10 years. Then someone said, ok,
> what if I have an email service with 1 million users that each get 10
> emails a day. Then you're talking about 1 year. But in order to exhaust 64
> bits, you can have 10^9 users (i.e., everyone), getting two million emails
> a day for 1000 years. That seems pretty safe for as long as I care.
>
> Of course to store all molecules you really need more like 384 bits.
>
> > What I'm saying is, is there a better way of doing this?
>
> Transfinite numbers ;)
>
>
> --
> Peter Eisentraut                  Sernanders v�g 10:115
> peter_e@gmx.net                   75262 Uppsala
> http://yi.org/peter-e/            Sweden
>
>
>

--
Tom Cook - Software Engineer

"Never criticize a man until you've walked a mile in his shoes; that way,
when you criticize him, you're a mile away and have his shoes."
    - Unknown

LISAcorp - www.lisa.com.au

--------------------------------------------------
38 Greenhill Rd.          Level 3, 228 Pitt Street
Wayville, SA, 5034        Sydney, NSW, 2000

Phone:   +61 8 8272 1555  Phone:   +61 2 9283 0877
Fax:     +61 8 8271 1199  Fax:     +61 2 9283 0866
--------------------------------------------------