Thread: Using oids

Using oids

From

Malcolm Warren

Date:

03 September 2003, 05:27:42

While updating to 7.3.4 I note with some alarm the following passage in
README.Debian.migration.gz written by Oliver Elphick:

"Some schema designs rely on the use of oids as row identifiers. This is
definitely not recommended, not least because oids are not guaranteed to
exist in all future versions of PostgreSQL. Oids are an internal feature
only. They are not suitable as candidate keys, since they are not
guaranteed to be unique; furthermore, the starting point for oids is likely
to change whenever a change to the database structure occurs."

While I have not used oids to join tables, I have used them extensively in
programming, because if Postgres has supplied a unique number for each row,
why on earth should I bother supplying another one of my own?

Like many people starting with Postgres, three or four years ago I
carefully read Bruce Momjian's excellent introductory book on Postgres,
which explained many initially difficult concepts with such clarity.

The book states "Every row in POSTGRESQL is assigned a unique, normally
invisible number called an object identification number (OID). When the
software is initialized with initdb, a counter is created and set to
approximately seventeen-thousand. The counter is used to uniquely number
every row. Although databases may be created and destroyed, the counter
continues to increase. It is used by all databases, so identification
numbers are always unique. No two rows in any table or in any database will
ever have the same object ID."

Further down we read:
"Object identification numbers can be used as primary and foreign key
values in joins. Since every row has a unique object ID, a separate column
is not needed to hold the row's unique number."

On the next page are listed the limitations of oids, for example they are
nonsequential, nonmodifiable, and not backed up by default, but for my uses
these were not problems at all. I have merely used the oid number as a
temporary unique identifier before assigning a permanent booking number to
it, which takes about a nanosecond, and in other similar cases.

To sum up: The Debian migration gzip file declares that oids are not
guaranteed to be unique, issues dire warnings about using them as keys and
worst of all states that they may be phased out in the future.

The book states that they are unique, tells you how to use them, actually
gives an example of using them as primary and foreign keys (which
fortunately I decided was not very wise) and certainly doesn't say anything
about phasing them out in the future.

Can anybody shed any light on this?
Malcolm Warren

Re: Using oids

From

"Shridhar Daithankar"

Date:

03 September 2003, 06:09:11

On 3 Sep 2003 at 10:27, Malcolm Warren wrote:

> To sum up: The Debian migration gzip file declares that oids are not
> guaranteed to be unique, issues dire warnings about using them as keys and
> worst of all states that they may be phased out in the future.
>
> The book states that they are unique, tells you how to use them, actually
> gives an example of using them as primary and foreign keys (which
> fortunately I decided was not very wise) and certainly doesn't say anything
> about phasing them out in the future.

Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
times. They default to be available for new objects but that is for backwards
compatibility I believe. In future, they would default to be not available for
a particular object(hopefully). Right now you need to explicitly specify no
oids while creating tables etc.

About oids not being unique, oids can assume 4 billion different values. If you
have more than those many rows in a table, oids will wrap around and will no
longer be unique in that object.

About oids being eliminated, I am sure it would happen some time in the future,
looking at the development on this issue. Core team could elaborate more on
this.

Correct me if I am wrong.

HTH

Bye
 Shridhar

--
Nusbaum's Rule:    The more pretentious the corporate name, the smaller the
organization.  (For instance, the Murphy Center for the    Codification of Human
and Organizational Law, contrasted    to IBM, GM, and AT&T.)

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 06:28:28

On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:

> Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
> times. They default to be available for new objects but that is for backwards
> compatibility I believe. In future, they would default to be not available for
> a particular object(hopefully). Right now you need to explicitly specify no
> oids while creating tables etc.
I do understand the limitation of the oid as implimented now, but why
remove the possibility to make unique row references (like after an
insert), instead of extenting the oid ?

I don't care if they wrap, or is a unique string or anything else, as
long as I can use it to refetch a row after an insert, without keeping
track of app. implimentation specific SERIAL fields.

> About oids not being unique, oids can assume 4 billion different values. If you
> have more than those many rows in a table, oids will wrap around and will no
> longer be unique in that object.
I see that this is a problem, and 4 billion is not alot, but why not
make another format like in oracle, so that it still is possible to
refere to a row using a unique --- thing. There must be an internal oid
somewhere, what we may be able to use.

> About oids being eliminated, I am sure it would happen some time in the future,
> looking at the development on this issue. Core team could elaborate more on
> this.
Sounds like a sad thing, if this is not replaced by something simular,
but more fit to a large DB.

/BL

Re: Using oids

From

"Shridhar Daithankar"

Date:

03 September 2003, 06:36:49

On 3 Sep 2003 at 11:28, Bo Lorentsen wrote:

> On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:
>
> > Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
> > times. They default to be available for new objects but that is for backwards
> > compatibility I believe. In future, they would default to be not available for
> > a particular object(hopefully). Right now you need to explicitly specify no
> > oids while creating tables etc.
> I do understand the limitation of the oid as implimented now, but why
> remove the possibility to make unique row references (like after an
> insert), instead of extenting the oid ?
>
> I don't care if they wrap, or is a unique string or anything else, as
> long as I can use it to refetch a row after an insert, without keeping
> track of app. implimentation specific SERIAL fields.

Well, what I do is, declare a serate sequence, retrive next available value and
explicitly insert it into a integer field. That avoids having to retrieve the
latest value again.

I don't know if this is a widespread practice but I find it useful for more
than one way in the environment in which I program.
>
> > About oids not being unique, oids can assume 4 billion different values. If you
> > have more than those many rows in a table, oids will wrap around and will no
> > longer be unique in that object.
> I see that this is a problem, and 4 billion is not alot, but why not
> make another format like in oracle, so that it still is possible to
> refere to a row using a unique --- thing. There must be an internal oid
> somewhere, what we may be able to use.

I understand. With growing use of 64 bit hardware, 4 billion will be history
pretty soon.

However historically oids were assumed to be 32 bit. There could be places
which unintentionally assumed it as such. Cleaning all those places is pretty
difficult given the big code base postgresql has.

If you compile postgresql with Oid as 64 bit integer, that will work in most
cases probably. However it does not guarantee that it will always work. There
always could be some places which assumed 32 bit data types.

That is one of the problem as I understand.

(Gathered and extended from one of Tom's post. correct me if I am wrong)

> > About oids being eliminated, I am sure it would happen some time in the future,
> > looking at the development on this issue. Core team could elaborate more on
> > this.
> Sounds like a sad thing, if this is not replaced by something simular,
> but more fit to a large DB.

You can request this to be a TODO to hackers. They will decide if this is worth
having it. Personally I support it the way it is. Optionally available.

For large table containing billions of rows, Oids add to tuple size and overall
IO. If you are not using Oids, they become overhead. Ability to turn them off
is certainly nice..

Bye
 Shridhar

--
Unfair animal names:-- tsetse fly            -- bullhead-- booby            -- duck-billed
platypus-- sapsucker            -- Clarence        -- Gary Larson

Re: Using oids

From

Malcolm Warren

Date:

03 September 2003, 06:39:56

On Wed, 03 Sep 2003 11:28:01 +0200, Bo Lorentsen <bl@netgroup.dk> wrote:

> On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:
>
>> Yes. It is correct. As of 7.3.x and onwards oids are optional at table
>> creation times. They default to be available for new objects but that is
>> for backwards compatibility I believe. In future, they would default to
>> be not available for a particular object(hopefully). Right now you need
>> to explicitly specify no oids while creating tables etc.

The point about oids is that they are so useful that many people have used
them extensively, me included.
And I did so on the basis of a book written by Bruce Mowjian, one of
Postgresql's major contributors.

If we are saying that there will always be backwards compatibility then I
can sleep at night.
If we are saying that at some point oids will exist no more then all of a
sudden I am going to have to re-write an awful lot of code.

Malcolm Warren

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 07:21:23

On Wed, 2003-09-03 at 11:38, Shridhar Daithankar wrote:

> Well, what I do is, declare a serate sequence, retrive next available value and
> explicitly insert it into a integer field. That avoids having to retrieve the
> latest value again.
Yeps, this is what I call an application specific implimentation, as one
can't do this at a more genral layer (that does not know about your
table layout).

Like having a general function that insert a row and return the newly
inserted row, containing the defaults set by PG. My code contain this
function (http://www.lue.dk/prj/dbc), and I have no way to make this
work if I'm not able to fetch the oid after an insert, in some way.

> I don't know if this is a widespread practice but I find it useful for more
> than one way in the environment in which I program.
You are not the only one recommenting this solution :-)

> I understand. With growing use of 64 bit hardware, 4 billion will be history
> pretty soon.
Agreed !

> However historically oids were assumed to be 32 bit. There could be places
> which unintentionally assumed it as such. Cleaning all those places is pretty
> difficult given the big code base postgresql has.
One day I will try to look after myself, but what does PG do internaly,
when referring to rows in a unique way (lets say in an index) ?

> If you compile postgresql with Oid as 64 bit integer, that will work in most
> cases probably. However it does not guarantee that it will always work. There
> always could be some places which assumed 32 bit data types.
But if convertet to a string type, all involved places would fail, and
no uncertency are involved when fixing it (well, a little too primitive
argument, I know) :-)

> You can request this to be a TODO to hackers. They will decide if this is worth
> having it. Personally I support it the way it is. Optionally available.
I'm not sure, if my skills reach this fare, but thanks for the advice.
And for the rest --- we disagree :-)

> For large table containing billions of rows, Oids add to tuple size and overall
> IO. If you are not using Oids, they become overhead. Ability to turn them off
> is certainly nice..
Yeps, if they really are not nessesary.

/B

Re: Using oids

From

Malcolm Warren

Date:

03 September 2003, 08:06:25

On Wed, 03 Sep 2003 11:40:01 +0100, Oliver Elphick <olly@lfix.co.uk> wrote:

> On Wed, 2003-09-03 at 10:39, Malcolm Warren wrote:
>> The point about oids is that they are so useful that many people have
>> used them extensively, me included.
>> And I did so on the basis of a book written by Bruce Mowjian, one of
>> Postgresql's major contributors.
>>
>> If we are saying that there will always be backwards compatibility then
>> I can sleep at night.
>> If we are saying that at some point oids will exist no more then all of
>> a sudden I am going to have to re-write an awful lot of code.
>
> I wrote that passage for the Debian package on the basis of various
> emails I have read on the lists in the past.  Whether the oid feature
> will ever be completely removed, I cannot say.
>
> It is certainly the case that if oids wrap round, uniqueness is no
> longer guaranteed.  You may be certain that your oid counter won't wrap,
> but I cannot express such certainty for all possible users of the Debian
> package.
>
> It is my contention that it is wrong to build your database design on an
> internal feature of the database; it automatically makes it unportable
> to any other database.  To my mind, it also demonstrates that the
> database design is faulty.  If the tables in question have candidate
> keys, why not use them?  If they don't, how can it be meaningful to use
> the oid as a foreign key?
>

Thanks for your reply Oliver, it's great to hear direct from the source.

I agree with you about database design and in fact fortunately I don't use
oids as foreign keys, which I thought unwise.
However I have found oids very useful as temporary unique references to a
record in my programming.
If I had known when I started writing my code three years ago that there
was even the
slightest doubt about continuing with oids then I wouldn't have used them.

As it is, I've used them extensively, because like many people I am
fortunate enough to be only dealing in thousands, not even millions of
records. And also because I had read a lot of documentation and had seen
nothing
that even hinted at a problem using them, on the contrary, I found
references to their usefulness.

So, will there be backwards compatibility? Surely if they are going to be
got rid of (for the moment by default they are still used), there has to be
the possibility to switch them on again where necessary?

I think with this email I have had my final say. I see that there is still
debate going on about what to do with oids and I've been able to have put
my point. Thanks.

Malcolm Warren

Re: Using oids

From

Martijn van Oosterhout

Date:

03 September 2003, 08:20:54

On Wed, Sep 03, 2003 at 12:20:42PM +0200, Bo Lorentsen wrote:
> On Wed, 2003-09-03 at 11:38, Shridhar Daithankar wrote:
>
> > Well, what I do is, declare a serate sequence, retrive next available value and
> > explicitly insert it into a integer field. That avoids having to retrieve the
> > latest value again.
> Yeps, this is what I call an application specific implimentation, as one
> can't do this at a more genral layer (that does not know about your
> table layout).

But your insert function needs to know something about the table it's
inserting into. The sequences have quite predicatable names. Besides, you
can set the name yourself (DCL does this IIRC).

> Like having a general function that insert a row and return the newly
> inserted row, containing the defaults set by PG. My code contain this
> function (http://www.lue.dk/prj/dbc), and I have no way to make this
> work if I'm not able to fetch the oid after an insert, in some way.

The only thing you need to know is the name of the primary key field. This
many be a problem in a generic layer. If you like you can make a UNIQUE
INDEX on the oid column and retry inserts when they fail.

In your code, do create an index on the OID column? If not, that's be a
performance hit,

> > However historically oids were assumed to be 32 bit. There could be places
> > which unintentionally assumed it as such. Cleaning all those places is pretty
> > difficult given the big code base postgresql has.
> One day I will try to look after myself, but what does PG do internaly,
> when referring to rows in a unique way (lets say in an index) ?

Something called CTID usually (AFAIK). It's a (block,tuple) pair. It changes
whenever you do an update. Or a vacuum.

> > For large table containing billions of rows, Oids add to tuple size and overall
> > IO. If you are not using Oids, they become overhead. Ability to turn them off
> > is certainly nice..
> Yeps, if they really are not nessesary.

If I were doing it would extract the primary key of each table on startup
and then change that one line of code to:

os << "SELECT * FROM " << sTable << " WHERE "
   << prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

Say you have a LAST_ID function and you a table with more than one sequence,
which would it return?

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

msg-28643-66309.dat

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 08:47:22

On Wed, 2003-09-03 at 13:19, Martijn van Oosterhout wrote:

> But your insert function needs to know something about the table it's
> inserting into. The sequences have quite predicatable names. Besides, you
> can set the name yourself (DCL does this IIRC).
No it don't know anything about the table it insert into. I simply do
the following :

1. INSERT data (comming from another layer)
2. Get the last oid
3. SELECT * FROM the same table where oid = what I just found.

I know absolutly nothing about the table, and I like it this way :-)

> The only thing you need to know is the name of the primary key field. This
> many be a problem in a generic layer. If you like you can make a UNIQUE
> INDEX on the oid column and retry inserts when they fail.
Hmm, it all end up putting alot of information to a lower layer, and
this is sad as PG already knows, but it may not tell me.

> In your code, do create an index on the OID column? If not, that's be a
> performance hit,
I'm not sure what you mean !

> Something called CTID usually (AFAIK). It's a (block,tuple) pair. It changes
> whenever you do an update. Or a vacuum.
So no id for a row, but only for a row instance ? Is this the reason for
the growing index files ?

> If I were doing it would extract the primary key of each table on startup
> and then change that one line of code to:
>
> os << "SELECT * FROM " << sTable << " WHERE "
>    << prikey << " = currval('" << sTable << "_" << prikey << "_seq')";
Thanks, but I have to be aware of the "prikey" name, and demand a prikey
for all tables to insert row into :-(

> Hope this helps,

I know what you mean, but I don't like the impact of the solution.

/BL

Re: Using oids

From

Oliver Elphick

Date:

03 September 2003, 10:04:56

On Wed, 2003-09-03 at 12:19, Martijn van Oosterhout wrote:
> If I were doing it would extract the primary key of each table on startup
> and then change that one line of code to:
>
> os << "SELECT * FROM " << sTable << " WHERE "
>    << prikey << " = currval('" << sTable << "_" << prikey << "_seq')";

You cannot use currval() until you have used nextval() on the same
sequence in the same session.

--
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight, UK                             http://www.lfix.co.uk/oliver
GPG: 1024D/3E1D0C1C: CA12 09E0 E8D5 8870 5839  932A 614D 4C34 3E1D 0C1C
                 ========================================
     "And he said unto his disciples, Therefore I say unto
      you, Take no thought for your life, what ye shall eat;
      neither for the body, what ye shall put on. For life
      is more than meat, and the body is more than clothing.
      Consider the ravens, for they neither sow nor reap;
      they have neither storehouse nor barn; and yet God
      feeds them;  how much better you are than the birds!
      Consider the lilies, how they grow; they toil
      not, they spin not; and yet I say unto you, that
      Solomon in all his glory was not arrayed like one of
      these. If then God so clothe the grass, which is to
      day in the field, and tomorrow is cast into the oven;
      how much more will he clothe you, O ye of little
      faith?  And seek not what ye shall eat, or what ye
      shall drink, neither be ye of doubtful mind.
      But rather seek ye the kingdom of God; and all these
      things shall be added unto you."
                              Luke 12:22-24; 27-29; 31.

Re: Using oids

From

Tom Lane

Date:

03 September 2003, 11:14:10

Bo Lorentsen <bl@netgroup.dk> writes:
> On Wed, 2003-09-03 at 11:10, Shridhar Daithankar wrote:
>> About oids not being unique, oids can assume 4 billion different
>> values. If you have more than those many rows in a table, oids will
>> wrap around and will no longer be unique in that object.

> I see that this is a problem, and 4 billion is not alot, but why not
> make another format like in oracle, so that it still is possible to
> refere to a row using a unique --- thing.

The reason OIDs shouldn't be considered unique is that there is no
mechanism to enforce that they are unique --- unless you make one,
that is, create a unique index on OID for a table.  The system does
not do that for you since it would be excessive overhead for tables
in which the user doesn't care about OID uniqueness.  But I'd
definitely recommend it if you are using OIDs for row identifiers.

If you want a globally unique ID based on OIDs, use the table OID
concatenated with the row OID.

I don't foresee OIDs going away in the future, because they are used
internally by the system; but I don't foresee them being "improved"
either.  Serial columns already do much of what people would like from
OIDs, and the developers' response to all questions along this line
is likely to be "use a serial column instead".  It's possible that the
default for table creation will switch to WITHOUT OIDS at some future
time, though I don't consider that a done deal because of the backwards
compatibility issue.

> There must be an internal oid
> somewhere, what we may be able to use.

No, there isn't.  There is only ctid, which is not useful as a long-term
row identifier, because UPDATE and VACUUM can change it.

            regards, tom lane

Re: Using oids

From

Alvaro Herrera Munoz

Date:

03 September 2003, 12:19:15

On Wed, Sep 03, 2003 at 01:05:30PM +0200, Malcolm Warren wrote:

> I agree with you about database design and in fact fortunately I don't use
> oids as foreign keys, which I thought unwise.  However I have found oids very
> useful as temporary unique references to a record in my programming.  If I
> had known when I started writing my code three years ago that there was even
> the slightest doubt about continuing with oids then I wouldn't have used
> them.

You can create tables WITH OIDS (this is by default on 7.3, but will
probably changed in some future release).  If you also create an unique
index on the oid column of the table, you have all you need.  But beware
that some INSERTs will fail because the OID counter will wrap around at some
point.  There is a non-zero probability that the newly generated OID will
collide with an existing tuple in that table; you have to be prepared to
repeat your query in that case, which can be a pain if you are doing
something else in the same transaction.

--
Alvaro Herrera (<alvherre[@]dcc.uchile.cl>)
"Lo esencial es invisible para los ojos" (A. de Saint Ex�pery)

Re: Using oids

From

Dennis Gearon

Date:

03 September 2003, 12:47:57

Why is that, anyway, and why should it be?

Oliver Elphick wrote:

>On Wed, 2003-09-03 at 12:19, Martijn van Oosterhout wrote:
>
>
>>If I were doing it would extract the primary key of each table on startup
>>and then change that one line of code to:
>>
>>os << "SELECT * FROM " << sTable << " WHERE "
>>   << prikey << " = currval('" << sTable << "_" << prikey << "_seq')";
>>
>>
>
>You cannot use currval() until you have used nextval() on the same
>sequence in the same session.
>
>
>

Re: Using oids

From

Dennis Gearon

Date:

03 September 2003, 12:50:50

The elimination is in concert with the dying of popularity in 'Object
Oriented Databases', right?

Shridhar Daithankar wrote:

>On 3 Sep 2003 at 10:27, Malcolm Warren wrote:
>
>
>
>>To sum up: The Debian migration gzip file declares that oids are not
>>guaranteed to be unique, issues dire warnings about using them as keys and
>>worst of all states that they may be phased out in the future.
>>
>>The book states that they are unique, tells you how to use them, actually
>>gives an example of using them as primary and foreign keys (which
>>fortunately I decided was not very wise) and certainly doesn't say anything
>>about phasing them out in the future.
>>
>>
>
>Yes. It is correct. As of 7.3.x and onwards oids are optional at table creation
>times. They default to be available for new objects but that is for backwards
>compatibility I believe. In future, they would default to be not available for
>a particular object(hopefully). Right now you need to explicitly specify no
>oids while creating tables etc.
>
>About oids not being unique, oids can assume 4 billion different values. If you
>have more than those many rows in a table, oids will wrap around and will no
>longer be unique in that object.
>
>About oids being eliminated, I am sure it would happen some time in the future,
>looking at the development on this issue. Core team could elaborate more on
>this.
>
>Correct me if I am wrong.
>
>HTH
>
>Bye
> Shridhar
>
>--
>Nusbaum's Rule:    The more pretentious the corporate name, the smaller the
>organization.  (For instance, the Murphy Center for the    Codification of Human
>and Organizational Law, contrasted    to IBM, GM, and AT&T.)
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 5: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faqs/FAQ.html
>
>
>

Re: Using oids

From

Doug McNaught

Date:

03 September 2003, 13:14:34

Dennis Gearon <gearond@fireserve.net> writes:

> Oliver Elphick wrote:
>
> >You cannot use currval() until you have used nextval() on the same
> >sequence in the same session.

> Why is that, anyway, and why should it be?

Because that's what currval() does.  It doesn't have anything to do
with sequence values in other sessions.  It gives you the last value
*you* got for thee sequence, so if you haven't called nextval() yet
you should and do get an error.

Maybe it should have been called lastval(), but that could be a bit
misleading too...

-Doug

Re: Using oids

From

Martijn van Oosterhout

Date:

03 September 2003, 13:16:59

On Wed, Sep 03, 2003 at 01:47:01PM +0200, Bo Lorentsen wrote:
> On Wed, 2003-09-03 at 13:19, Martijn van Oosterhout wrote:
> > The only thing you need to know is the name of the primary key field. This
> > many be a problem in a generic layer. If you like you can make a UNIQUE
> > INDEX on the oid column and retry inserts when they fail.
> Hmm, it all end up putting alot of information to a lower layer, and
> this is sad as PG already knows, but it may not tell me.

Well, in a sense it know and in a sense it doesn't. Sequences are not
considered special in terms of returning data to the client. It's just
another function from the parser's point of view.

> > In your code, do create an index on the OID column? If not, that's be a
> > performance hit,
> I'm not sure what you mean !

If you know the OID of a row, PostgreSQL doesn't have a special lookup table
to find it. That's also why they're not unique; the backend would have to
scan through every table to find out if the next one is available.

So, unless you specifically add an index to the table, looking up by OID
will always trigger a sequential scan.

That said, there is no reason why someone couldn't create a last_sequence()
function so you could say SELECT currval( last_sequence() ). Ofcourse, if
your table has no SERIAL field, you're stuffed either way.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

msg-28643-66349.dat

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 13:19:50

On Wed, 2003-09-03 at 16:13, Tom Lane wrote:

> The reason OIDs shouldn't be considered unique is that there is no
> mechanism to enforce that they are unique --- unless you make one,
> that is, create a unique index on OID for a table.  The system does
> not do that for you since it would be excessive overhead for tables
> in which the user doesn't care about OID uniqueness.  But I'd
> definitely recommend it if you are using OIDs for row identifiers.
Ok, so my little INSERT / SELECT show will continue to work for a long
time, as I only uses the oids on short term bacis.

> If you want a globally unique ID based on OIDs, use the table OID
> concatenated with the row OID.
Ok, this make sense !

> No, there isn't.  There is only ctid, which is not useful as a long-term
> row identifier, because UPDATE and VACUUM can change it.
But there is no way for the client user to user these in a
"PQgetLastCtid" and "SELECT * from zxy where ctid = 42", so this will
not help :-)

Thanks anyway, may oid's live for a long time, and one day become grown
up 64 bit values :-)

/BL

Re: Using oids

From

Bruno Wolff III

Date:

03 September 2003, 13:22:27

On Wed, Sep 03, 2003 at 08:46:42 -0700,
  Dennis Gearon <gearond@fireserve.net> wrote:
> Why is that, anyway, and why should it be?

Because it reduces contention by giving each backend its own pool
of sequence values. But until you call nextval a backend won't
have any values reserved.

Re: Using oids

From

Greg Stark

Date:

03 September 2003, 13:31:54

Bo Lorentsen <bl@netgroup.dk> writes:

> > If I were doing it would extract the primary key of each table on startup
> > and then change that one line of code to:
> >
> > os << "SELECT * FROM " << sTable << " WHERE "
> >    << prikey << " = currval('" << sTable << "_" << prikey << "_seq')";
>
> Thanks, but I have to be aware of the "prikey" name, and demand a prikey
> for all tables to insert row into :-(

This is an issue faced mostly by driver developers that want to provide high
level abstract interfaces.

The problem is that using OIDs is basically imposing a primary key on every
table even when the application designer didn't want one. They're mostly
redundant because most tables will have a primary key, wasteful for small
tables, and inadequate for large tables.

I don't like hard coding the assumption that the sequence name is based on the
primary key column name either though. Not every table will have a primary key
of "serial" type. Consider reference tables where the primary key is
non-arbitrary value. Even when it is, the sequence name can be truncated.

The new binary FE protocol included some discussion of API features to allow
drivers like JDBC get column information. I believe that API included an
indication of what the primary key column was. I'm not sure it includes a hook
to get the value of the last insertion, presumably via the sequence. If it
does I would think that would be far preferable to using OIDs.

The disadvantage: tables with no primary key whatsoever would couldn't be
supported by your high level abstraction. I only end up with tables with no
primary keys for many-to-many relationships (or one-to-many sets of immutable
data which amounts to the same thing) anyways. You want to insert, do mass
deletions, but never update such records anyways.

The pros: no extra overhead for OIDs, more portable to other databases.

--
greg

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 13:36:55

On Wed, 2003-09-03 at 17:28, Martijn van Oosterhout wrote:

> If you know the OID of a row, PostgreSQL doesn't have a special lookup table
> to find it. That's also why they're not unique; the backend would have to
> scan through every table to find out if the next one is available.
Ahh, thats not nice, hav'nt checked that, yet.

> So, unless you specifically add an index to the table, looking up by OID
> will always trigger a sequential scan.
I thought it was much more easy for PG to find these, but I quess ctid
are the one that is fast to find.

> That said, there is no reason why someone couldn't create a last_sequence()
> function so you could say SELECT currval( last_sequence() ). Ofcourse, if
> your table has no SERIAL field, you're stuffed either way.
Not as nice as oid's.

/BL

Re: Using oids

From

Darko Prenosil

Date:

03 September 2003, 13:55:35

On Wednesday 03 September 2003 17:24, Bo Lorentsen wrote:
> On Wed, 2003-09-03 at 16:13, Tom Lane wrote:
> > The reason OIDs shouldn't be considered unique is that there is no
> > mechanism to enforce that they are unique --- unless you make one,
> > that is, create a unique index on OID for a table.  The system does
> > not do that for you since it would be excessive overhead for tables
> > in which the user doesn't care about OID uniqueness.  But I'd
> > definitely recommend it if you are using OIDs for row identifiers.
>
> Ok, so my little INSERT / SELECT show will continue to work for a long
> time, as I only uses the oids on short term bacis.
>
> > If you want a globally unique ID based on OIDs, use the table OID
> > concatenated with the row OID.
>
> Ok, this make sense !
>
> > No, there isn't.  There is only ctid, which is not useful as a long-term
> > row identifier, because UPDATE and VACUUM can change it.
>
> But there is no way for the client user to user these in a
> "PQgetLastCtid" and "SELECT * from zxy where ctid = 42", so this will
> not help :-)
>
> Thanks anyway, may oid's live for a long time, and one day become grown
> up 64 bit values :-)
>
Any other way, a lot of (my) code will become useless :-(

Regards !

Re: Using oids

From

Jonathan Bartlett

Date:

03 September 2003, 16:56:00

> About oids not being unique, oids can assume 4 billion different values. If you
> have more than those many rows in a table, oids will wrap around and will no
> longer be unique in that object.

Not quite.  After 4 billion inserts (even spread across millions of
tables), you run out of OIDs and they will no longer be unique.  OIDs I
think were originally meant to be globally unique identifiers, but they
are no longer so, and are really no longer useful.  When I want globally
unique identifiers, I use an int8 column + sequence.

Jon

Re: Using oids

From

Jonathan Bartlett

Date:

03 September 2003, 16:59:07

> If we are saying that there will always be backwards compatibility then I
> can sleep at night.
> If we are saying that at some point oids will exist no more then all of a
> sudden I am going to have to re-write an awful lot of code.

Why?  Couldn't you do alter table X add column oid int; alter table X
alter oid set default nextval('oids');  create sequence oids ...

Jon

>
> Malcolm Warren
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>

Re: Using oids

From

Jonathan Bartlett

Date:

03 September 2003, 17:08:47

> No it don't know anything about the table it insert into. I simply do
> the following :
>
> 1. INSERT data (comming from another layer)
> 2. Get the last oid
> 3. SELECT * FROM the same table where oid = what I just found.
>
> I know absolutly nothing about the table, and I like it this way :-)

The way I do it is to have a global sequence called 'objects' that spits
out 64-bit values, and then EVERY TABLE has a 64-bit field called
object_id, which defaults to nextval('objects') if I don't specify it.
So, on every table no matter what, I could do:

1. select nextval('objects');
2. INSERT data (comming from another layer, but set object_id to the
value I got in #1)
3. SELECT * FROM the same table where oid = what I just selected in #1

Jon

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 17:11:05

On Wed, 2003-09-03 at 18:24, Greg Stark wrote:

I am maintaining a driver layer, so that exactly why I care about this
kind of problem :-)

> The pros: no extra overhead for OIDs, more portable to other databases.
So when will this kind of meta data be provided, to let me detect if
there is at "PRIMARY KEY" on a table. ?

/BL

Re: Using oids

From

Jonathan Bartlett

Date:

03 September 2003, 17:28:01

> > If you want a globally unique ID based on OIDs, use the table OID
> > concatenated with the row OID.
> Ok, this make sense !

Are you sure this works after you hit the 4 billion mark?

Re: Using oids

From

Tom Lane

Date:

03 September 2003, 18:41:54

Jonathan Bartlett <johnnyb@eskimo.com> writes:
>>> If you want a globally unique ID based on OIDs, use the table OID
>>> concatenated with the row OID.

>> Ok, this make sense !

> Are you sure this works after you hit the 4 billion mark?

If you have a unique index on OID on each table for which you care, yes.

As someone else pointed out, you do then have to cope with the
possibility of insertions failing because of OID conflicts.

            regards, tom lane

Re: Using oids

From

Bo Lorentsen

Date:

03 September 2003, 18:42:09

On Wed, 2003-09-03 at 22:12, Jonathan Bartlett wrote:

> Are you sure this works after you hit the 4 billion mark?
As long as the returened oid is unique on the table in current session !

/BL

Re: [HACKERS] Using oids

From

Ashley Cambrell

Date:

03 September 2003, 21:29:08

> That said, there is no reason why someone couldn't create a last_sequence()
> function so you could say SELECT currval( last_sequence() ). Ofcourse, if
> your table has no SERIAL field, you're stuffed either way.

Instead of SELECT currval( last_sequence() ), what about implementing
oracl type binding?

Ala
http://groups.google.com.au/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&th=5cc63a569577d024#link2

With the new FE/BE changes, how easy would it be to implement? (I
mentioned it the FE/BE discussions)

Ashley Cambrell

Re: Using oids

From

Martijn van Oosterhout

Date:

03 September 2003, 23:41:06

On Wed, Sep 03, 2003 at 10:08:47PM +0200, Bo Lorentsen wrote:
> On Wed, 2003-09-03 at 18:24, Greg Stark wrote:
>
> I am maintaining a driver layer, so that exactly why I care about this
> kind of problem :-)
>
> > The pros: no extra overhead for OIDs, more portable to other databases.
> So when will this kind of meta data be provided, to let me detect if
> there is at "PRIMARY KEY" on a table. ?

That metadata has been around for a while, it's all stored in the system
tables. It's not returned with each query though. pg_attribute and pg_class
tell you almost everything you need to know.

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> "All that is needed for the forces of evil to triumph is for enough good
> men to do nothing." - Edmond Burke
> "The penalty good people pay for not being interested in politics is to be
> governed by people worse than themselves." - Plato

Attachment

msg-32091-66446.dat

Re: Using oids

From

rolf.ostvik@axxessit.no

Date:

04 September 2003, 03:17:38

on 2003-09-03 13:47 Bo Lorentsen <bl@netgroup.dk> wrote:
>No it don't know anything about the table it insert into. I simply do
>the following :

>1. INSERT data (comming from another layer)
>2. Get the last oid
>3. SELECT * FROM the same table where oid = what I just found.

>I know absolutly nothing about the table, and I like it this way :-)

I am not sure where the best place would be to jump into the discussion
with my suggestion so i do it her.

The problem is that someone use the OID as an identifier which always is
there, and you always use the same method to extract it, while at the same
time the use of OID is not advisable.

One alternative could be to create one sequence and use this sequence as a
identifier in all tables.
This would be a little bit more work when creating the table, but it would
be easier to extract the last used value from the sequence than it is to
extract it from the OID (since i really haven't loooked into how to
extract the OID in a easy way).

This will of course duplicate the function of the OID which (as of
present) always is there.

--
Rolf