Thread: OID Perfomance - Object-Relational databases

OID Perfomance - Object-Relational databases

From

Josh Berkus

Date:

03 October 2000, 12:10:43

Folks,
Because it's a very elegant solution to my database structure issues,
I'm using OID's extensively as referents and foriegn keys.  However, I
wanted to see if others had previous experience in this (answer as many
as you like):

1. Is there a performance loss on searches and joins when I use the OID
as a liniking field as opposed to a SERIAL column?

2. Can I define my own index on the OIDs of a table?

3. What is the difference between these two DDL statements in terms of
data access and PG-SQL performance (assuming that table clients has
already been defined):

CREATE TABLE client_addresses AS (client_OID    OID    REFERENCES clients,address1    VARCHAR (30),address2    VARCHAR
(30),address3   VARCHAR (30))
 
and:
CREATE TABLE client_addresses AS (client        clients,address1    VARCHAR (30),address2    VARCHAR (30),address3
VARCHAR(30))
 

(This is Michael's questions rephrased)

4. Int4 seems kinda small to me for a value that needs to enumerate
every single database object.  Within a couple of years of heavy use, a
customer-transaction database could easily exceed 2 billion objects
created (and destroyed).  Are there plans to expand this to Int8?

-Josh Berkus

P.S. My aplolgies if I've already posted these questions; I never
received them back from the list mailer.


-- 
______AGLIO DATABASE SOLUTIONS___________________________                                       Josh Berkus  Complete
informationtechnology      josh@agliodbs.com   and data management solutions       (415) 436-9166  for law firms, small
businesses      fax  436-0137   and non-profit organizations.       pager 338-4078                               San
Francisco

RE: OID Perfomance - Object-Relational databases

From

Michael Ansley

Date:

03 October 2000, 12:16:13

<p><font size="2">Hi, Josh,</font><p><font size="2">In fact, the last point about OIDs is particularly pertinent,
becausewe are expected to process up to 500 million records daily, thus exhausting the limit in, um, eight
days.</font><p><fontsize="2">Is anybody aware of when this limit will be raised.</font><p><font
size="2">Cheers...</font><p><fontsize="2">>>   -----Original Message-----</font><br /><font size="2">>>  
From:Josh Berkus [<a href="mailto:josh@agliodbs.com">mailto:josh@agliodbs.com</a>]</font><br /><font
size="2">>>  Sent: 03 October 2000 17:06</font><br /><font size="2">>>   To: sqllist</font><br /><font
size="2">>>  Cc: Jeff MacDonald</font><br /><font size="2">>>   Subject: [SQL] OID Perfomance -
Object-Relationaldatabases</font><br /><font size="2">>>   </font><br /><font size="2">>>   </font><br
/><fontsize="2">>>   Folks,</font><br /><font size="2">>>   </font><br /><font size="2">>>     
Becauseit's a very elegant solution to my database </font><br /><font size="2">>>   structure issues,</font><br
/><fontsize="2">>>   I'm using OID's extensively as referents and foriegn keys. </font><br /><font
size="2">>>   However, I</font><br /><font size="2">>>   wanted to see if others had previous experience in
this</font><br /><font size="2">>>   (answer as many</font><br /><font size="2">>>   as you
like):</font><br/><font size="2">>>   </font><br /><font size="2">>>   1. Is there a performance loss on
searchesand joins when </font><br /><font size="2">>>   I use the OID</font><br /><font size="2">>>   as a
linikingfield as opposed to a SERIAL column?</font><br /><font size="2">>>   </font><br /><font
size="2">>>  2. Can I define my own index on the OIDs of a table?</font><br /><font size="2">>>  
</font><br/><font size="2">>>   3. What is the difference between these two DDL statements </font><br /><font
size="2">>>  in terms of</font><br /><font size="2">>>   data access and PG-SQL performance (assuming that
tableclients has</font><br /><font size="2">>>   already been defined):</font><br /><font size="2">>>  
</font><br/><font size="2">>>   CREATE TABLE client_addresses AS (</font><br /><font size="2">>>     
client_OID     OID     REFERENCES clients,</font><br /><font size="2">>>      address1        VARCHAR
(30),</font><br/><font size="2">>>      address2        VARCHAR (30),</font><br /><font size="2">>>     
address3       VARCHAR (30)</font><br /><font size="2">>>      )</font><br /><font size="2">>>  
and:</font><br/><font size="2">>>   CREATE TABLE client_addresses AS (</font><br /><font size="2">>>     
client         clients,</font><br /><font size="2">>>      address1        VARCHAR (30),</font><br /><font
size="2">>>     address2        VARCHAR (30),</font><br /><font size="2">>>      address3        VARCHAR
(30)</font><br/><font size="2">>>      )</font><br /><font size="2">>>   </font><br /><font
size="2">>>  (This is Michael's questions rephrased)</font><br /><font size="2">>>   </font><br /><font
size="2">>>  4. Int4 seems kinda small to me for a value that needs to enumerate</font><br /><font
size="2">>>  every single database object.  Within a couple of years of </font><br /><font size="2">>>  
heavyuse, a</font><br /><font size="2">>>   customer-transaction database could easily exceed 2 billion
objects</font><br/><font size="2">>>   created (and destroyed).  Are there plans to expand this to
Int8?</font><br/><font size="2">>>   </font><br /><font size="2">>>   -Josh Berkus</font><br /><font
size="2">>>  </font><br /><font size="2">>>   P.S. My aplolgies if I've already posted these questions; I
never</font><br/><font size="2">>>   received them back from the list mailer.</font><br /><font
size="2">>>     </font><br /><font size="2">>>   </font><br /><font size="2">>>   </font><br /><font
size="2">>>  -- </font><br /><font size="2">>>   ______AGLIO DATABASE
SOLUTIONS___________________________</font><br/><font size="2">>>                                           Josh
Berkus</font><br/><font size="2">>>      Complete information technology      josh@agliodbs.com</font><br /><font
size="2">>>      and data management solutions       (415) 436-9166</font><br /><font size="2">>>      for
lawfirms, small businesses       fax  436-0137</font><br /><font size="2">>>       and non-profit
organizations.      pager 338-4078</font><br /><font size="2">>>                                      San
Francisco</font><br/><font size="2">>>   </font>

Re: OID Perfomance - another question

From

"Mitch Vincent"

Date:

03 October 2000, 12:19:27

Aren't there a pretty big concerns when using OIDs as IDs to relate records
in different tables to each other? Wouldn't the OIDs be totally re-assigned
if you had to dump/restore your database?

Just a question to satisfy my own curiosity, thanks!

-Mitch

> Folks,
>
> Because it's a very elegant solution to my database structure issues,
> I'm using OID's extensively as referents and foriegn keys.  However, I
> wanted to see if others had previous experience in this (answer as many
> as you like):
>
> 1. Is there a performance loss on searches and joins when I use the OID
> as a liniking field as opposed to a SERIAL column?

Re: OID Perfomance - another question

From

Tod McQuillin

Date:

03 October 2000, 12:25:12

On Tue, 3 Oct 2000, Mitch Vincent wrote:

> Aren't there a pretty big concerns when using OIDs as IDs to relate records
> in different tables to each other? Wouldn't the OIDs be totally re-assigned
> if you had to dump/restore your database?

Not if you use the -o flag to pg_dump.
      -o     Dump object identifiers (OIDs) for every table.
-- 
Tod McQuillin

Re: OID Perfomance - Object-Relational databases

From

"Mitch Vincent"

Date:

03 October 2000, 12:31:25

Aren't OIDs just integers? Isn't this limit just the limit of the value an
int4 can hold?

2,147,483,647 is the max for an int4 (I think) so at 500 million a day
you're looking at more like  4.29 (and change) days

If I'm correct in all the above, there wouldn't be any way to increase the
limit without the server running on a 64-bit machine (which you could do, I
guess)..

*shrug* just some thoughts..

-Mitch

----- Original Message -----
From: "Michael Ansley" <Michael.Ansley@intec-telecom-systems.com>
To: <josh@agliodbs.com>; "sqllist" <pgsql-sql@postgresql.org>
Cc: "Jeff MacDonald" <jeff@pgsql.com>
Sent: Tuesday, October 03, 2000 9:17 AM
Subject: RE: [SQL] OID Perfomance - Object-Relational databases


> Hi, Josh,
>
> In fact, the last point about OIDs is particularly pertinent, because we
are
> expected to process up to 500 million records daily, thus exhausting the
> limit in, um, eight days.
>
> Is anybody aware of when this limit will be raised.
>
> Cheers...
>
> >>   -----Original Message-----
> >>   From: Josh Berkus [mailto:josh@agliodbs.com]
> >>   Sent: 03 October 2000 17:06
> >>   To: sqllist
> >>   Cc: Jeff MacDonald
> >>   Subject: [SQL] OID Perfomance - Object-Relational databases
> >>
> >>
> >>   Folks,
> >>
> >>   Because it's a very elegant solution to my database
> >>   structure issues,
> >>   I'm using OID's extensively as referents and foriegn keys.
> >>    However, I
> >>   wanted to see if others had previous experience in this
> >>   (answer as many
> >>   as you like):
> >>
> >>   1. Is there a performance loss on searches and joins when
> >>   I use the OID
> >>   as a liniking field as opposed to a SERIAL column?
> >>
> >>   2. Can I define my own index on the OIDs of a table?
> >>
> >>   3. What is the difference between these two DDL statements
> >>   in terms of
> >>   data access and PG-SQL performance (assuming that table clients has
> >>   already been defined):
> >>
> >>   CREATE TABLE client_addresses AS (
> >>   client_OID OID REFERENCES clients,
> >>   address1 VARCHAR (30),
> >>   address2 VARCHAR (30),
> >>   address3 VARCHAR (30)
> >>   )
> >>   and:
> >>   CREATE TABLE client_addresses AS (
> >>   client clients,
> >>   address1 VARCHAR (30),
> >>   address2 VARCHAR (30),
> >>   address3 VARCHAR (30)
> >>   )
> >>
> >>   (This is Michael's questions rephrased)
> >>
> >>   4. Int4 seems kinda small to me for a value that needs to enumerate
> >>   every single database object.  Within a couple of years of
> >>   heavy use, a
> >>   customer-transaction database could easily exceed 2 billion objects
> >>   created (and destroyed).  Are there plans to expand this to Int8?
> >>
> >>   -Josh Berkus
> >>
> >>   P.S. My aplolgies if I've already posted these questions; I never
> >>   received them back from the list mailer.
> >>
> >>
> >>
> >>   --
> >>   ______AGLIO DATABASE SOLUTIONS___________________________
> >>                                           Josh Berkus
> >>      Complete information technology      josh@agliodbs.com
> >>       and data management solutions       (415) 436-9166
> >>      for law firms, small businesses       fax  436-0137
> >>       and non-profit organizations.       pager 338-4078
> >>                           San Francisco
> >>
>

RE: OID Perfomance - Object-Relational databases

From

Michael Ansley

Date:

03 October 2000, 12:34:31

<p><font size="2">Unsigned, I think you can double it, although I haven't applied much thought to this, so I could well
bewrong.</font><br /><p><font size="2">>>   -----Original Message-----</font><br /><font size="2">>>  
From:Mitch Vincent [<a href="mailto:mitch@venux.net">mailto:mitch@venux.net</a>]</font><br /><font size="2">>>  
Sent:03 October 2000 17:35</font><br /><font size="2">>>   To: Michael Ansley</font><br /><font
size="2">>>  Cc: sqllist</font><br /><font size="2">>>   Subject: Re: [SQL] OID Perfomance -
Object-Relationaldatabases</font><br /><font size="2">>>   </font><br /><font size="2">>>   </font><br
/><fontsize="2">>>   Aren't OIDs just integers? Isn't this limit just the limit </font><br /><font
size="2">>>  of the value an</font><br /><font size="2">>>   int4 can hold?</font><br /><font
size="2">>>  </font><br /><font size="2">>>   2,147,483,647 is the max for an int4 (I think) so at 500
</font><br/><font size="2">>>   million a day</font><br /><font size="2">>>   you're looking at more like 
4.29(and change) days</font><br /><font size="2">>>   </font><br /><font size="2">>>   If I'm correct in
allthe above, there wouldn't be any way </font><br /><font size="2">>>   to increase the</font><br /><font
size="2">>>  limit without the server running on a 64-bit machine </font><br /><font size="2">>>   (which
youcould do, I</font><br /><font size="2">>>   guess)..</font><br /><font size="2">>>   </font><br /><font
size="2">>>  *shrug* just some thoughts..</font><br /><font size="2">>>   </font><br /><font
size="2">>>  -Mitch</font><br /><font size="2">>>   </font><br /><font size="2">>>   ----- Original
Message-----</font><br /><font size="2">>>   From: "Michael Ansley"
<Michael.Ansley@intec-telecom-systems.com></font><br/><font size="2">>>   To: <josh@agliodbs.com>;
"sqllist"<pgsql-sql@postgresql.org></font><br /><font size="2">>>   Cc: "Jeff MacDonald"
<jeff@pgsql.com></font><br/><font size="2">>>   Sent: Tuesday, October 03, 2000 9:17 AM</font><br /><font
size="2">>>  Subject: RE: [SQL] OID Perfomance - Object-Relational databases</font><br /><font
size="2">>>  </font><br /><font size="2">>>   </font><br /><font size="2">>>   > Hi,
Josh,</font><br/><font size="2">>>   ></font><br /><font size="2">>>   > In fact, the last point
aboutOIDs is particularly </font><br /><font size="2">>>   pertinent, because we</font><br /><font
size="2">>>  are</font><br /><font size="2">>>   > expected to process up to 500 million records daily,
</font><br/><font size="2">>>   thus exhausting the</font><br /><font size="2">>>   > limit in, um,
eightdays.</font><br /><font size="2">>>   ></font><br /><font size="2">>>   > Is anybody aware of
whenthis limit will be raised.</font><br /><font size="2">>>   ></font><br /><font size="2">>>   >
Cheers...</font><br/><font size="2">>>   ></font><br /><font size="2">>>   > >>   -----Original
Message-----</font><br/><font size="2">>>   > >>   From: Josh Berkus [<a
href="mailto:josh@agliodbs.com">mailto:josh@agliodbs.com</a>]</font><br/><font size="2">>>   > >>  
Sent:03 October 2000 17:06</font><br /><font size="2">>>   > >>   To: sqllist</font><br /><font
size="2">>>  > >>   Cc: Jeff MacDonald</font><br /><font size="2">>>   > >>   Subject:
[SQL]OID Perfomance - Object-Relational databases</font><br /><font size="2">>>   > >></font><br /><font
size="2">>>  > >></font><br /><font size="2">>>   > >>   Folks,</font><br /><font
size="2">>>  > >></font><br /><font size="2">>>   > >>   Because it's a very elegant
solutionto my database</font><br /><font size="2">>>   > >>   structure issues,</font><br /><font
size="2">>>  > >>   I'm using OID's extensively as referents and foriegn keys.</font><br /><font
size="2">>>  > >>    However, I</font><br /><font size="2">>>   > >>   wanted to see if
othershad previous experience in this</font><br /><font size="2">>>   > >>   (answer as many</font><br
/><fontsize="2">>>   > >>   as you like):</font><br /><font size="2">>>   > >></font><br
/><fontsize="2">>>   > >>   1. Is there a performance loss on searches and joins when</font><br /><font
size="2">>>  > >>   I use the OID</font><br /><font size="2">>>   > >>   as a liniking
fieldas opposed to a SERIAL column?</font><br /><font size="2">>>   > >></font><br /><font
size="2">>>  > >>   2. Can I define my own index on the OIDs of a table?</font><br /><font
size="2">>>  > >></font><br /><font size="2">>>   > >>   3. What is the difference
betweenthese two DDL statements</font><br /><font size="2">>>   > >>   in terms of</font><br /><font
size="2">>>  > >>   data access and PG-SQL performance (assuming that </font><br /><font
size="2">>>  table clients has</font><br /><font size="2">>>   > >>   already been
defined):</font><br/><font size="2">>>   > >></font><br /><font size="2">>>   > >>  
CREATETABLE client_addresses AS (</font><br /><font size="2">>>   > >>   client_OID OID REFERENCES
clients,</font><br/><font size="2">>>   > >>   address1 VARCHAR (30),</font><br /><font
size="2">>>  > >>   address2 VARCHAR (30),</font><br /><font size="2">>>   > >>  
address3VARCHAR (30)</font><br /><font size="2">>>   > >>   )</font><br /><font size="2">>>   >
>>  and:</font><br /><font size="2">>>   > >>   CREATE TABLE client_addresses AS (</font><br
/><fontsize="2">>>   > >>   client clients,</font><br /><font size="2">>>   > >>  
address1VARCHAR (30),</font><br /><font size="2">>>   > >>   address2 VARCHAR (30),</font><br /><font
size="2">>>  > >>   address3 VARCHAR (30)</font><br /><font size="2">>>   > >>  
)</font><br/><font size="2">>>   > >></font><br /><font size="2">>>   > >>   (This is
Michael'squestions rephrased)</font><br /><font size="2">>>   > >></font><br /><font size="2">>>  
>>>   4. Int4 seems kinda small to me for a value that </font><br /><font size="2">>>   needs to
enumerate</font><br/><font size="2">>>   > >>   every single database object.  Within a couple of years
of</font><br/><font size="2">>>   > >>   heavy use, a</font><br /><font size="2">>>   >
>>  customer-transaction database could easily exceed 2 </font><br /><font size="2">>>   billion
objects</font><br/><font size="2">>>   > >>   created (and destroyed).  Are there plans to expand
</font><br/><font size="2">>>   this to Int8?</font><br /><font size="2">>>   > >></font><br
/><fontsize="2">>>   > >>   -Josh Berkus</font><br /><font size="2">>>   > >></font><br
/><fontsize="2">>>   > >>   P.S. My aplolgies if I've already posted these </font><br /><font
size="2">>>  questions; I never</font><br /><font size="2">>>   > >>   received them back from the
listmailer.</font><br /><font size="2">>>   > >></font><br /><font size="2">>>   >
>></font><br/><font size="2">>>   > >></font><br /><font size="2">>>   > >>  
--</font><br/><font size="2">>>   > >>   ______AGLIO DATABASE
SOLUTIONS___________________________</font><br/><font size="2">>>   >
>>                                          Josh Berkus</font><br /><font size="2">>>   > >>     
Completeinformation technology      josh@agliodbs.com</font><br /><font size="2">>>   > >>       and
datamanagement solutions       (415) 436-9166</font><br /><font size="2">>>   > >>      for law firms,
smallbusinesses       fax  436-0137</font><br /><font size="2">>>   > >>       and non-profit
organizations.      pager 338-4078</font><br /><font size="2">>>   > >>                           San
Francisco</font><br/><font size="2">>>   > >></font><br /><font size="2">>>   ></font><br
/><fontsize="2">>>   </font>

RE: OID Perfomance - Object-Relational databases

From

Michael Ansley

Date:

03 October 2000, 12:53:13

Regarding all of this, can someone tell me in a fairly technical manner what the difference is between searching for a row using it's primary key, and searching using it's OID. Rumour has it that it's faster. Technically, what's the difference?

Cheers...

-----Original Message-----
From: Michael Ansley [mailto:Michael.Ansley@intec-telecom-systems.com]
Sent: 03 October 2000 17:36
To: 'Mitch Vincent'
Cc: sqllist
Subject: RE: [SQL] OID Perfomance - Object-Relational databases

Unsigned, I think you can double it, although I haven't applied much thought to this, so I could well be wrong.
>>   -----Original Message-----
>>   From: Mitch Vincent [mailto:mitch@venux.net]
>>   Sent: 03 October 2000 17:35
>>   To: Michael Ansley
>>   Cc: sqllist
>>   Subject: Re: [SQL] OID Perfomance - Object-Relational databases
>>
>>
>>   Aren't OIDs just integers? Isn't this limit just the limit
>>   of the value an
>>   int4 can hold?
>>
>>   2,147,483,647 is the max for an int4 (I think) so at 500
>>   million a day
>>   you're looking at more like 4.29 (and change) days
>>
>>   If I'm correct in all the above, there wouldn't be any way
>>   to increase the
>>   limit without the server running on a 64-bit machine
>>   (which you could do, I
>>   guess)..
>>
>>   *shrug* just some thoughts..
>>
>>   -Mitch
>>
>>   ----- Original Message -----
>>   From: "Michael Ansley" <Michael.Ansley@intec-telecom-systems.com>
>>   To: <josh@agliodbs.com>; "sqllist" <pgsql-sql@postgresql.org>
>>   Cc: "Jeff MacDonald" <jeff@pgsql.com>
>>   Sent: Tuesday, October 03, 2000 9:17 AM
>>   Subject: RE: [SQL] OID Perfomance - Object-Relational databases
>>
>>
>>   > Hi, Josh,
>>   >
>>   > In fact, the last point about OIDs is particularly
>>   pertinent, because we
>>   are
>>   > expected to process up to 500 million records daily,
>>   thus exhausting the
>>   > limit in, um, eight days.
>>   >
>>   > Is anybody aware of when this limit will be raised.
>>   >
>>   > Cheers...
>>   >
>>   > >>   -----Original Message-----
>>   > >>   From: Josh Berkus [mailto:josh@agliodbs.com]
>>   > >>   Sent: 03 October 2000 17:06
>>   > >>   To: sqllist
>>   > >>   Cc: Jeff MacDonald
>>   > >>   Subject: [SQL] OID Perfomance - Object-Relational databases
>>   > >>
>>   > >>
>>   > >>   Folks,
>>   > >>
>>   > >>   Because it's a very elegant solution to my database
>>   > >>   structure issues,
>>   > >>   I'm using OID's extensively as referents and foriegn keys.
>>   > >>    However, I
>>   > >>   wanted to see if others had previous experience in this
>>   > >>   (answer as many
>>   > >>   as you like):
>>   > >>
>>   > >>   1. Is there a performance loss on searches and joins when
>>   > >>   I use the OID
>>   > >>   as a liniking field as opposed to a SERIAL column?
>>   > >>
>>   > >>   2. Can I define my own index on the OIDs of a table?
>>   > >>
>>   > >>   3. What is the difference between these two DDL statements
>>   > >>   in terms of
>>   > >>   data access and PG-SQL performance (assuming that
>>   table clients has
>>   > >>   already been defined):
>>   > >>
>>   > >>   CREATE TABLE client_addresses AS (
>>   > >>   client_OID OID REFERENCES clients,
>>   > >>   address1 VARCHAR (30),
>>   > >>   address2 VARCHAR (30),
>>   > >>   address3 VARCHAR (30)
>>   > >>   )
>>   > >>   and:
>>   > >>   CREATE TABLE client_addresses AS (
>>   > >>   client clients,
>>   > >>   address1 VARCHAR (30),
>>   > >>   address2 VARCHAR (30),
>>   > >>   address3 VARCHAR (30)
>>   > >>   )
>>   > >>
>>   > >>   (This is Michael's questions rephrased)
>>   > >>
>>   > >>   4. Int4 seems kinda small to me for a value that
>>   needs to enumerate
>>   > >>   every single database object. Within a couple of years of
>>   > >>   heavy use, a
>>   > >>   customer-transaction database could easily exceed 2
>>   billion objects
>>   > >>   created (and destroyed). Are there plans to expand
>>   this to Int8?
>>   > >>
>>   > >>   -Josh Berkus
>>   > >>
>>   > >>   P.S. My aplolgies if I've already posted these
>>   questions; I never
>>   > >>   received them back from the list mailer.
>>   > >>
>>   > >>
>>   > >>
>>   > >>   --
>>   > >>   ______AGLIO DATABASE SOLUTIONS___________________________
>>   > >>                                           Josh Berkus
>>   > >>      Complete information technology      josh@agliodbs.com
>>   > >>       and data management solutions       (415) 436-9166
>>   > >>      for law firms, small businesses       fax 436-0137
>>   > >>       and non-profit organizations.       pager 338-4078
>>   > >>                           San Francisco
>>   > >>
>>   >
>>

Re: OID Perfomance - Object-Relational databases

From

Tom Lane

Date:

03 October 2000, 15:10:53

Josh Berkus <josh@agliodbs.com> writes:
> 1. Is there a performance loss on searches and joins when I use the OID
> as a liniking field as opposed to a SERIAL column?

> 2. Can I define my own index on the OIDs of a table?

There is absolutely no magic about OIDs, except that the system insists
on adding one to every row you store.  In particular, they don't offer
any magic fast way to find a tuple.  If you want fast retrieval by OID
in a particular table then you *MUST* define an index on the OID column,
like so:

CREATE TABLE foo ( ... );

CREATE INDEX foo_oid_index ON foo (oid);

The performance of an index on OID will be indistinguishable from the
performance of an index on an int4 or serial column.

By and large I'd recommend using a serial column in preference to OIDs,
though, for two reasons:

1. dump/restore is more practical that way (don't have to worry about  saving/reloading OIDs).

2. counter overflow problems hit you only per-table, not  per-installation.
        regards, tom lane

Re: OID Perfomance - Object-Relational databases

From

Josh Berkus

Date:

03 October 2000, 20:33:04

Tom,

> By and large I'd recommend using a serial column in preference to OIDs,
> though, for two reasons:
> 
> 1. dump/restore is more practical that way (don't have to worry about
>    saving/reloading OIDs).
> 
> 2. counter overflow problems hit you only per-table, not
>    per-installation.

Hmmm ... for some tables, switching to Serial would work.  However, one
of the things I've done is add universal mod_data (modification stats)
and notes tables, which have to relate via OID because they relate to
5-7 different tables.  To wit:

CREATE TABLE notes AS (ref_OID        OID,staff_OID    OID    REFERENCES staff,note_date    DATE,note_text    TEXT)

And the ref_oid relates to any of 5 different tables, thus allowing a
single table to hold notes on clients, candidates, bills, etc.  Very
elegant, and using serials instead of the OID not possible.

SO I'm concerned about the problems you mentioned above.  pg_dump has a
-o option; are there problems with this?  And how liekly are counter
overflow problems?
Josh Berkus


-- 
______AGLIO DATABASE SOLUTIONS___________________________                                       Josh Berkus  Complete
informationtechnology      josh@agliodbs.com   and data management solutions       (415) 436-9166  for law firms, small
businesses      fax  436-0137   and non-profit organizations.       pager 338-4078                               San
Francisco

Re: OID Perfomance - Object-Relational databases

From

Michael Fork

Date:

03 October 2000, 21:26:39

Just a thought, but couldn't you create a sequence that is referenced by
multiple tables, ie:

CREATE SEQUENCE test_seq;

CREATE TABLE tblclients {client_id    default nextval('test_seq'::text),...}

CREATE TABLE tblbills {bill_id        default nextval('test_seq'::text),...}

and so on...

I have never tried this, but i don't see why it wouldn't work....

Michael Fork - CCNA - MCP - A+
Network Support - Toledo Internet Access - Toledo Ohio

On Tue, 3 Oct 2000, Josh Berkus wrote:

> Tom,
> 
> > By and large I'd recommend using a serial column in preference to OIDs,
> > though, for two reasons:
> > 
> > 1. dump/restore is more practical that way (don't have to worry about
> >    saving/reloading OIDs).
> > 
> > 2. counter overflow problems hit you only per-table, not
> >    per-installation.
> 
> Hmmm ... for some tables, switching to Serial would work.  However, one
> of the things I've done is add universal mod_data (modification stats)
> and notes tables, which have to relate via OID because they relate to
> 5-7 different tables.  To wit:
> 
> CREATE TABLE notes AS (
>     ref_OID        OID,
>     staff_OID    OID    REFERENCES staff,
>     note_date    DATE,
>     note_text    TEXT
>     )
> 
> And the ref_oid relates to any of 5 different tables, thus allowing a
> single table to hold notes on clients, candidates, bills, etc.  Very
> elegant, and using serials instead of the OID not possible.
> 
> SO I'm concerned about the problems you mentioned above.  pg_dump has a
> -o option; are there problems with this?  And how liekly are counter
> overflow problems?
>     
> Josh Berkus
>     
> 
> 
> -- 
> ______AGLIO DATABASE SOLUTIONS___________________________
>                                         Josh Berkus
>    Complete information technology      josh@agliodbs.com
>     and data management solutions       (415) 436-9166
>    for law firms, small businesses       fax  436-0137
>     and non-profit organizations.       pager 338-4078
>                                 San Francisco
>

Re: OID Perfomance - Object-Relational databases

From

Tom Lane

Date:

03 October 2000, 23:37:06

Michael Fork <mfork@toledolink.com> writes:
> Just a thought, but couldn't you create a sequence that is referenced by
> multiple tables,

I was going to suggest exactly that.  It's not quite as simple as a
"serial" column declaration, but you can split the use of an ID sequence
generator over just as many tables as you need to have unique IDs
across.  That way you don't have a problem at dump/reload time,
and you don't exhaust your ID space any faster than you must.
        regards, tom lane

Re: OID Perfomance - Object-Relational databases

From

Tom Lane

Date:

03 October 2000, 23:43:53

Josh Berkus <josh@agliodbs.com> writes:
> SO I'm concerned about the problems you mentioned above.  pg_dump has a
> -o option; are there problems with this?  And how liekly are counter
> overflow problems?

The trouble with pg_dump -o is that after reload, the OID generator
will be set to max(any OID in the dumped data).  So a dump & reload
doesn't do anything to postpone OID-wraparound Ragnarok.

As for the likelihood of overflow, figure 4G / tuple creation rate
for your installation (not database, but whole installation controlled
by one postmaster).  Unless your installation has just one active
table, per-table sequence values look like a better bet.

BTW, there *is* talk of providing an 8-byte-OID option, but I'm not
holding my breath for it.
        regards, tom lane

Re: OID Perfomance - Object-Relational databases

From

"Josh Berkus"

Date:

04 October 2000, 00:10:36

Tom,

> The trouble with pg_dump -o is that after reload, the OID
> generator
> will be set to max(any OID in the dumped data).  So a
> dump & reload
> doesn't do anything to postpone OID-wraparound Ragnarok.
> 
> As for the likelihood of overflow, figure 4G / tuple
> creation rate
> for your installation (not database, but whole
> installation controlled
> by one postmaster).  Unless your installation has just
> one active
> table, per-table sequence values look like a better bet.

Somebody (urgently) needs to tell all of the above to Bruce
Momjian (I've cc'd him); his book-in-the-making points up
OID's as a convenient and universal way to identify and link
tuples (chapter 7) and doen't mention these problems.  Who
can I bug about how useless the above makes OID's?

Thanks for the warning, and thanks Michael for the
suggestion; I'll use it and send you all back notes on how
it affects performance.

-Josh

RE: OID Perfomance - Object-Relational databases

From

Michael Ansley

Date:

04 October 2000, 05:26:15

<p><font size="2">I'm a little concerned about all this, because my understanding is that what makes an object database
sofast is its ability to directly reference tuples, so that traversing relationships becomes like traversing
pointers.</font><p><fontsize="2">The achilles heel of relational databases is the inability to do exactly that. 
Postgreshas a sufficiently extended query language that traversing relationships in this manner can be dealt with, and
doingthis from a decent OO development tool should remove the general pain of dealing with OIDs.</font><p><font
size="2">Oris there something that I'm missing?  What I expect from an OR database is the speed of an object database
(whichseems to come mainly from it's OID mechanism), with the manipulative power of a traditional relational database,
andthen a whole truckload of extras, like rules, procedures, an OR query language, etc.  So, effectively, it's more
thanthe sum of ODB and RDB.</font><p><font size="2">In order to promote Postgres in the environment that I'm in now, I
needit to attain at least the same order of speed as an ODB.  Are there any benchmarks anywhere that I can quote
againstODBs?  I need some help here, because the general feeling that I'm up against is that we should be using either
ODBor RDB, not ORDB as it is the worst of both worlds, being slow, not completely object-orientated, and not as
flexibleas ODBs, and less robust than RDBs, because of the object extensions.  I need some ammo.</font><p><font
size="2">MikeA</font><br/><br /><br /><p><font size="2">>>   -----Original Message-----</font><br /><font
size="2">>>  From: Josh Berkus [<a href="mailto:josh@agliodbs.com">mailto:josh@agliodbs.com</a>]</font><br
/><fontsize="2">>>   Sent: 04 October 2000 05:11</font><br /><font size="2">>>   To:
pgsql-sql@postgresql.org</font><br/><font size="2">>>   Cc: pgman@candle.pha.pa.us</font><br /><font
size="2">>>  Subject: Re: [SQL] OID Perfomance - Object-Relational databases </font><br /><font
size="2">>>  </font><br /><font size="2">>>   </font><br /><font size="2">>>   Tom,</font><br /><font
size="2">>>  </font><br /><font size="2">>>   > The trouble with pg_dump -o is that after reload, the
OID</font><br/><font size="2">>>   > generator</font><br /><font size="2">>>   > will be set to
max(anyOID in the dumped data).  So a</font><br /><font size="2">>>   > dump & reload</font><br /><font
size="2">>>  > doesn't do anything to postpone OID-wraparound Ragnarok.</font><br /><font size="2">>>  
></font><br /><font size="2">>>   > As for the likelihood of overflow, figure 4G / tuple</font><br /><font
size="2">>>  > creation rate</font><br /><font size="2">>>   > for your installation (not database,
butwhole</font><br /><font size="2">>>   > installation controlled</font><br /><font size="2">>>   >
byone postmaster).  Unless your installation has just</font><br /><font size="2">>>   > one active</font><br
/><fontsize="2">>>   > table, per-table sequence values look like a better bet.</font><br /><font
size="2">>>  </font><br /><font size="2">>>   Somebody (urgently) needs to tell all of the above to
Bruce</font><br/><font size="2">>>   Momjian (I've cc'd him); his book-in-the-making points up</font><br /><font
size="2">>>  OID's as a convenient and universal way to identify and link</font><br /><font size="2">>>  
tuples(chapter 7) and doen't mention these problems.  Who</font><br /><font size="2">>>   can I bug about how
uselessthe above makes OID's?</font><br /><font size="2">>>   </font><br /><font size="2">>>   Thanks for
thewarning, and thanks Michael for the</font><br /><font size="2">>>   suggestion; I'll use it and send you all
backnotes on how</font><br /><font size="2">>>   it affects performance.</font><br /><font size="2">>>  
</font><br/><font size="2">>>   -Josh</font><br /><font size="2">>>   </font><br /><font
size="2">>>  </font><br /><font size="2">>>   </font><br /><font size="2">>>   </font><br /><font
size="2">>>  </font>

Re: OID Perfomance - Object-Relational databases

From

Jeff MacDonald

Date:

04 October 2000, 07:44:08

Hi Josh,

I didn't get right tho this, because well.. you asked
the list, i figured i'd give them a chance first. and they've
answered the same as i would have..

>     Because it's a very elegant solution to my database structure issues,
> I'm using OID's extensively as referents and foriegn keys.  However, I
> wanted to see if others had previous experience in this (answer as many
> as you like):
> 
> 1. Is there a performance loss on searches and joins when I use the OID
> as a liniking field as opposed to a SERIAL column?

yes, because indexes are automatcally created on serials.

> 2. Can I define my own index on the OIDs of a table?

sure, create index oid_idx on table(oid);


Jeff MacDonald,

-----------------------------------------------------
PostgreSQL Inc        | Hub.Org Networking Services
jeff@pgsql.com        | jeff@hub.org
www.pgsql.com        | www.hub.org
1-902-542-0713        | 1-902-542-3657
-----------------------------------------------------
Facsimile : 1 902 542 5386
IRC Nick  : bignose

RE: Re: OID Perfomance - Object-Relational databases

From

Michael Ansley

Date:

04 October 2000, 08:57:57

<p><font size="2">Given what we've been discussing over the past day or so, can anybody explain this to
me:</font><p><fontsize="2">dev=# explain select client.address.postcode from client;</font><br /><font
size="2">NOTICE: QUERY PLAN:</font><p><font size="2">Seq Scan on client  (cost=0.00..1.01 rows=1
width=4)</font><p><fontsize="2">EXPLAIN</font><br /><font size="2">dev=# select client.address.postcode from
client;</font><br/><font size="2">ERROR:  init_fcache: Cache lookup failed for procedure 18935</font><br /><p><font
size="2">Cheers...</font>

Re: OID Perfomance - Object-Relational databases

From

Bruce Momjian

Date:

08 October 2000, 21:56:52

[ Charset ISO-8859-1 unsupported, converting... ]
> Tom,
> 
> > The trouble with pg_dump -o is that after reload, the OID
> > generator
> > will be set to max(any OID in the dumped data).  So a
> > dump & reload
> > doesn't do anything to postpone OID-wraparound Ragnarok.
> > 
> > As for the likelihood of overflow, figure 4G / tuple
> > creation rate
> > for your installation (not database, but whole
> > installation controlled
> > by one postmaster).  Unless your installation has just
> > one active
> > table, per-table sequence values look like a better bet.
> 
> Somebody (urgently) needs to tell all of the above to Bruce
> Momjian (I've cc'd him); his book-in-the-making points up
> OID's as a convenient and universal way to identify and link
> tuples (chapter 7) and doen't mention these problems.  Who
> can I bug about how useless the above makes OID's?
> 

Well, you know, everyone complains about wrap-around, but no one has
ever reported it happening.  It is like the Y2K thing where everyone
thought they would starve.  Please, someone tell me they have had had
OID rollover, and I will start doing something about it.

Also, 500 million transactions a day?  Seems impossible to me.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: OID Perfomance - Object-Relational databases

From

"Josh Berkus"

Date:

09 October 2000, 13:34:57

ruce,
   > The oid counter is preserved with -o on reload. It is   > not reset.
   I'll let you and Tom duke this one out. :-) It's all
beyond   me. 
   > > 2. When OID's "wrap around" does the whole database
go   > > kablooie? If so, why hasn't it happened to anyone
yet?   > If   > > not, can you describe the system PGSQL uses to
allocate   > OIDs   > > once it gets to 2,147,xxx,xxx?   >    > oid's start getting re-used on wraparound.   > 
   This is what I mean. Does the DB engine only recycle   *unused* OIDs (that is, does it check for teh continued
existanceof a tuple with OID 198401)? If that's the   method, then there isn't really a problem even if I do
 
use   OIDs as a primary index. None of my OIDs still in use
will   be touched.
   If OIDs start getting re-used regardless if they are
already   present, then, like Tom says, it's Ragnarok. But it
seems   like somebody would have increased the OID to INT8 if
that   were a prospect.
   -Josh Berkus
   P.S. Bruce, I'm sorry about not sending my comments on
your   book. Do you have any use for copy-editing comments from   the June 28th version, or are you already in
pre-press?

Re: OID Perfomance - Object-Relational databases

From

"Josh Berkus"

Date:

09 October 2000, 15:04:16

Bruce, Tom, etc.:

> > This is what I mean.  Does the DB engine only recycle
> > *unused* OIDs (that is, does it check for teh continued
> > existance of a tuple with OID 198401)?  If that's the
> > method, then there isn't really a problem even if I do
> use
> > OIDs as a primary index.  None of my OIDs still in use
> will
> > be touched.
> 
> 
> No, it uses all oids, and can create duplicates.

Does this mean that Tom's "Wraparound Ragnarok" is the
accurate scenario?

-Josh Berkus

Re: OID Perfomance - Object-Relational databases

From

Bruce Momjian

Date:

09 October 2000, 15:19:09

[ Charset ISO-8859-1 unsupported, converting... ]
> Bruce, Tom, etc.:
> 
> > > This is what I mean.  Does the DB engine only recycle
> > > *unused* OIDs (that is, does it check for teh continued
> > > existance of a tuple with OID 198401)?  If that's the
> > > method, then there isn't really a problem even if I do
> > use
> > > OIDs as a primary index.  None of my OIDs still in use
> > will
> > > be touched.
> > 
> > 
> > No, it uses all oids, and can create duplicates.
> 
> Does this mean that Tom's "Wraparound Ragnarok" is the
> accurate scenario?

Yes.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026

Re: OID Perfomance - Object-Relational databases

From

Tom Lane

Date:

09 October 2000, 16:15:08

"Josh Berkus" <josh@agliodbs.com> writes:
>> No, it uses all oids, and can create duplicates.

> Does this mean that Tom's "Wraparound Ragnarok" is the
> accurate scenario?

There's nothing particularly magic about OIDs.  If your programs
don't use OIDs to identify rows, then you don't care whether there
are duplicate OIDs or not.  If they do, you can still protect yourself
by creating a unique index on OID for the tables where you need OID to
be unique.  You might get unexpected insert failures due to OID conflict
after a wraparound, but the odds are in your favor (ie, retrying will
probably succeed).

Postgres itself assumes that OIDs are unique in some of the system
tables, but I believe we have unique indexes in place for the tables
where it matters.  So, again, transient insert failures seem like the
worst likely consequence ... annoying, but not exactly Ragnarok.

Transaction ID wraparound is a less pleasant scenario :-(.  You'd
probably be well advised to dump/reload your DB when you start to
approach 4G transactions through it.  I think we might be able to
fix this so that things are OK as long as you vacuum more frequently
than once per wraparound, but no one's looked at the problem yet.

As Bruce pointed out, we have yet to hear from anyone who's run a
Postgres DB long enough to run into either sort of wraparound, so
these issues don't seem all that pressing...
        regards, tom lane

Re: OID Perfomance - Object-Relational databases

From

Josh Berkus

Date:

09 October 2000, 19:01:44

Tom, Bruce,
Thanks.  I think that gives me a pretty clear picture.  How can we
submit this whole OID thing to the PGSQL FAQ?  Want me to write it up?
                -Josh Berkus

P.S. BTW, my conclusion based on this discussion is that I will not use
the OIDs for indexing/linking.  Instead, I will embrace Michael's
earlier suggestion and have already created universal_sq and started
migrating primary keys to that sequence.  

P.P.S. Thank you both for taking the time to hash out this issue.


-- 
______AGLIO DATABASE SOLUTIONS___________________________                                       Josh Berkus  Complete
informationtechnology      josh@agliodbs.com   and data management solutions       (415) 436-9166  for law firms, small
businesses      fax  436-0137   and non-profit organizations.       pager 338-4078                               San
Francisco

Re: OID Perfomance - Object-Relational databases

From

Bruce Momjian

Date:

09 October 2000, 19:37:20

There is an FAQ item about oid's already on the web site.  It may not
be in 7.0.2.

> Tom, Bruce,
> 
>     Thanks.  I think that gives me a pretty clear picture.  How can we
> submit this whole OID thing to the PGSQL FAQ?  Want me to write it up?
> 
>                     -Josh Berkus
> 
> P.S. BTW, my conclusion based on this discussion is that I will not use
> the OIDs for indexing/linking.  Instead, I will embrace Michael's
> earlier suggestion and have already created universal_sq and started
> migrating primary keys to that sequence.  
> 
> P.P.S. Thank you both for taking the time to hash out this issue.
> 
> 
> -- 
> ______AGLIO DATABASE SOLUTIONS___________________________
>                                         Josh Berkus
>    Complete information technology      josh@agliodbs.com
>     and data management solutions       (415) 436-9166
>    for law firms, small businesses       fax  436-0137
>     and non-profit organizations.       pager 338-4078
>                                 San Francisco
> 


--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026