Thread: 4 billion + oids

4 billion + oids

From
"Andrew Bartley"
Date:

Hi all,

 

We have a problem with our PostgreSQL application.  We are trying to decide if this is a major problem with Postgres, or is due to our application design.

 

Our application generates thousands of temp tables and data is moved and sorted several times in each batched insertion of data.

 

We chew though OIDs very quickly.  At our customers’ site, the OID count is exceeding 4 billion and wrapping every two weeks (approx.). 

 

We have found this to be a major problem.  It seems once the OIDs wrap; we constantly get errors due to "Cannot insert a duplicate key into unique index pg_class_oid_index".  There are about 3,000 entries in pg_class at this stage.  As most of the tables are dynamically generated, a failure means lost data.

 

I am in the middle of designing an application change as a stop-gap fix: 

  1. Change all creation of temp tables with "without oids", hoping to reduce the consumption of OIDS
  2. Check for the error string "Cannot insert a duplicate key into unique index pg_class_oid_index" in the batch shells for each function call and re-run if required.
  3. Remove all truncate statements from processing
  4. Trigger dump and restore of the database during nightly processing if the error occurs.

 

Can anyone comment if they know this is a fundamental limitation of PostgreSQL and if other databases have addressed this problem?  Or alternatively, is it a problem with our application that must be corrected?

 

If anyone can instruct me on a better way to handle this, it would be appreciated.

 

We are running PostgreSQL 7.2.1 on Pentium 4 x86 type systems.
 
 
Thanks
 
Andrew Bartley
Evolvo Systems Pty Ltd.
Level 3, 351-353 Elizabeth Street
Melbourne, VIC, 3000
andrewb@evolvosystems.com
tel. +613 9642 3200
fax. +613 9642 4108
mob. +613 0414 520 920

Re: 4 billion + oids

From
"Daniel R. Anderson"
Date:
You need very much to read the manual and administrator's guide.  I
/believe/ it is possible to recompile postgresql with support for more
oids.  Remember to vacuum and analyze routinely.  And when you recompile
upgrade to 7.3.2.
--
Daniel R. Anderson
Chief Lab Rat and Helper Monkey
Great Lakes Industries, Inc.
80 Pineview Ave.
Buffalo, NY 14218
(716) 691-5900 x218

"Never let your schooling interfere with your education"
    -- Mark Twain


Re: 4 billion + oids

From
Neil Conway
Date:
On Sun, 2003-03-23 at 21:42, Daniel R. Anderson wrote:
> You need very much to read the manual and administrator's guide.  I
> /believe/ it is possible to recompile postgresql with support for more
> oids.

No, it's not.

> Remember to vacuum and analyze routinely.

AFAIK, neither of these would have any effect on OID usage.

Cheers,

Neil


Re: 4 billion + oids

From
Martijn van Oosterhout
Date:
On Mon, Mar 24, 2003 at 01:35:54PM +1100, Andrew Bartley wrote:

> We chew though OIDs very quickly.  At our customers' site, the OID count
> is exceeding 4 billion and wrapping every two weeks (approx.).
>
>   1.. Change all creation of temp tables with "without oids", hoping to reduce the consumption of OIDS

This is your solution. Do oyu use OID anywhere within your application. If
not (you shouldn't be) create all table "without oids". If you stop using
OIDs so rapidly, your problems should go away.

>   2.. Check for the error string "Cannot insert a duplicate key into unique index pg_class_oid_index" in the batch
shellsfor each function call and re-run if required.  

> Can anyone comment if they know this is a fundamental limitation of
> PostgreSQL and if other databases have addressed this problem?  Or
> alternatively, is it a problem with our application that must be
> corrected?

Postgresql uses OIDs for tables and triggers internally. If you don't need
OIDs within your data tables you should be able to reduce your OID usage
considerably.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> IT is not something like pizza that you order in at one o'clock in
> the morning. - John Loebenstein, St George CIO

Attachment

Re: 4 billion + oids

From
Neil Conway
Date:
On Sun, 2003-03-23 at 21:35, Andrew Bartley wrote:
>      1. Change all creation of temp tables with "without oids", hoping
>         to reduce the consumption of OIDS

That should be the first thing you should do (in 7.4, there will be an
ALTER TABLE command to disable OIDs). If you're not making use of OIDs
in your application, you can specify WITHOUT OIDS for all user tables.
IMHO this should be the default for CREATE TABLE some time in the near
future.

>      1. Check for the error string "Cannot insert a duplicate key into
>         unique index pg_class_oid_index" in the batch shells for each
>         function call and re-run if required.

If you define all your tables (especially temp tables) using WITHOUT
OIDS, you shouldn't need to bother with this.

> Can anyone comment if they know this is a fundamental limitation of
> PostgreSQL

It's not a "fundamental limitation" at all, merely a problem of
remaining backward compatible with the behavior of previous PostgreSQL
releases.

> We are running PostgreSQL 7.2.1 on Pentium 4 x86 type systems.

Upgrading to the latest stable 7.2 release (7.2.4), or if possible
7.3.2, is always encouraged.

Cheers,

Neil


Re: 4 billion + oids

From
"Andrew Bartley"
Date:
Thanks for your comments.



I don't think we use OIDS in the application.  Is the cluster command
dependent on OIDS at the row/tupple level?



Now that the application is live at our customers' site, have you any
suggestions on how we go about converting all of our permanent and
dynamically created app tables?



I am thinking of doing a schema dump and editing the file globally.  Then a
schema load then a restore.



Is there a PG table I can update to achieve the same end?



Thanks again



Andrew Bartley


----- Original Message -----
From: "Neil Conway" <neilc@samurai.com>
To: "Andrew Bartley" <abartley@evolvosystems.com>
Cc: "PostgreSQL General" <pgsql-general@postgresql.org>
Sent: Monday, March 24, 2003 2:14 PM
Subject: Re: [GENERAL] 4 billion + oids


> On Sun, 2003-03-23 at 21:35, Andrew Bartley wrote:
> >      1. Change all creation of temp tables with "without oids", hoping
> >         to reduce the consumption of OIDS
>
> That should be the first thing you should do (in 7.4, there will be an
> ALTER TABLE command to disable OIDs). If you're not making use of OIDs
> in your application, you can specify WITHOUT OIDS for all user tables.
> IMHO this should be the default for CREATE TABLE some time in the near
> future.
>
> >      1. Check for the error string "Cannot insert a duplicate key into
> >         unique index pg_class_oid_index" in the batch shells for each
> >         function call and re-run if required.
>
> If you define all your tables (especially temp tables) using WITHOUT
> OIDS, you shouldn't need to bother with this.
>
> > Can anyone comment if they know this is a fundamental limitation of
> > PostgreSQL
>
> It's not a "fundamental limitation" at all, merely a problem of
> remaining backward compatible with the behavior of previous PostgreSQL
> releases.
>
> > We are running PostgreSQL 7.2.1 on Pentium 4 x86 type systems.
>
> Upgrading to the latest stable 7.2 release (7.2.4), or if possible
> 7.3.2, is always encouraged.
>
> Cheers,
>
> Neil
>
>
>


Re: 4 billion + oids

From
Tom Lane
Date:
"Andrew Bartley" <abartley@evolvosystems.com> writes:
> We have found this to be a major problem.  It seems once the OIDs wrap; we =
> constantly get errors due to "Cannot insert a duplicate key into unique ind=
> ex pg_class_oid_index".  There are about 3,000 entries in pg_class at this =
> stage.

Once the OID counter wraps, there's certainly some risk of OID
collisions.  However, if you have only 3000 entries in pg_class it's
hard to see why the odds would be worse than 3000/4billion or less than
one failure in 1 million tries.  I think there is something you have not
told us.

The nearby suggestions to minimize the rate of OID consumption seem
rather beside the point from here ... what I'd wonder about is why you
need as many as three thousand tables.  Reducing that footprint should
reduce the odds of OID collision.

            regards, tom lane


Re: 4 billion + oids

From
Amin Abdulghani
Date:
My guess is most of the applications on postgress wouldn't
totally rely on oids, though they may implicitly use them
if they use standard sql create table statements. My
concern is that during the wrap arounds this could create
unintended problems in table, index creations or
elsewhere. Probably its worthwhile to enumerate the list
of potential problems (eg what we now know table creation,
index creation), their error messages (so applicatons can
handle them cleanly) and possibly their workarounds. This
list could then be very useful as part of the discussion
on oids in the documentation.

Thanks...
Amin


On Mon, 24 Mar 2003 01:46:36 -0500
  Tom Lane <tgl@sss.pgh.pa.us> wrote:
>"Andrew Bartley" <abartley@evolvosystems.com> writes:
>> We have found this to be a major problem.  It seems once
>>the OIDs wrap; we =
>> constantly get errors due to "Cannot insert a duplicate
>>key into unique ind=
>> ex pg_class_oid_index".  There are about 3,000 entries
>>in pg_class at this =
>> stage.
>
>Once the OID counter wraps, there's certainly some risk
>of OID
>collisions.  However, if you have only 3000 entries in
>pg_class it's
>hard to see why the odds would be worse than
>3000/4billion or less than
>one failure in 1 million tries.  I think there is
>something you have not
>told us.
>
>The nearby suggestions to minimize the rate of OID
>consumption seem
>rather beside the point from here ... what I'd wonder
>about is why you
>need as many as three thousand tables.  Reducing that
>footprint should
>reduce the odds of OID collision.
>
>            regards, tom lane
>
>
>---------------------------(end of
>broadcast)---------------------------
>TIP 6: Have you searched our list archives?
>
>http://archives.postgresql.org


Re: 4 billion + oids

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Amin Abdulghani [mailto:amin@quantiva.com]
> Sent: Monday, March 24, 2003 8:43 AM
> To: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 4 billion + oids
>
> My guess is most of the applications on postgress wouldn't
> totally rely on oids, though they may implicitly use them
> if they use standard sql create table statements. My
> concern is that during the wrap arounds this could create
> unintended problems in table, index creations or
> elsewhere. Probably its worthwhile to enumerate the list
> of potential problems (eg what we now know table creation,
> index creation), their error messages (so applicatons can
> handle them cleanly) and possibly their workarounds. This
> list could then be very useful as part of the discussion
> on oids in the documentation.

I have definitely seen serious problems relating to OID wrap.
We have an application that uses the OID's to create joins.
We scan tables from a list of source tables from some external (usually
non-postgresql) database.
For each of these tables we create a 64 bit checksum for each record and
store it in a table, along with an OID.
Then, after the passage of time, we rename the current checksum table
and repeat the process.
Then, we can do joins between the two checksum tables and see which
records are updated, added or deleted.
Often, we deal with systems that have tens or even hundreds of millions
of records, and the tasks run on a nightly basis.
Hence, we eventually run out of new OIDs.

Our current solution is to drop the database and recreate it.  It's an
awful solution to the problem but currently there is nothing better.  It
requires us to run a full synchronization instead of an incremental,
which is orders of magnitude slower.

It sure would be nice to have 64 bit OID values.


Re: 4 billion + oids

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
> I have definitely seen serious problems relating to OID wrap.
> We have an application that uses the OID's to create joins.
> We scan tables from a list of source tables from some external (usually
> non-postgresql) database.
> For each of these tables we create a 64 bit checksum for each record and
> store it in a table, along with an OID.

Why in the world aren't you using a serial int8 column, instead?

            regards, tom lane


Re: 4 billion + oids

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Monday, March 24, 2003 11:17 AM
> To: Dann Corbit
> Cc: pgsql-general@postgresql.org
> Subject: Re: [GENERAL] 4 billion + oids
>
>
> "Dann Corbit" <DCorbit@connx.com> writes:
> > I have definitely seen serious problems relating to OID
> wrap. We have
> > an application that uses the OID's to create joins. We scan tables
> > from a list of source tables from some external (usually
> > non-postgresql) database.
> > For each of these tables we create a 64 bit checksum for
> each record
> > and store it in a table, along with an OID.
>
> Why in the world aren't you using a serial int8 column, instead?

I am using Postgresql 7.1.3 (our own native port) and I was not aware
that feature was available.
If I can create tables without OID's and with 8 byte serial columns I
will change to that.

The documentation I read said that serial values were 4 bytes long.  How
do I declare and 8 byte serial column?


Re: 4 billion + oids

From
Tom Lane
Date:
"Dann Corbit" <DCorbit@connx.com> writes:
>> Why in the world aren't you using a serial int8 column, instead?

> I am using Postgresql 7.1.3 (our own native port) and I was not aware
> that feature was available.

It may not be, in 7.1 --- I forget when we widened sequences to int8.

However, 64-bit OIDs are certainly never going to appear in 7.1 either.

            regards, tom lane


Re: 4 billion + oids

From
Andrew Sullivan
Date:
On Mon, Mar 24, 2003 at 02:29:34PM -0500, Tom Lane wrote:
> It may not be, in 7.1 --- I forget when we widened sequences to int8.

7.2.  The SERIAL fields are int4 still, though.

A
--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: 4 billion + oids

From
Larry Rosenman
Date:

--On Monday, March 24, 2003 16:28:01 -0500 Andrew Sullivan
<andrew@libertyrms.info> wrote:

> On Mon, Mar 24, 2003 at 02:29:34PM -0500, Tom Lane wrote:
>> It may not be, in 7.1 --- I forget when we widened sequences to int8.
>
> 7.2.  The SERIAL fields are int4 still, though.
but there is a BIGSERIAL or SERIAL8 IIRC.


>
> A
> --
> ----
> Andrew Sullivan                         204-4141 Yonge Street
> Liberty RMS                           Toronto, Ontario Canada
> <andrew@libertyrms.info>                              M2P 2A8
>                                          +1 416 646 3304 x110
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster
>



--
Larry Rosenman                     http://www.lerctr.org/~ler
Phone: +1 972-414-9812                 E-Mail: ler@lerctr.org
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


Re: 4 billion + oids

From
Andrew Sullivan
Date:
On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote:
> but there is a BIGSERIAL or SERIAL8 IIRC.

Yes.  I just was trying to note that it's not like you automatically
get the advantage of the larger sequences if you dump a 7.1.x
database and load it into 7.2.  (Please redirect discussions about
how hard/easy Postgres is to upgrade to some other list ;-)

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110


Re: 4 billion + oids

From
"Andrew Bartley"
Date:
I hope this helps.

PostgreSQL OID problem summary

The default object creation properties set OID's to "with" at tuple level.

OIDs start at 0 and reach a limit at 4,294,967,295.  New OID's are assigned
for each object created.

OIDs will reach the maximum limit, this will cause an "OID wrap".  An OID
wrap will cause a problem eventually.
Its a statistical probability that data loss will occur.

Suggested solution(s)

1.reduce oid consumption
 pros - less chance of oid wrap
 cons - wrap will happen (but may take many years)

2.drop and reload database
 pros - starts oid count from 0
 cons - downtime is significant (maybe many hours)

3.reduce number of tables,indexes etc.
 pros - less chance of oid clash
 cons - application may require high number of tables,indexes etc.

Re: our application problem, the cons for solutions 2/3 are significant so
as to prevent
them being used in our application.

Solution 1 is practical.  The database will be dropped and reloaded with a
modified schema that does
not use OID's at the tuple level.  No user tables at the tuple level will
use OID's in the new schema.

We believe that our OID consumption will decrease from 4,294,967,295 every
two weeks (approx.), to 4,294,967,295 every 190 years (approx.)

Thanks

Andrew Bartley

----- Original Message -----
From: "Andrew Sullivan" <andrew@libertyrms.info>
To: <pgsql-general@postgresql.org>
Sent: Tuesday, March 25, 2003 8:39 AM
Subject: Re: [GENERAL] 4 billion + oids


> On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote:
> > but there is a BIGSERIAL or SERIAL8 IIRC.
>
> Yes.  I just was trying to note that it's not like you automatically
> get the advantage of the larger sequences if you dump a 7.1.x
> database and load it into 7.2.  (Please redirect discussions about
> how hard/easy Postgres is to upgrade to some other list ;-)
>
> A
>
> --
> ----
> Andrew Sullivan                         204-4141 Yonge Street
> Liberty RMS                           Toronto, Ontario Canada
> <andrew@libertyrms.info>                              M2P 2A8
>                                          +1 416 646 3304 x110
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>


Re: 4 billion + oids

From
Alvaro Herrera
Date:
On Tue, Mar 25, 2003 at 11:43:57AM +1100, Andrew Bartley wrote:

> 3.reduce number of tables,indexes etc.
>  pros - less chance of oid clash
>  cons - application may require high number of tables,indexes etc.
>
> Re: our application problem, the cons for solutions 2/3 are
> significant so as to prevent them being used in our application.

Note that you probably need to REINDEX the table pg_class if you use too
much temp tables.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"La gente vulgar solo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"


Turple level or object level OIDs?

From
Flower Sun
Date:

Hi, group,

How do I know whether my postgresql database is using Turple level or object level OIDs?

postgresql 7.2.3 on Redhat 8.0

Thank you

 Andrew Bartley <abartley@evolvosystems.com> wrote:

I hope this helps.

PostgreSQL OID problem summary

The default object creation properties set OID's to "with" at tuple level.

OIDs start at 0 and reach a limit at 4,294,967,295. New OID's are assigned
for each object created.

OIDs will reach the maximum limit, this will cause an "OID wrap". An OID
wrap will cause a problem eventually.
Its a statistical probability that data loss will occur.

Suggested solution(s)

1.reduce oid consumption
pros - less chance of oid wrap
cons - wrap will happen (but may take many years)

2.drop and reload database
pros - starts oid count from 0
cons - downtime is significant (maybe many hours)

3.reduce number of tables,indexes etc.
pros - less chance of oid clash
cons - application may require high number of tables,indexes etc.

Re: our application problem, the cons for solu tions 2/3 are significant so
as to prevent
them being used in our application.

Solution 1 is practical. The database will be dropped and reloaded with a
modified schema that does
not use OID's at the tuple level. No user tables at the tuple level will
use OID's in the new schema.

We believe that our OID consumption will decrease from 4,294,967,295 every
two weeks (approx.), to 4,294,967,295 every 190 years (approx.)

Thanks

Andrew Bartley

----- Original Message -----
From: "Andrew Sullivan"
To:
Sent: Tuesday, March 25, 2003 8:39 AM
Subject: Re: [GENERAL] 4 billion + oids


> On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote:
> > but there is a BIGSERIAL or SERIAL8 IIRC.
>
> Yes. I just was trying to note that it's not like you automatically
> get the advantage of the larger sequences if you dump a 7.1.x
> database and load it into 7.2. (Please redirect discussions about
> how hard/easy Postgres is to upgrade to some other list ;-)
>
> A
>
> --
> ----
> Andrew Sullivan 204-4141 Yonge Street
> Liberty RMS Toronto, Ontario Canada
> M2P 2A8
> +1 416 646 3304 x110
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org



Do you Yahoo!?
Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop!