Thread: 4 billion + oids
Hi all,
We have a problem with our PostgreSQL application. We are trying to decide if this is a major problem with Postgres, or is due to our application design.
Our application generates thousands of temp tables and data is moved and sorted several times in each batched insertion of data.
We chew though OIDs very quickly. At our customers site, the OID count is exceeding 4 billion and wrapping every two weeks (approx.).
We have found this to be a major problem. It seems once the OIDs wrap; we constantly get errors due to "Cannot insert a duplicate key into unique index pg_class_oid_index". There are about 3,000 entries in pg_class at this stage. As most of the tables are dynamically generated, a failure means lost data.
I am in the middle of designing an application change as a stop-gap fix:
- Change all creation of temp tables with "without oids", hoping to reduce the consumption of OIDS
- Check for the error string "Cannot insert a duplicate key into unique index pg_class_oid_index" in the batch shells for each function call and re-run if required.
- Remove all truncate statements from processing
- Trigger dump and restore of the database during nightly processing if the error occurs.
Can anyone comment if they know this is a fundamental limitation of PostgreSQL and if other databases have addressed this problem? Or alternatively, is it a problem with our application that must be corrected?
If anyone can instruct me on a better way to handle this, it would be appreciated.
Level 3, 351-353 Elizabeth Street
Melbourne, VIC, 3000
andrewb@evolvosystems.com
tel. +613 9642 3200
fax. +613 9642 4108
mob. +613 0414 520 920
You need very much to read the manual and administrator's guide. I /believe/ it is possible to recompile postgresql with support for more oids. Remember to vacuum and analyze routinely. And when you recompile upgrade to 7.3.2. -- Daniel R. Anderson Chief Lab Rat and Helper Monkey Great Lakes Industries, Inc. 80 Pineview Ave. Buffalo, NY 14218 (716) 691-5900 x218 "Never let your schooling interfere with your education" -- Mark Twain
On Sun, 2003-03-23 at 21:42, Daniel R. Anderson wrote: > You need very much to read the manual and administrator's guide. I > /believe/ it is possible to recompile postgresql with support for more > oids. No, it's not. > Remember to vacuum and analyze routinely. AFAIK, neither of these would have any effect on OID usage. Cheers, Neil
On Mon, Mar 24, 2003 at 01:35:54PM +1100, Andrew Bartley wrote: > We chew though OIDs very quickly. At our customers' site, the OID count > is exceeding 4 billion and wrapping every two weeks (approx.). > > 1.. Change all creation of temp tables with "without oids", hoping to reduce the consumption of OIDS This is your solution. Do oyu use OID anywhere within your application. If not (you shouldn't be) create all table "without oids". If you stop using OIDs so rapidly, your problems should go away. > 2.. Check for the error string "Cannot insert a duplicate key into unique index pg_class_oid_index" in the batch shellsfor each function call and re-run if required. > Can anyone comment if they know this is a fundamental limitation of > PostgreSQL and if other databases have addressed this problem? Or > alternatively, is it a problem with our application that must be > corrected? Postgresql uses OIDs for tables and triggers internally. If you don't need OIDs within your data tables you should be able to reduce your OID usage considerably. -- Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/ > IT is not something like pizza that you order in at one o'clock in > the morning. - John Loebenstein, St George CIO
Attachment
On Sun, 2003-03-23 at 21:35, Andrew Bartley wrote: > 1. Change all creation of temp tables with "without oids", hoping > to reduce the consumption of OIDS That should be the first thing you should do (in 7.4, there will be an ALTER TABLE command to disable OIDs). If you're not making use of OIDs in your application, you can specify WITHOUT OIDS for all user tables. IMHO this should be the default for CREATE TABLE some time in the near future. > 1. Check for the error string "Cannot insert a duplicate key into > unique index pg_class_oid_index" in the batch shells for each > function call and re-run if required. If you define all your tables (especially temp tables) using WITHOUT OIDS, you shouldn't need to bother with this. > Can anyone comment if they know this is a fundamental limitation of > PostgreSQL It's not a "fundamental limitation" at all, merely a problem of remaining backward compatible with the behavior of previous PostgreSQL releases. > We are running PostgreSQL 7.2.1 on Pentium 4 x86 type systems. Upgrading to the latest stable 7.2 release (7.2.4), or if possible 7.3.2, is always encouraged. Cheers, Neil
Thanks for your comments. I don't think we use OIDS in the application. Is the cluster command dependent on OIDS at the row/tupple level? Now that the application is live at our customers' site, have you any suggestions on how we go about converting all of our permanent and dynamically created app tables? I am thinking of doing a schema dump and editing the file globally. Then a schema load then a restore. Is there a PG table I can update to achieve the same end? Thanks again Andrew Bartley ----- Original Message ----- From: "Neil Conway" <neilc@samurai.com> To: "Andrew Bartley" <abartley@evolvosystems.com> Cc: "PostgreSQL General" <pgsql-general@postgresql.org> Sent: Monday, March 24, 2003 2:14 PM Subject: Re: [GENERAL] 4 billion + oids > On Sun, 2003-03-23 at 21:35, Andrew Bartley wrote: > > 1. Change all creation of temp tables with "without oids", hoping > > to reduce the consumption of OIDS > > That should be the first thing you should do (in 7.4, there will be an > ALTER TABLE command to disable OIDs). If you're not making use of OIDs > in your application, you can specify WITHOUT OIDS for all user tables. > IMHO this should be the default for CREATE TABLE some time in the near > future. > > > 1. Check for the error string "Cannot insert a duplicate key into > > unique index pg_class_oid_index" in the batch shells for each > > function call and re-run if required. > > If you define all your tables (especially temp tables) using WITHOUT > OIDS, you shouldn't need to bother with this. > > > Can anyone comment if they know this is a fundamental limitation of > > PostgreSQL > > It's not a "fundamental limitation" at all, merely a problem of > remaining backward compatible with the behavior of previous PostgreSQL > releases. > > > We are running PostgreSQL 7.2.1 on Pentium 4 x86 type systems. > > Upgrading to the latest stable 7.2 release (7.2.4), or if possible > 7.3.2, is always encouraged. > > Cheers, > > Neil > > >
"Andrew Bartley" <abartley@evolvosystems.com> writes: > We have found this to be a major problem. It seems once the OIDs wrap; we = > constantly get errors due to "Cannot insert a duplicate key into unique ind= > ex pg_class_oid_index". There are about 3,000 entries in pg_class at this = > stage. Once the OID counter wraps, there's certainly some risk of OID collisions. However, if you have only 3000 entries in pg_class it's hard to see why the odds would be worse than 3000/4billion or less than one failure in 1 million tries. I think there is something you have not told us. The nearby suggestions to minimize the rate of OID consumption seem rather beside the point from here ... what I'd wonder about is why you need as many as three thousand tables. Reducing that footprint should reduce the odds of OID collision. regards, tom lane
My guess is most of the applications on postgress wouldn't totally rely on oids, though they may implicitly use them if they use standard sql create table statements. My concern is that during the wrap arounds this could create unintended problems in table, index creations or elsewhere. Probably its worthwhile to enumerate the list of potential problems (eg what we now know table creation, index creation), their error messages (so applicatons can handle them cleanly) and possibly their workarounds. This list could then be very useful as part of the discussion on oids in the documentation. Thanks... Amin On Mon, 24 Mar 2003 01:46:36 -0500 Tom Lane <tgl@sss.pgh.pa.us> wrote: >"Andrew Bartley" <abartley@evolvosystems.com> writes: >> We have found this to be a major problem. It seems once >>the OIDs wrap; we = >> constantly get errors due to "Cannot insert a duplicate >>key into unique ind= >> ex pg_class_oid_index". There are about 3,000 entries >>in pg_class at this = >> stage. > >Once the OID counter wraps, there's certainly some risk >of OID >collisions. However, if you have only 3000 entries in >pg_class it's >hard to see why the odds would be worse than >3000/4billion or less than >one failure in 1 million tries. I think there is >something you have not >told us. > >The nearby suggestions to minimize the rate of OID >consumption seem >rather beside the point from here ... what I'd wonder >about is why you >need as many as three thousand tables. Reducing that >footprint should >reduce the odds of OID collision. > > regards, tom lane > > >---------------------------(end of >broadcast)--------------------------- >TIP 6: Have you searched our list archives? > >http://archives.postgresql.org
> -----Original Message----- > From: Amin Abdulghani [mailto:amin@quantiva.com] > Sent: Monday, March 24, 2003 8:43 AM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] 4 billion + oids > > My guess is most of the applications on postgress wouldn't > totally rely on oids, though they may implicitly use them > if they use standard sql create table statements. My > concern is that during the wrap arounds this could create > unintended problems in table, index creations or > elsewhere. Probably its worthwhile to enumerate the list > of potential problems (eg what we now know table creation, > index creation), their error messages (so applicatons can > handle them cleanly) and possibly their workarounds. This > list could then be very useful as part of the discussion > on oids in the documentation. I have definitely seen serious problems relating to OID wrap. We have an application that uses the OID's to create joins. We scan tables from a list of source tables from some external (usually non-postgresql) database. For each of these tables we create a 64 bit checksum for each record and store it in a table, along with an OID. Then, after the passage of time, we rename the current checksum table and repeat the process. Then, we can do joins between the two checksum tables and see which records are updated, added or deleted. Often, we deal with systems that have tens or even hundreds of millions of records, and the tasks run on a nightly basis. Hence, we eventually run out of new OIDs. Our current solution is to drop the database and recreate it. It's an awful solution to the problem but currently there is nothing better. It requires us to run a full synchronization instead of an incremental, which is orders of magnitude slower. It sure would be nice to have 64 bit OID values.
"Dann Corbit" <DCorbit@connx.com> writes: > I have definitely seen serious problems relating to OID wrap. > We have an application that uses the OID's to create joins. > We scan tables from a list of source tables from some external (usually > non-postgresql) database. > For each of these tables we create a 64 bit checksum for each record and > store it in a table, along with an OID. Why in the world aren't you using a serial int8 column, instead? regards, tom lane
> -----Original Message----- > From: Tom Lane [mailto:tgl@sss.pgh.pa.us] > Sent: Monday, March 24, 2003 11:17 AM > To: Dann Corbit > Cc: pgsql-general@postgresql.org > Subject: Re: [GENERAL] 4 billion + oids > > > "Dann Corbit" <DCorbit@connx.com> writes: > > I have definitely seen serious problems relating to OID > wrap. We have > > an application that uses the OID's to create joins. We scan tables > > from a list of source tables from some external (usually > > non-postgresql) database. > > For each of these tables we create a 64 bit checksum for > each record > > and store it in a table, along with an OID. > > Why in the world aren't you using a serial int8 column, instead? I am using Postgresql 7.1.3 (our own native port) and I was not aware that feature was available. If I can create tables without OID's and with 8 byte serial columns I will change to that. The documentation I read said that serial values were 4 bytes long. How do I declare and 8 byte serial column?
"Dann Corbit" <DCorbit@connx.com> writes: >> Why in the world aren't you using a serial int8 column, instead? > I am using Postgresql 7.1.3 (our own native port) and I was not aware > that feature was available. It may not be, in 7.1 --- I forget when we widened sequences to int8. However, 64-bit OIDs are certainly never going to appear in 7.1 either. regards, tom lane
On Mon, Mar 24, 2003 at 02:29:34PM -0500, Tom Lane wrote: > It may not be, in 7.1 --- I forget when we widened sequences to int8. 7.2. The SERIAL fields are int4 still, though. A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
--On Monday, March 24, 2003 16:28:01 -0500 Andrew Sullivan <andrew@libertyrms.info> wrote: > On Mon, Mar 24, 2003 at 02:29:34PM -0500, Tom Lane wrote: >> It may not be, in 7.1 --- I forget when we widened sequences to int8. > > 7.2. The SERIAL fields are int4 still, though. but there is a BIGSERIAL or SERIAL8 IIRC. > > A > -- > ---- > Andrew Sullivan 204-4141 Yonge Street > Liberty RMS Toronto, Ontario Canada > <andrew@libertyrms.info> M2P 2A8 > +1 416 646 3304 x110 > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote: > but there is a BIGSERIAL or SERIAL8 IIRC. Yes. I just was trying to note that it's not like you automatically get the advantage of the larger sequences if you dump a 7.1.x database and load it into 7.2. (Please redirect discussions about how hard/easy Postgres is to upgrade to some other list ;-) A -- ---- Andrew Sullivan 204-4141 Yonge Street Liberty RMS Toronto, Ontario Canada <andrew@libertyrms.info> M2P 2A8 +1 416 646 3304 x110
I hope this helps. PostgreSQL OID problem summary The default object creation properties set OID's to "with" at tuple level. OIDs start at 0 and reach a limit at 4,294,967,295. New OID's are assigned for each object created. OIDs will reach the maximum limit, this will cause an "OID wrap". An OID wrap will cause a problem eventually. Its a statistical probability that data loss will occur. Suggested solution(s) 1.reduce oid consumption pros - less chance of oid wrap cons - wrap will happen (but may take many years) 2.drop and reload database pros - starts oid count from 0 cons - downtime is significant (maybe many hours) 3.reduce number of tables,indexes etc. pros - less chance of oid clash cons - application may require high number of tables,indexes etc. Re: our application problem, the cons for solutions 2/3 are significant so as to prevent them being used in our application. Solution 1 is practical. The database will be dropped and reloaded with a modified schema that does not use OID's at the tuple level. No user tables at the tuple level will use OID's in the new schema. We believe that our OID consumption will decrease from 4,294,967,295 every two weeks (approx.), to 4,294,967,295 every 190 years (approx.) Thanks Andrew Bartley ----- Original Message ----- From: "Andrew Sullivan" <andrew@libertyrms.info> To: <pgsql-general@postgresql.org> Sent: Tuesday, March 25, 2003 8:39 AM Subject: Re: [GENERAL] 4 billion + oids > On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote: > > but there is a BIGSERIAL or SERIAL8 IIRC. > > Yes. I just was trying to note that it's not like you automatically > get the advantage of the larger sequences if you dump a 7.1.x > database and load it into 7.2. (Please redirect discussions about > how hard/easy Postgres is to upgrade to some other list ;-) > > A > > -- > ---- > Andrew Sullivan 204-4141 Yonge Street > Liberty RMS Toronto, Ontario Canada > <andrew@libertyrms.info> M2P 2A8 > +1 416 646 3304 x110 > > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org > >
On Tue, Mar 25, 2003 at 11:43:57AM +1100, Andrew Bartley wrote: > 3.reduce number of tables,indexes etc. > pros - less chance of oid clash > cons - application may require high number of tables,indexes etc. > > Re: our application problem, the cons for solutions 2/3 are > significant so as to prevent them being used in our application. Note that you probably need to REINDEX the table pg_class if you use too much temp tables. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "La gente vulgar solo piensa en pasar el tiempo; el que tiene talento, en aprovecharlo"
Hi, group,
How do I know whether my postgresql database is using Turple level or object level OIDs?
postgresql 7.2.3 on Redhat 8.0
Thank you
Andrew Bartley <abartley@evolvosystems.com> wrote:
I hope this helps.
PostgreSQL OID problem summary
The default object creation properties set OID's to "with" at tuple level.
OIDs start at 0 and reach a limit at 4,294,967,295. New OID's are assigned
for each object created.
OIDs will reach the maximum limit, this will cause an "OID wrap". An OID
wrap will cause a problem eventually.
Its a statistical probability that data loss will occur.
Suggested solution(s)
1.reduce oid consumption
pros - less chance of oid wrap
cons - wrap will happen (but may take many years)
2.drop and reload database
pros - starts oid count from 0
cons - downtime is significant (maybe many hours)
3.reduce number of tables,indexes etc.
pros - less chance of oid clash
cons - application may require high number of tables,indexes etc.
Re: our application problem, the cons for solu tions 2/3 are significant so
as to prevent
them being used in our application.
Solution 1 is practical. The database will be dropped and reloaded with a
modified schema that does
not use OID's at the tuple level. No user tables at the tuple level will
use OID's in the new schema.
We believe that our OID consumption will decrease from 4,294,967,295 every
two weeks (approx.), to 4,294,967,295 every 190 years (approx.)
Thanks
Andrew Bartley
----- Original Message -----
From: "Andrew Sullivan"
To:
Sent: Tuesday, March 25, 2003 8:39 AM
Subject: Re: [GENERAL] 4 billion + oids
> On Mon, Mar 24, 2003 at 03:35:11PM -0600, Larry Rosenman wrote:
> > but there is a BIGSERIAL or SERIAL8 IIRC.
>
> Yes. I just was trying to note that it's not like you automatically
> get the advantage of the larger sequences if you dump a 7.1.x
> database and load it into 7.2. (Please redirect discussions about
> how hard/easy Postgres is to upgrade to some other list ;-)
>
> A
>
> --
> ----
> Andrew Sullivan 204-4141 Yonge Street
> Liberty RMS Toronto, Ontario Canada
> M2P 2A8
> +1 416 646 3304 x110
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>
>
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
Do you Yahoo!?
Yahoo! Platinum - Watch CBS' NCAA March Madness, live on your desktop!