Thread: Solving the OID-collision problem
I was reminded again today of the problem that once a database has been in existence long enough for the OID counter to wrap around, people will get occasional errors due to OID collisions, eg

http://archives.postgresql.org/pgsql-general/2005-08/msg00172.php

Getting rid of OID usage in user tables doesn't really do a darn thing to fix this. It may delay wrap of the OID counter, but it doesn't stop it; and what's more, when the problem does happen it will be more serious (because the OIDs assigned to persistent objects will form a more densely packed set, so that you have a greater chance of collisions over a shorter time period).

We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry. It basically just keeps generating and discarding OIDs until it finds one not in the table. (This will of course not work for user-table OIDs, since we don't necessarily have an OID index on them, but it will work for all the system catalogs that have OIDs.)

I seem to recall having thought of this idea before, and having rejected it as being too much overhead to solve a problem that occurs only rarely --- but in a quick test involving many repetitions of

	create temp table t1(f1 int, f2 int);
	drop table t1;

the net penalty was only about a 2% slowdown on one machine, and no measurable difference at all on another. So it seems like it might be worth doing.

Comments?
regards, tom lane

*** src/backend/catalog/heap.c.orig	Thu Jul 28 16:56:40 2005
--- src/backend/catalog/heap.c	Wed Aug  3 19:20:22 2005
***************
*** 187,192 ****
--- 187,229 ----
   * ----------------------------------------------------------------
   */
  
+ /*
+  * Quick hack to generate an OID not present in the specified catalog
+  */
+ static Oid
+ safe_newoid(Oid catalogId, Oid oidIndexId)
+ {
+ 	Oid			newOid;
+ 	Relation	catalogRelation;
+ 	SysScanDesc	scan;
+ 	ScanKeyData	key;
+ 	bool		collides;
+ 
+ 	catalogRelation = heap_open(catalogId, AccessShareLock);
+ 
+ 	do
+ 	{
+ 		newOid = newoid();
+ 
+ 		ScanKeyInit(&key,
+ 					ObjectIdAttributeNumber,
+ 					BTEqualStrategyNumber, F_OIDEQ,
+ 					ObjectIdGetDatum(newOid));
+ 
+ 		scan = systable_beginscan(catalogRelation, oidIndexId, true,
+ 								  SnapshotNow, 1, &key);
+ 
+ 		collides = HeapTupleIsValid(systable_getnext(scan));
+ 
+ 		systable_endscan(scan);
+ 	} while (collides);
+ 
+ 	heap_close(catalogRelation, AccessShareLock);
+ 
+ 	return newOid;
+ }
+ 
  /* ----------------------------------------------------------------
   *	heap_create - Create an uncataloged heap relation
   *
***************
*** 227,233 ****
  	 * Allocate an OID for the relation, unless we were told what to use.
  	 */
  	if (!OidIsValid(relid))
! 		relid = newoid();
  
  	/*
  	 * Decide if we need storage or not, and handle a couple other
--- 264,270 ----
  	 * Allocate an OID for the relation, unless we were told what to use.
  	 */
  	if (!OidIsValid(relid))
! 		relid = safe_newoid(RelationRelationId, ClassOidIndexId);
  
  	/*
  	 * Decide if we need storage or not, and handle a couple other
***************
*** 714,720 ****
  	new_rel_oid = RelationGetRelid(new_rel_desc);
  
  	/* Assign an OID for the relation's tuple type */
! 	new_type_oid = newoid();
  
  	/*
  	 * now create an entry in pg_class for the relation.
--- 751,757 ----
  	new_rel_oid = RelationGetRelid(new_rel_desc);
  
  	/* Assign an OID for the relation's tuple type */
! 	new_type_oid = safe_newoid(TypeRelationId, TypeOidIndexId);
  
  	/*
  	 * now create an entry in pg_class for the relation.
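The mechanism in the patch is easy to demonstrate outside the backend. The standalone C sketch below simulates it under stated assumptions: a bitset stands in for the catalog's OID index, `newoid()` is a wrapping counter, and the retry loop keeps drawing until it finds an unused value. The tiny OID space and the bitset are illustrative only, not backend code.

```c
#include <stdint.h>

/* Simulated catalog: a bitset of OIDs already in use.  (In the real
 * patch the membership check is a systable_beginscan on the catalog's
 * OID index; here a bitset stands in for that index.) */
#define OID_SPACE 65536u        /* tiny OID space so wraparound is easy to hit */
static uint8_t used[OID_SPACE / 8];

static int oid_in_use(uint32_t oid)
{
    return (used[oid / 8] >> (oid % 8)) & 1;
}

static void mark_used(uint32_t oid)
{
    used[oid / 8] |= (uint8_t) (1 << (oid % 8));
}

/* Wrapping counter, playing the role of the backend's newoid(). */
static uint32_t counter;
static uint32_t newoid(void)
{
    return counter++ % OID_SPACE;
}

/* Analogue of the patch's safe_newoid(): generate and discard OIDs
 * until one is not already present, then claim it. */
static uint32_t safe_newoid(void)
{
    uint32_t oid;

    do
        oid = newoid();
    while (oid_in_use(oid));

    mark_used(oid);
    return oid;
}
```

With OIDs 0 and 1 pre-marked as used, successive calls skip them and return 2, then 3 — the generate-and-discard behavior the patch adds for pg_class and pg_type.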
On Wed, 3 Aug 2005, Tom Lane wrote:

> I seem to recall having thought of this idea before, and having rejected it as being too much overhead to solve a problem that occurs only rarely --- but in a quick test involving many repetitions of
>
> create temp table t1(f1 int, f2 int);
> drop table t1;
>
> the net penalty was only about a 2% slowdown on one machine, and no measurable difference at all on another. So it seems like it might be worth doing.
>
> Comments?

Looks good. Another approach would be to put the existing code in a PG_TRY() block and catch the duplicate key violation. This is more optimistic in that we only wear the cost of the systable scan(s) when we encounter a problem. The issue is distinguishing between a duplicate key violation caused by the OID and a violation due, say, to a duplicate table/namespace entry. We catch that further up the code path at the moment, but it isn't future proof. The fact that there is little measurable difference suggests that your solution is fine, though.

Thanks, Gavin
Gavin Sherry <swm@linuxworld.com.au> writes:
> Looks good. Another approach would be to put the existing code in a PG_TRY() block and catching the duplicate key violation.

Not really feasible from a code-structure point of view, I'm afraid. Also there is the issue of cleaning up leaked resources (buffer pins etc), which would probably require a subtransaction to be safe, at which point it's not looking so fast anymore ...

regards, tom lane
> I was reminded again today of the problem that once a database has been in existence long enough for the OID counter to wrap around, people will get occasional errors due to OID collisions, eg
>
> http://archives.postgresql.org/pgsql-general/2005-08/msg00172.php
>
> Getting rid of OID usage in user tables doesn't really do a darn thing to fix this. It may delay wrap of the OID counter, but it doesn't stop it; and what's more, when the problem does happen it will be more serious (because the OIDs assigned to persistent objects will form a more densely packed set, so that you have a greater chance of collisions over a shorter time period).
>
> We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry. It basically just keeps generating and discarding OIDs until it finds one not in the table. (This will of course not work for user-table OIDs, since we don't necessarily have an OID index on them, but it will work for all the system catalogs that have OIDs.)
>
> I seem to recall having thought of this idea before, and having rejected it as being too much overhead to solve a problem that occurs only rarely --- but in a quick test involving many repetitions of
>
> create temp table t1(f1 int, f2 int);
> drop table t1;
>
> the net penalty was only about a 2% slowdown on one machine, and no measurable difference at all on another. So it seems like it might be worth doing.
>
> Comments?
>
> regards, tom lane

I hope I can say this without offense, but the obvious problem is not "collision," but "uniqueness." The most efficient way of dealing with the issue is to remove it. Why is there collision? It is because the number range of an OID is currently smaller than the possible usage. Maybe it is time to rethink the OID altogether and create something like a GUID (yes, I hate them too). I know it is ugly, but I think coming up with strategies to work around a design limitation is a waste of time; correcting the design limitation is the best investment.
"Mark Woodward" <pgsql@mohawksoft.com> writes:
> Why is there collision? It is because the number range of an OID is currently smaller than the possible usage.

Expanding OIDs to 64 bits is not really an attractive answer, on several grounds:

1. Disk space.

2. Performance. Doing this would require widening Datum to 64 bits, which is a system-wide performance hit on 32-bit machines.

3. Portability. We still manage to run on machines that have no 64-bit int type at all, and I for one am not prepared to give that up until it's necessary.

Given that we've agreed to deprecate use of OIDs in user tables, I don't see any particular upside to making them 64 bits anyway. None of the system catalogs seem likely to ever contain enough entries that a 32-bit limit is a problem.

These are all more or less the same arguments as concern 64-bit transaction IDs. The hacks we've indulged in to avoid that are far nastier and more invasive than what I'm suggesting for OIDs (vacuuming to forestall XID wraparound is certainly pretty ugly, and it's even user-visible).

Perhaps at some point there will be a "64-bit build option" to make all these data types widen to 64 bits together. I don't really foresee it happening in the near future though (even on 64-bit hardware, I doubt the performance tradeoffs are very favorable). And abandoning support for the 32-bit universe altogether is surely a long way away.

regards, tom lane
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>> Why is there collision? It is because the number range of an OID is currently smaller than the possible usage.
>
> Expanding OIDs to 64 bits is not really an attractive answer, on several grounds:
>
> 1. Disk space.

I don't really see this as a problem except in REALLY small installations, and PostgreSQL doesn't have that reputation. Also, we already have WITHOUT OIDS.

> 2. Performance. Doing this would require widening Datum to 64 bits, which is a system-wide performance hit on 32-bit machines.

Do you really think it would make a measurable difference, more so than your proposed solution? (I'm skeptical it would be measurable at all.)

> 3. Portability. We still manage to run on machines that have no 64-bit int type at all, and I for one am not prepared to give that up until it's necessary.

Maybe OID is no longer a number, but is now a structure:

	typedef struct _pgOID
	{
		time_t	date;
		int	id;
	} OID;

(Not a serious proposal for the contents of the structure.)

> Given that we've agreed to deprecate use of OIDs in user tables, I don't see any particular upside to making them 64 bits anyway. None of the system catalogs seem likely to ever contain enough entries that a 32-bit limit is a problem.
>
> These are all more or less the same arguments as concern 64-bit transaction IDs. The hacks we've indulged in to avoid that are far nastier and more invasive than what I'm suggesting for OIDs (vacuuming to forestall XID wraparound is certainly pretty ugly, and it's even user-visible).
>
> Perhaps at some point there will be a "64-bit build option" to make all these data types widen to 64 bits together. I don't really foresee it happening in the near future though (even on 64-bit hardware, I doubt the performance tradeoffs are very favorable). And abandoning support for the 32-bit universe altogether is surely a long way away.
"Mark Woodward" <pgsql@mohawksoft.com> writes:
>> 2. Performance. Doing this would require widening Datum to 64 bits, which is a system-wide performance hit on 32-bit machines.

> Do you really think it would make a measurable difference, more so than your proposed solution? (I'm skeptical it would be measurable at all)

I'm too lazy to run an experiment, but I believe it would. Datum is involved in almost every function-call API in the backend. In particular this means that it would affect performance-critical code paths.

Creation of tables and such isn't performance-critical in most applications, so a few percent overhead there doesn't bother me. A few percent across the board is another story.

regards, tom lane
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>>> 2. Performance. Doing this would require widening Datum to 64 bits, which is a system-wide performance hit on 32-bit machines.
>
>> Do you really think it would make a measurable difference, more so than your proposed solution? (I'm skeptical it would be measurable at all)
>
> I'm too lazy to run an experiment, but I believe it would. Datum is involved in almost every function-call API in the backend. In particular this means that it would affect performance-critical code paths.

I hear you on the "lazy" part, but if OID becomes a structure, then you are still comparing a native type until you get a match; then you make one more comparison to confirm it is the right one, or move on. I think it is a small hit that wouldn't even be noticed. In fact, thinking about it....

	typedef struct _pgOID
	{
		OLDOID_TYPE	oldOID;
		time_t		unique;
	} OID;

Everything works as it did before, except that there is a 32-bit date identifier to prevent wrap-around. Just one additional check is needed, and only if there is a wrap.

> Creation of tables and such isn't performance-critical in most applications, so a few percent overhead there doesn't bother me. A few percent across the board is another story.

Compared to all the other things going on, I would bet it isn't even measurable.
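For what it's worth, the comparison order Mark describes — native 32-bit field first, the extra field only on a match — could be sketched like this. The struct and its field names follow his hypothetical sketch, not any PostgreSQL source:

```c
#include <stdint.h>

/* Hypothetical widened OID, per Mark's sketch: the existing 32-bit
 * value plus an epoch that bumps on each wraparound.  Illustrative
 * only; not a real PostgreSQL type. */
typedef struct
{
    uint32_t oldOID;            /* the existing 32-bit OID */
    uint32_t epoch;             /* wraparound generation */
} WideOid;

/* Compare the native 32-bit part first; only when it matches do we
 * pay for the second comparison on the epoch. */
static int wideoid_eq(WideOid a, WideOid b)
{
    if (a.oldOID != b.oldOID)
        return 0;               /* fast path: most lookups end here */
    return a.epoch == b.epoch;
}
```

Tom's objection in the next message is not about this comparison cost, but about passing and returning a 64-bit struct through every function-call API that currently moves a 32-bit Datum.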
"Mark Woodward" <pgsql@mohawksoft.com> writes:
>> I'm too lazy to run an experiment, but I believe it would. Datum is involved in almost every function-call API in the backend. In particular this means that it would affect performance-critical code paths.

> I hear you on the "lazy" part, but if OID becomes a structure, then you are still comparing a native type until you get a match, then you make one more comparison to confirm it is the right one, or move on.

No, you're missing the point entirely: on 32-bit architectures, passing a 32-bit integral type to a function is an extremely well optimized operation, as is returning a 32-bit integral type. Passing or returning a 64-bit struct is, um, not so well optimized.

There's also the small problem that it really has to fit into Datum.

regards, tom lane
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
>>> I'm too lazy to run an experiment, but I believe it would. Datum is involved in almost every function-call API in the backend. In particular this means that it would affect performance-critical code paths.
>
>> I hear you on the "lazy" part, but if OID becomes a structure, then you are still comparing a native type until you get a match, then you make one more comparison to confirm it is the right one, or move on.
>
> No, you're missing the point entirely: on 32-bit architectures, passing a 32-bit integral type to a function is an extremely well optimized operation, as is returning a 32-bit integral type. Passing or returning a 64-bit struct is, um, not so well optimized.

That's only if you call by value; call by reference is just as optimized.

> There's also the small problem that it really has to fit into Datum.

Would it break a lot if you added more to a Datum?
On Thu, Aug 04, 2005 at 12:20:24PM -0400, Tom Lane wrote:
> "Mark Woodward" <pgsql@mohawksoft.com> writes:
> >> I'm too lazy to run an experiment, but I believe it would. Datum is involved in almost every function-call API in the backend. In particular this means that it would affect performance-critical code paths.
> > I hear you on the "lazy" part, but if OID becomes a structure, then you are still comparing a native type until you get a match, then you make one more comparison to confirm it is the right one, or move on.
> No, you're missing the point entirely: on 32-bit architectures, passing a 32-bit integral type to a function is an extremely well optimized operation, as is returning a 32-bit integral type. Passing or returning a 64-bit struct is, um, not so well optimized.

I don't think this is necessarily true. For example, instead of passing the 32-bit integer around, you would instead be passing a 32-bit pointer to a data structure. This doesn't have to be expensive - although, depending on the state of the API, it may require extensive changes to make it inexpensive (or not - I don't know).

From my perspective (new to this list - could be good, or could be bad), the concept of the OID was too generalized. As a generalization, it appears to have originally been intended to uniquely identify every row in the database (system tables and user tables). As a generalization, 32 bits was not enough to represent every row in the database. It was a mistake.

The work-around for this mistake was to allow user tables to be specially defined to not unnecessarily steal range from the OID space. This work-around proved to be desirable enough that, as of PostgreSQL 8, tables are no longer created with OIDs by default. It's still a work-around. What has been purchased with this work-around is time to properly address this problem. The problem has not been solved.

I see a few ways to solve this:

1) Create OID domains.
The system tables could have their own OID counter separate from the user table OID counters. Tables that have no relationship to each other would be put in their own OID domain. It isn't as if you can map from row OID to table anyway, so any use of OID assumes knowledge of the table relationships. I see this as being relatively cheap to implement, with no impact on backwards compatibility, except in unusual cases where people have seriously abused the concept of an OID. This is another delay tactic, in that a sufficient number of changes to the system tables would still cause a wrap-around; however, it is equivalent or better to the suggestion that all user tables be created without OIDs, as this at least allows user tables to use OIDs again.

2) Enlarge the OID to be 64-bit or 128-bit.

I don't see this as necessarily being a performance problem; however, it might require significant changes to the API, which would be expensive. It might be argued that enlarging the OID merely delays the problem, and doesn't actually address it. Perhaps delaying it by 2^32 is effectively indefinitely delaying it, or perhaps not. Those who thought 32 bits would be enough, or those who thought 2-digit years would be enough, under-estimated the problem. Compatibility can be mostly maintained, although the databases would probably need to be upgraded, and applications that assumed that the OID could fit into a 32-bit integer would break.

3) Leave OIDs as the general database-wide row identifier, and don't use OIDs to identify system metadata. Instead, use a UUID (128-bit) or similar.

System tables are special. Why shouldn't they have a non-general means of identifying stored metadata? This has some of the benefits of 1), all of the costs of 2), and it additionally breaks compatibility for everything.

Based on my suggestions above, I see 1) as the best short- and medium-term route. How hard would it be?
Instead of a database-wide OID counter, we have several OID counters, with each table having an OID counter association. Assuming the OID domain is properly defined, all existing code continues to function properly, and wrap-around of the OID in one domain doesn't break the other domains, such as the system tables.

Cheers,
mark

--
mark@mielke.cc / markm@ncf.ca / markm@nortel.com
One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them...
http://mark.mielke.cc/
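The per-domain counter idea above reduces to a very small mechanism. A minimal sketch, in which the domain names and the two-domain split are purely illustrative:

```c
#include <stdint.h>

/* Sketch of "OID domains": separate wrapping counters per domain, so
 * that heavy user-table churn cannot wrap the counter used by the
 * system catalogs.  The enum below is illustrative only; a real
 * design would pick domains per the table relationships. */
enum
{
    OID_DOMAIN_CATALOG,         /* system catalogs */
    OID_DOMAIN_USER,            /* user tables that still want OIDs */
    OID_NUM_DOMAINS
};

static uint32_t domain_counter[OID_NUM_DOMAINS];

/* Allocate the next OID within the given domain; each counter wraps
 * independently of the others. */
static uint32_t domain_newoid(int domain)
{
    return domain_counter[domain]++;
}
```

Burning through the user domain leaves the catalog domain's counter untouched, which is the whole point of the proposal.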
Tom Lane <tgl <at> sss.pgh.pa.us> writes:
> I was reminded again today of the problem that once a database has been in existence long enough for the OID counter to wrap around, people will get occasional errors due to OID collisions, eg
>
> http://archives.postgresql.org/pgsql-general/2005-08/msg00172.php

I'm a coworker of the original reporter. I tracked down the cause of the toast-table unique-constraint violation to the use of OIDs as chunk_id. After the OID wraparound, it tried to use an already-used chunk_id. That table has lots of toast records, which greatly increases the probability of a collision for the current section of the OID counter.

> Getting rid of OID usage in user tables doesn't really do a darn thing to fix this. It may delay wrap of the OID counter, but it doesn't stop it; and what's more, when the problem does happen it will be more serious (because the OIDs assigned to persistent objects will form a more densely packed set, so that you have a greater chance of collisions over a shorter time period).
>
> We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry. It basically just keeps generating and discarding OIDs until it finds one not in the table. (This will of course not work for user-table OIDs, since we don't necessarily have an OID index on them, but it will work for all the system catalogs that have OIDs.)

This will also be needed for toast tables. They have the necessary index.

- Ian
On Wed, 2005-08-03 at 19:43 -0400, Tom Lane wrote:
> I was reminded again today of the problem that once a database has been in existence long enough for the OID counter to wrap around, people will get occasional errors due to OID collisions, eg
>
> http://archives.postgresql.org/pgsql-general/2005-08/msg00172.php
>
> Getting rid of OID usage in user tables doesn't really do a darn thing to fix this. It may delay wrap of the OID counter, but it doesn't stop it; and what's more, when the problem does happen it will be more serious (because the OIDs assigned to persistent objects will form a more densely packed set, so that you have a greater chance of collisions over a shorter time period).
>
> We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry. It basically just keeps generating and discarding OIDs until it finds one not in the table. (This will of course not work for user-table OIDs, since we don't necessarily have an OID index on them, but it will work for all the system catalogs that have OIDs.)
>
> Comments?

Seems like a practical step, but more helpful for users of earlier releases. Many of those users are now reaching OID wrap and will begin to get messages like this as time continues. Not everybody *can* upgrade. I'd like to suggest that we backpatch this to 7.3 at least.

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> On Wed, 2005-08-03 at 19:43 -0400, Tom Lane wrote:
>> We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry.

> I'd like to suggest that we backpatch this to 7.3 at least.

Considering we don't even have code to do this, much less have expended one day of beta testing on it, back-patching seems a bit premature.

regards, tom lane
On Mon, 2005-08-08 at 16:55 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Wed, 2005-08-03 at 19:43 -0400, Tom Lane wrote:
> >> We've sort of brushed this problem aside in the past by telling people they could just retry their transaction ... but why don't we make the database do the retrying? I'm envisioning something like the attached quick-hack, which arranges that the pg_class and pg_type rows for tables will never be given OIDs duplicating an existing entry.
>
> > I'd like to suggest that we backpatch this to 7.3 at least.
>
> Considering we don't even have code to do this, much less have expended one day of beta testing on it, back-patching seems a bit premature.

Tom,

You provided a patch and explained your testing of it. It seems to be a useful test to me, and, as I said, a practical solution to OID wrap. OID wrap is a long-term problem for PostgreSQL installations.

I'm not sure that lots of beta testing would do any good at all, since the case the proposed patch handles is very unlikely to occur in 8.1, and almost certainly not during a short period of beta testing. As I mentioned, as time goes on, this is very likely to occur with older installations before it occurs with newer ones. Older databases tend to have older releases, hence the comment to backpatch.

I regard this as a safety/robustness feature, just as I would other robustness fixes that have been backpatched without a beta test phase. If we have the code, it seems strange to wait for many people to start logging complaints before we backpatch. I can see that will only lead to PostgreSQL's robustness falling into disrepute, rather than encouraging anyone to upgrade.

In any case, if we choose not to backpatch now, we can at least discuss a plan for it in the future? Planning is never premature, IMHO.

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> On Mon, 2005-08-08 at 16:55 -0400, Tom Lane wrote:
>> Considering we don't even have code to do this, much less have expended one day of beta testing on it, back-patching seems a bit premature.

> You provided a patch and explained your testing of it. It seems to be a useful test to me, and as I said a practical solution to OID wrap.

I didn't provide a patch --- I provided a proof-of-concept hack that covered just two of the seventeen catalogs with OIDs (and not every case even for those two). A real patch would likely be much more invasive than this, anyway, because we'd want to fix things so that you couldn't accidentally forget to use the free-OID-finding code.

regards, tom lane
On Mon, 2005-08-08 at 19:50 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > On Mon, 2005-08-08 at 16:55 -0400, Tom Lane wrote:
> >> Considering we don't even have code to do this, much less have expended one day of beta testing on it, back-patching seems a bit premature.
>
> > You provided a patch and explained your testing of it. It seems to be a useful test to me, and as I said a practical solution to OID wrap.
>
> I didn't provide a patch --- I provided a proof-of-concept hack that covered just two of the seventeen catalogs with OIDs (and not every case even for those two). A real patch would likely be much more invasive than this, anyway, because we'd want to fix things so that you couldn't accidentally forget to use the free-OID-finding code.

OK, I see where you are coming from now. But I also see it is a much bigger problem than it was before.

We either need to have a special routine for each catalog table, or we scan all tables, all of the time. The latter is a disaster, so let's look at the former: spicing the code with appropriate catalog checks would be a lot of work and probably very error prone and hard to maintain. We would never be sure that any particular check had been done appropriately.

Different proposal:

1. When we wrap, we set up an OID Free Space Map. We do this once when we wrap, rather than every time we collide. We scan all catalog tables and set the bits in a single 8192-byte block and write it out to disk. We then allocate OIDs from completely untouched chunks, otherwise much as we do now, except for the occasional re-allocation of a chunk every 32768 OIDs. In theory, we will never collide on permanent catalog entries. (If the OID FSM is not there, we would assume we haven't wrapped yet.)

2. We segment the available OID space, to encourage temporary object types not to overlap.
The first feature is designed to simplify the OID checking, so that we don't need to add lots of additional code: we can isolate this code. It also performs much better. The segmentation of the OID space mitigates the possibility that we might use all the bits in the FSM, making it much more unlikely (and so I would propose not to plug that gap).

When creating temporary objects, they should start at FirstTemporaryObjectId (max/2) and continue up to the max. When they hit *their* max, they cycle back round to FirstTemporaryObjectId. If we collide on an OID, then we issue another one. The "main" space would then be available for use by all other objects.

The OID FSM is much the same as the CLOG, so perhaps we might even reuse that code. However, since the OID FSM is so rarely used, there seems less need to cache it, so that would probably be overkill. Since, as I think I've mentioned :-), we should backpatch this to 7.3, and we wouldn't be able to do that if we used the slru.c approach (nor would we be able to implement the second feature, temp OID zoning).

Since OIDs are already xlogged, we need not write anything differently there. We would need to update the recovery code to maintain the OID FSM, though it would probably be wise to rebuild it completely after a PITR.

Best Regards, Simon Riggs
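The OID FSM above is small enough to sketch directly. This takes the 8192-byte figure at face value: 8192 bytes is 65536 bits, so each bit here covers a 65536-OID chunk of the 32-bit space (the proposal mentions 32768-OID chunks, so treat the exact granularity as illustrative):

```c
#include <stdint.h>

/* Sketch of Simon's OID Free Space Map: a single bitmap built once at
 * wraparound, one bit per chunk of the 32-bit OID space.  With 8192
 * bytes (65536 bits), each bit covers 2^32 / 65536 = 65536 OIDs. */
#define FSM_BYTES 8192
#define FSM_BITS  (FSM_BYTES * 8)   /* number of chunks */

static uint8_t oid_fsm[FSM_BYTES];

/* Called while scanning the catalogs at wrap time: mark the chunk
 * containing this OID as touched. */
static void fsm_mark(uint32_t oid)
{
    uint32_t chunk = oid >> 16;     /* oid / 65536 */

    oid_fsm[chunk / 8] |= (uint8_t) (1 << (chunk % 8));
}

/* Allocate from the first completely untouched chunk, returning its
 * first OID.  Returns 0 when no chunk is free -- a real version would
 * need a distinct "none free" signal, and this exhaustion case is
 * exactly Tom's objection downthread. */
static uint32_t fsm_pick_chunk(void)
{
    for (uint32_t chunk = 0; chunk < FSM_BITS; chunk++)
        if (!((oid_fsm[chunk / 8] >> (chunk % 8)) & 1))
            return chunk << 16;
    return 0;
}
```

Marking any OID in chunks 0 and 1 as touched makes allocation skip straight to chunk 2, i.e. OID 131072.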
Simon Riggs <simon@2ndquadrant.com> writes:
> We either need to have a special routine for each catalog table, or we scan all tables, all of the time. The latter is a disaster, so lets look at the former: spicing the code with appropriate catalog checks would be a lot of work and probably very error prone and hard to maintain. We would never be sure that any particular check had been done appropriately.

I don't think it's as bad as all that. As my prototype showed, we only need one base routine for this; the trick is to give it the pg_class OIDs of the target catalog and the catalog's index on OID.

[ ... click click, grep grep ... ]

There are only eight calls to newoid() in the backend, and in six of them we know exactly which catalog we are inserting into and which index could be used to check uniqueness. One of them is actually generating a relfilenode value, not a normal OID, so we would need a special routine that looks into the filesystem to check uniqueness, but that's no big deal.

The only call that's at all problematic is the one in heap_insert --- there, we know the target relation, but haven't got any direct access to knowledge about whether it has an OID index. There are several ways you could deal with that. The simplest is to just have a small constant table someplace, listing the pg_class OIDs of all the catalogs that have OIDs and the pg_class OIDs of their OID indexes. This would be a little bit of a maintenance gotcha (someone could add a new catalog having OIDs and forget to add it to that table), but certainly there are many worse gotchas than that in the system.

I was also toying with the idea of automating it: if the target table has OIDs, look to see if it has a unique index on OID, and if so use that.
(If we cache the result in Relation structures, this shouldn't be too terribly expensive timewise --- in fact, we could make RelationGetIndexList do and cache this check, which would make it virtually free, since if you're inserting you have certainly got to do RelationGetIndexList somewhere along the line.) The interesting thing about that is that the guaranteed-unique-OID behavior would then be available for user tables too, if we cared to document how to use it. This might be gilding the lily, though.

> 1. When we wrap we set up an OID Free Space Map. We do this once when we wrap, rather than every time we collide. We scan all catalog tables and set the bits in a single 8192 byte block and write it out to disk. We then allocate OIDs from completely untouched chunks,

What if there aren't any "untouched chunks"? With only 64K-chunk granularity, I think you'd hit that condition a lot more than you are hoping. Also, this seems to assume uniqueness across all tables in an entire cluster, which is much more than we want; it makes the 32-bit size of OIDs significantly more worrisome than when they only need to be unique within a table.

regards, tom lane
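The "small constant table" Tom describes is just a static catalog-to-index mapping. A sketch under stated assumptions — the two entries use what I believe are the pg_class/pg_type catalog and OID-index OIDs of the era, but treat the numeric values as illustrative:

```c
#include <stdint.h>
#include <stddef.h>

typedef uint32_t Oid;
#define InvalidOid ((Oid) 0)

/* Sketch of the constant table: for each catalog that has OIDs, the
 * pg_class OID of the catalog and of its index on OID.  Values below
 * are believed-correct examples, not an exhaustive or authoritative
 * list of all seventeen catalogs. */
typedef struct
{
    Oid catalogId;
    Oid oidIndexId;
} OidCatalogEntry;

static const OidCatalogEntry oid_catalogs[] = {
    {1259, 2662},   /* pg_class / pg_class_oid_index */
    {1247, 2703},   /* pg_type  / pg_type_oid_index  */
};

/* Return the OID index for a catalog, or InvalidOid if it isn't
 * listed -- the "maintenance gotcha": a newly added catalog with OIDs
 * must also be added here. */
static Oid lookup_oid_index(Oid catalogId)
{
    for (size_t i = 0; i < sizeof(oid_catalogs) / sizeof(oid_catalogs[0]); i++)
        if (oid_catalogs[i].catalogId == catalogId)
            return oid_catalogs[i].oidIndexId;
    return InvalidOid;
}
```

heap_insert could consult such a table to decide whether a unique-OID scan is possible for the target relation; the automated RelationGetIndexList variant would make the table unnecessary.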
Richard Huxton <dev@archonet.com> writes:
> Can I ask what happens if we end up re-using a recently de-allocated
> OID? Specifically, can a cached plan (e.g. plpgsql function) end up
> referring to an object created after it was planned:

Definitely a potential risk, but not one to be solved by the sorts of
mechanisms we are discussing here.  The answer to that one is
invalidation of cached plans using the SI message mechanism or some
extension thereof.  I think Neil Conway was looking into this fairly
recently, but it didn't get done for 8.1.

			regards, tom lane
Tom Lane wrote:
>
> What if there aren't any "untouched chunks"? With only 64K-chunk
> granularity, I think you'd hit that condition a lot more than you are
> hoping. Also, this seems to assume uniqueness across all tables in an
> entire cluster, which is much more than we want; it makes the 32-bit
> size of OIDs significantly more worrisome than when they only need to
> be unique within a table.

Can I ask what happens if we end up re-using a recently de-allocated
OID? Specifically, can a cached plan (e.g. plpgsql function) end up
referring to an object created after it was planned:

CREATE FUNCTION f1()... -- oid=1234
CREATE FUNCTION f2()... -- oid=1235, calls f1() of oid=1234
DROP FUNCTION f1()
CREATE FUNCTION f3()... -- re-uses oid=1234

-- 
  Richard Huxton
  Archonet Ltd
On Tue, 2005-08-09 at 16:01 +0100, Richard Huxton wrote:
> Tom Lane wrote:
> >
> > What if there aren't any "untouched chunks"? With only 64K-chunk
> > granularity, I think you'd hit that condition a lot more than you are
> > hoping. Also, this seems to assume uniqueness across all tables in an
> > entire cluster, which is much more than we want; it makes the 32-bit
> > size of OIDs significantly more worrisome than when they only need to
> > be unique within a table.
>
> Can I ask what happens if we end up re-using a recently de-allocated
> OID? Specifically, can a cached plan (e.g. plpgsql function) end up
> referring to an object created after it was planned:
>
> CREATE FUNCTION f1()... -- oid=1234
> CREATE FUNCTION f2()... -- oid=1235, calls f1() of oid=1234
> DROP FUNCTION f1()
> CREATE FUNCTION f3()... -- re-uses oid=1234

Possible, but extremely unlikely... you'd have to keep a session open
with a prepared query for as long as it takes to create 4 billion
tables... not a high priority case, eh?

Best Regards, Simon Riggs
Just to chime in --- I have been surprised how _few_ complaints we have
gotten about oid wraparound hitting system table oid conflicts.  I agree
that telling people to retry their CREATE statements isn't really an
ideal solution, and the idea of looping to find a free oid is a good one.

---------------------------------------------------------------------------

Tom Lane wrote:
> I was reminded again today of the problem that once a database has been
> in existence long enough for the OID counter to wrap around, people will
> get occasional errors due to OID collisions, eg
>
> http://archives.postgresql.org/pgsql-general/2005-08/msg00172.php
>
> Getting rid of OID usage in user tables doesn't really do a darn thing
> to fix this. It may delay wrap of the OID counter, but it doesn't stop
> it; and what's more, when the problem does happen it will be more
> serious (because the OIDs assigned to persistent objects will form a
> more densely packed set, so that you have a greater chance of collisions
> over a shorter time period).
>
> We've sort of brushed this problem aside in the past by telling people
> they could just retry their transaction ... but why don't we make the
> database do the retrying? I'm envisioning something like the attached
> quick-hack, which arranges that the pg_class and pg_type rows for tables
> will never be given OIDs duplicating an existing entry. It basically
> just keeps generating and discarding OIDs until it finds one not in the
> table. (This will of course not work for user-table OIDs, since we
> don't necessarily have an OID index on them, but it will work for all
> the system catalogs that have OIDs.)
>
> I seem to recall having thought of this idea before, and having rejected
> it as being too much overhead to solve a problem that occurs only rarely
> --- but in a quick test involving many repetitions of
>
> 	create temp table t1(f1 int, f2 int);
> 	drop table t1;
>
> the net penalty was only about a 2% slowdown on one machine, and no
> measurable difference at all on another. So it seems like it might
> be worth doing.
>
> Comments?
>
> 			regards, tom lane
>
> *** src/backend/catalog/heap.c.orig	Thu Jul 28 16:56:40 2005
> --- src/backend/catalog/heap.c	Wed Aug  3 19:20:22 2005
> ***************
> *** 187,192 ****
> --- 187,229 ----
>    * ----------------------------------------------------------------
>    */
>
> + /*
> +  * Quick hack to generate an OID not present in the specified catalog
> +  */
> + static Oid
> + safe_newoid(Oid catalogId, Oid oidIndexId)
> + {
> + 	Oid			newOid;
> + 	Relation	catalogRelation;
> + 	SysScanDesc scan;
> + 	ScanKeyData key;
> + 	bool		collides;
> +
> + 	catalogRelation = heap_open(catalogId, AccessShareLock);
> +
> + 	do
> + 	{
> + 		newOid = newoid();
> +
> + 		ScanKeyInit(&key,
> + 					ObjectIdAttributeNumber,
> + 					BTEqualStrategyNumber, F_OIDEQ,
> + 					ObjectIdGetDatum(newOid));
> +
> + 		scan = systable_beginscan(catalogRelation, oidIndexId, true,
> + 								  SnapshotNow, 1, &key);
> +
> + 		collides = HeapTupleIsValid(systable_getnext(scan));
> +
> + 		systable_endscan(scan);
> + 	} while (collides);
> +
> + 	heap_close(catalogRelation, AccessShareLock);
> +
> + 	return newOid;
> + }
> +
> +
>   /* ----------------------------------------------------------------
>    *		heap_create		- Create an uncataloged heap relation
>    *
> ***************
> *** 227,233 ****
>   	 * Allocate an OID for the relation, unless we were told what to use.
>   	 */
>   	if (!OidIsValid(relid))
> ! 		relid = newoid();
>
>   	/*
>   	 * Decide if we need storage or not, and handle a couple other
> --- 264,270 ----
>   	 * Allocate an OID for the relation, unless we were told what to use.
>   	 */
>   	if (!OidIsValid(relid))
> ! 		relid = safe_newoid(RelationRelationId, ClassOidIndexId);
>
>   	/*
>   	 * Decide if we need storage or not, and handle a couple other
> ***************
> *** 714,720 ****
>   	new_rel_oid = RelationGetRelid(new_rel_desc);
>
>   	/* Assign an OID for the relation's tuple type */
> ! 	new_type_oid = newoid();
>
>   	/*
>   	 * now create an entry in pg_class for the relation.
> --- 751,757 ----
>   	new_rel_oid = RelationGetRelid(new_rel_desc);
>
>   	/* Assign an OID for the relation's tuple type */
> ! 	new_type_oid = safe_newoid(TypeRelationId, TypeOidIndexId);
>
>   	/*
>   	 * now create an entry in pg_class for the relation.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: explain analyze is your friend

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
On Mon, Aug 08, 2005 at 11:28:54PM +0100, Simon Riggs wrote:
> As I mentioned, as time goes on, this is very likely to occur with older
> installations before it occurs with newer ones. Older databases tend to
> have older releases, hence the comment to backpatch. I regard this as a

Well, as an alternate, they could (as long as they're on 7.3.3 or later,
and they oughta be) replicate their database into a completely new
instance of the same database version.  They don't need to upgrade, and
they can avoid this by setting the OID counter back.

I hate to sound like a crank (though I confess to being one), but I
don't have a huge amount of sympathy for people who run into a known
limitation with no plans for what to do with it.  We were all over the
xid wraparound on one of our systems for exactly this reason -- it was a
bit of an anxious time, but we had a lot of notice, and we spent quite a
bit of time preparing for it.

A

-- 
Andrew Sullivan  | ajs@crankycanuck.ca
A certain description of men are for getting out of debt, yet are
against all taxes for raising money to pay it off.
		--Alexander Hamilton
Simon Riggs wrote:
> On Tue, 2005-08-09 at 16:01 +0100, Richard Huxton wrote:
>
>>Tom Lane wrote:
>>
>>>What if there aren't any "untouched chunks"? With only 64K-chunk
>>>granularity, I think you'd hit that condition a lot more than you are
>>>hoping. Also, this seems to assume uniqueness across all tables in an
>>>entire cluster, which is much more than we want; it makes the 32-bit
>>>size of OIDs significantly more worrisome than when they only need to
>>>be unique within a table.
>>
>>Can I ask what happens if we end up re-using a recently de-allocated
>>OID? Specifically, can a cached plan (e.g. plpgsql function) end up
>>referring to an object created after it was planned:
>>
>>CREATE FUNCTION f1()... -- oid=1234
>>CREATE FUNCTION f2()... -- oid=1235, calls f1() of oid=1234
>>DROP FUNCTION f1()
>>CREATE FUNCTION f3()... -- re-uses oid=1234
>
> Possible, but extremely unlikely... you'd have to keep a session open
> with a prepared query for as long as it takes to create 4 billion
> tables... not a high priority case, eh?

Ah, but it does rule out the possibility of keeping a cache of "recently
de-allocated" OIDs and re-using those.

-- 
  Richard Huxton
  Archonet Ltd
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Just to chime in --- I have been surprised how _few_ complaints we have
> gotten about oid wraparound hitting system table oid conflicts. I agree
> that telling people to retry their CREATE statements isn't really an
> ideal solution, and the idea of looping to find a free oid is a good one.

So in a world where all user tables have OIDs I can see this happening
quite easily.  A reasonably large database could easily have 4 billion
records inserted in a reasonable amount of time.

But with no OIDs on user tables it must take a really long time for this
to happen.  I mean, even if you have thousands of tables you would have
to go through thousands (many thousands even) of dump/reload cycles
before you push oid to 4 billion.

Perhaps just a periodic warning starting when the OID counter hits, say,
2 billion telling people to dump/reload their database before it gets to
4 billion would be enough?

All this stuff about retrying OIDs is cool and if someone wants to go
and do it I wouldn't say they shouldn't.  But it seems like a lot of
effort to avoid a situation that I'm unclear will ever arise.

A warning could more easily be backpatched to versions that defaulted to
OIDs on user tables too.

-- 
greg
On 2005-08-10, Greg Stark <gsstark@mit.edu> wrote:
> But with no OIDs on user tables it must take a really long time for this to
> happen. I mean, even if you have thousands of tables you would have to go
> through thousands (many thousands even) of dump/reload cycles before you push
> oid to 4 billion.

Don't forget TOAST tables, they use oids too.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
On Wed, 2005-08-10 at 14:42 -0400, Greg Stark wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>
> > Just to chime in --- I have been surprised how _few_ complaints we have
> > gotten about oid wraparound hitting system table oid conflicts. I agree
> > that telling people to retry their CREATE statements isn't really an
> > ideal solution, and the idea of looping to find a free oid is a good one.
>
> So in a world where all user tables have OIDs I can see this happening quite
> easily. A reasonably large database could easily have 4 billion records
> inserted in a reasonable amount of time.
>
> But with no OIDs on user tables it must take a really long time for this to
> happen. I mean, even if you have thousands of tables you would have to go
> through thousands (many thousands even) of dump/reload cycles before you push
> oid to 4 billion.
>
> Perhaps just a periodic warning starting when the OID counter hits, say, 2
> billion telling people to dump/reload their database before it gets to 4
> billion would be enough?
>
> All this stuff about retrying OIDs is cool and if someone wants to go and do
> it I wouldn't say they shouldn't. But it seems like a lot of effort to avoid a
> situation that I'm unclear will ever arise.
>
> A warning could more easily be backpatched to versions that defaulted to OIDs
> on user tables too.

I agree with everything you just said.  I think it's a non-issue for
8.1+, but an important one for many earlier users.  We *can* ask people
to upgrade, but if they have not, there is usually a good reason.  If we
force them, they may upgrade to another RDBMS...

Best Regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> I agree with everything you just said.

As AndrewSN already pointed out, the argument is all wet because it
ignores the use of OIDs for toasted values ... not to mention large
objects.  Yeah, it would take a while to wrap the counter, but it's
hardly out of the question for it to happen.

As it happens, I've spent today working on a patch, and I'm about to
post it for comment.

			regards, tom lane