Thread: OID wraparound: summary and proposal
Given Hiroshi's objections, and the likelihood of compatibility problems for existing applications, I am now thinking that it's not a good idea to turn off OID generation by default. (At least not for 7.2 --- maybe in some future release we could change the default.) Based on the discussion so far, here is an attempt to flesh out the details of what to do with OIDs for 7.2: 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. The default behavior will be WITH OIDS. Note: there was some discussion of a GUC variable to control the default. I'm leaning against this, mainly because having one would mean that pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE; else it couldn't be sure that the database schema would be correctly reconstructed. That would create dump-script portability problems and negate some of the point of having a GUC variable in the first place. So I'm thinking a fixed default is better. Note: an alternative syntax possibility is to make it look like the "with" option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". This is uglier today, but would start to look more attractive if we invent additional CREATE TABLE options in the future --- there'd be a place to put 'em. Comments? 2. A child table will be forced to have OIDs if any of its parents do, even if WITHOUT OIDS is specified in the child's CREATE command. This is on the theory that the OID ought to act like an inherited column. 3. For a table without OIDs, no entry will be made in pg_attribute for the OID column, so an attempt to reference the OID column will draw a "no such column" error. (An alternative is to allow OID to read as nulls, but it seemed that people preferred the error to be raised.) 4. When inserting into an OID-less table, the INSERT result string will always show 0 for the OID. 5. A "relhasoids" boolean column will be added to pg_class to signal whether a table has OIDs or not. 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the table has no OIDs. (Alternative possibility: raise an error --- is that better?) COPY in WITH OIDS will silently drop the incoming OID values. 7. Physical tuple headers won't change. If no OIDs are assigned for a particular table, the OID field in the header will be left zero. 8. OID generation will be disabled for those system tables that don't need it --- pg_listener, pg_largeobject, and pg_attribute being some major offenders that consume lots of OIDs. 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, pg_description will be modified so that its primary key is (object type, object OID, column number) --- this also solves the problem that comments break if there are duplicate OIDs in different system tables. The object type is the OID of the system catalog in which the object OID appears. The column number field will be zero for all object types except columns. For a column comment, the object type and OID fields will refer to the parent table, and column number will be nonzero. 10. pg_dump will be modified to do the appropriate things with OIDs. Are there any other application programs that need to change? We had also talked about adding an INSERT ... RETURNING feature to allow applications to eliminate their dependence on looking at the OID returned by an INSERT command. I think this is a good idea, but there are still a number of unsolved issues about how it should interact with rules. Accordingly, I'm not going to try to include it in this batch of work. Comments? regards, tom lane
On Wed, 1 Aug 2001, Tom Lane wrote: > Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. > > Note: there was some discussion of a GUC variable to control the default. > > Note: an alternative syntax possibility is to make it look like the "with" > option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". > This is uglier today, but would start to look more attractive if we invent > additional CREATE TABLE options in the future --- there'd be a place to > put 'em. Comments? I think a fixed default and placing it in parentheses are probably good ideas. > 3. For a table without OIDs, no entry will be made in pg_attribute for > the OID column, so an attempt to reference the OID column will draw a > "no such column" error. (An alternative is to allow OID to read as nulls, > but it seemed that people preferred the error to be raised.) Okay, at least the foreign key stuff will need to change (since it does a select oid), but I don't think it ever does anything with that except check for existance, so I could probably make it select 1 as reasonable replacement.
> Given Hiroshi's objections, and the likelihood of compatibility problems > for existing applications, I am now thinking that it's not a good idea to > turn off OID generation by default. (At least not for 7.2 --- maybe in > some future release we could change the default.) This seems good. People with oid concerns usually have 1-2 huge tables and the rest are small. > Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. Makes sense. > Note: there was some discussion of a GUC variable to control the default. > I'm leaning against this, mainly because having one would mean that > pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE; > else it couldn't be sure that the database schema would be correctly > reconstructed. That would create dump-script portability problems and > negate some of the point of having a GUC variable in the first place. > So I'm thinking a fixed default is better. Good point. > Note: an alternative syntax possibility is to make it look like the "with" > option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". > This is uglier today, but would start to look more attractive if we invent > additional CREATE TABLE options in the future --- there'd be a place to > put 'em. Comments? I don't like the parens. Looks ugly and I am not used to seeing them used that way. I can imagine later using WITH NOOIDS, NOBIBBLE, BABBLE. Maybe the syntax should be WITH OID, WITH NOOID? > 2. A child table will be forced to have OIDs if any of its parents do, > even if WITHOUT OIDS is specified in the child's CREATE command. This is > on the theory that the OID ought to act like an inherited column. Good point. > 3. For a table without OIDs, no entry will be made in pg_attribute for > the OID column, so an attempt to reference the OID column will draw a > "no such column" error. (An alternative is to allow OID to read as nulls, > but it seemed that people preferred the error to be raised.) Makes sense. > 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the > table has no OIDs. (Alternative possibility: raise an error --- is that > better?) COPY in WITH OIDS will silently drop the incoming OID values. Obviously, the case here is that COPY WITH OIDS alone on a non-oid table should throw an error, while pg_dump -o should work on a database with mixed oid/non-oid. I think the right thing would be to have pg_dump check pg_class.relhasoids and issue a proper COPY statement to match the existing table. > 7. Physical tuple headers won't change. If no OIDs are assigned for a > particular table, the OID field in the header will be left zero. > > 8. OID generation will be disabled for those system tables that don't need > it --- pg_listener, pg_largeobject, and pg_attribute being some major > offenders that consume lots of OIDs. > > 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > pg_description will be modified so that its primary key is (object type, > object OID, column number) --- this also solves the problem that comments > break if there are duplicate OIDs in different system tables. The object > type is the OID of the system catalog in which the object OID appears. > The column number field will be zero for all object types except columns. > For a column comment, the object type and OID fields will refer to the > parent table, and column number will be nonzero. Sounds like a hack. I still prefer pg_attribute to have oids. Can we have temp tables have no pg_attribute oids? A hack on a hack? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the >> table has no OIDs. (Alternative possibility: raise an error --- is that >> better?) COPY in WITH OIDS will silently drop the incoming OID values. > Obviously, the case here is that COPY WITH OIDS alone on a non-oid table > should throw an error, while pg_dump -o should work on a database with > mixed oid/non-oid. I think the right thing would be to have pg_dump > check pg_class.relhasoids and issue a proper COPY statement to match the > existing table. pg_dump clearly will need to do that, so it isn't really going to be the issue. The question is what to do when a less-clueful app issues a COPY WITH OIDS on an OID-less table. For input, I see no downside to just ignoring the incoming OIDs. For output, I can see three reasonable possibilities: A. Pretend WITH OIDS wasn't mentioned. This might seem to be"do the right thing", but a rather strong objection is thattheapp will not get back the data it was expecting. B. Return NULLs or 0s for the OIDs column. C. Raise an error and refuse to do the copy at all. C is probably the most conservative answer. >> 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, >> pg_description will be modified so that its primary key is (object type, >> object OID, column number) --- this also solves the problem that comments >> break if there are duplicate OIDs in different system tables. The object >> type is the OID of the system catalog in which the object OID appears. >> The column number field will be zero for all object types except columns. >> For a column comment, the object type and OID fields will refer to the >> parent table, and column number will be nonzero. > Sounds like a hack. How so? pg_description is broken anyway given that we don't enforce OID uniqueness across system catalogs. Also, in the future we could consider overloading the <column number> column to have meanings for other object types. I could imagine using it to attach documentation to each of the input arguments of a function, for example. regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the > >> table has no OIDs. (Alternative possibility: raise an error --- is that > >> better?) COPY in WITH OIDS will silently drop the incoming OID values. > > > Obviously, the case here is that COPY WITH OIDS alone on a non-oid table > > should throw an error, while pg_dump -o should work on a database with > > mixed oid/non-oid. I think the right thing would be to have pg_dump > > check pg_class.relhasoids and issue a proper COPY statement to match the > > existing table. > > pg_dump clearly will need to do that, so it isn't really going to be the > issue. The question is what to do when a less-clueful app issues a COPY > WITH OIDS on an OID-less table. For input, I see no downside to just > ignoring the incoming OIDs. For output, I can see three reasonable > possibilities: > > A. Pretend WITH OIDS wasn't mentioned. This might seem to be > "do the right thing", but a rather strong objection is that the > app will not get back the data it was expecting. > > B. Return NULLs or 0s for the OIDs column. > > C. Raise an error and refuse to do the copy at all. > > C is probably the most conservative answer. If we fail on load, we should fail on dump. Why not fail on COPY WITH OIDS on a non-oid table? > >> 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > >> pg_description will be modified so that its primary key is (object type, > >> object OID, column number) --- this also solves the problem that comments > >> break if there are duplicate OIDs in different system tables. The object > >> type is the OID of the system catalog in which the object OID appears. > >> The column number field will be zero for all object types except columns. > >> For a column comment, the object type and OID fields will refer to the > >> parent table, and column number will be nonzero. > > > Sounds like a hack. > > How so? pg_description is broken anyway given that we don't enforce OID > uniqueness across system catalogs. Also, in the future we could We have a script to detect them and the oid counter it unique. In what way do we not enforce it. > consider overloading the <column number> column to have meanings for > other object types. I could imagine using it to attach documentation to > each of the input arguments of a function, for example. Interesting idea. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: >> For input, I see no downside to just >> ignoring the incoming OIDs. For output, I can see three reasonable >> possibilities: >> >> A. Pretend WITH OIDS wasn't mentioned. This might seem to be >> "do the right thing", but a rather strong objection is that the >> app will not get back the data it was expecting. >> >> B. Return NULLs or 0s for the OIDs column. >> >> C. Raise an error and refuse to do the copy at all. >> >> C is probably the most conservative answer. > If we fail on load, we should fail on dump. Why not fail on COPY WITH > OIDS on a non-oid table? I'm confused --- I was proposing that we *not* fail on load. What's the point of failing on load? >> How so? pg_description is broken anyway given that we don't enforce OID >> uniqueness across system catalogs. Also, in the future we could > We have a script to detect them and the oid counter it unique. In what > way do we not enforce it. In a running system, once the OID counter wraps around there's no guarantee that you won't have duplicate OIDs in different system tables. The only enforcement mechanism we have is the unique indexes, and those will only check per-table. However, that's fine --- it's as much as we need. For everything except pg_description, that is. Since pg_description currently makes an unchecked and uncheckable assumption of global uniqueness of OIDs, it's broken. regards, tom lane
Tom Lane wrote: > > Given Hiroshi's objections, and the likelihood of compatibility problems > for existing applications, I am now thinking that it's not a good idea to > turn off OID generation by default. (At least not for 7.2 --- maybe in > some future release we could change the default.) > Would OIDs be globally unique or per table ? regards, Hiroshi Inoue
Hiroshi Inoue <Inoue@tpf.co.jp> writes: > Tom Lane wrote: >> >> Given Hiroshi's objections, and the likelihood of compatibility problems >> for existing applications, I am now thinking that it's not a good idea to >> turn off OID generation by default. > Would OIDs be globally unique or per table ? Same as now: if you have a unique index on 'em, they're unique within a table; otherwise, no guarantee at all (once the system wraps around). We should document this state of affairs better, of course, but I'm not proposing to change it. The point here is just to let people suppress OIDs for tables that don't need them, and thereby postpone OID wraparound. regards, tom lane
Tom Lane wrote: > > Hiroshi Inoue <Inoue@tpf.co.jp> writes: > > Tom Lane wrote: > >> > >> Given Hiroshi's objections, and the likelihood of compatibility problems > >> for existing applications, I am now thinking that it's not a good idea to > >> turn off OID generation by default. > > > Would OIDs be globally unique or per table ? > > Same as now: if you have a unique index on 'em, they're unique within a > table; otherwise, no guarantee at all (once the system wraps around). > OIDs per table seems more important than others. Strangely enough, I've seen no objection to optional OIDs other than mine. Probably it was my mistake to have formulated a plan on the flimsy assumption. regards, Hiroshi Inoue
mlw <markw@mohawksoft.com> writes: > how hard would it be to have an OID range on a per > table basis? The existing OID generator is a system-wide counter, and couldn't reasonably be expected to do something like that. There was some talk of (in essence) eliminating the present OID generator mechanism and giving each table its own sequence object for generating per-table OIDs. It's an interesting thought, but I'm concerned about the overhead involved. At the very least we'd need to reimplement sequence objects in a lower-overhead fashion (eg, make 'em rows in a pg_sequence table rather than free-standing almost-tables). Might be worth doing someday, but I think it's orthogonal to what I'm proposing at present. There'd still be a need to suppress OID generation on tables that don't need OIDs and might have more than 4 billion inserts during their lifetime. regards, tom lane
> Bruce Momjian <pgman@candle.pha.pa.us> writes: > >> For input, I see no downside to just > >> ignoring the incoming OIDs. For output, I can see three reasonable > >> possibilities: > >> > >> A. Pretend WITH OIDS wasn't mentioned. This might seem to be > >> "do the right thing", but a rather strong objection is that the > >> app will not get back the data it was expecting. > >> > >> B. Return NULLs or 0s for the OIDs column. > >> > >> C. Raise an error and refuse to do the copy at all. > >> > >> C is probably the most conservative answer. > > > If we fail on load, we should fail on dump. Why not fail on COPY WITH > > OIDS on a non-oid table? > > I'm confused --- I was proposing that we *not* fail on load. What's the > point of failing on load? I meant to say we should fail on dump _and_ load. If we don't we are throwing away the oid's they are loading because though the table has no oid column. Seems like something that should fail. > > >> How so? pg_description is broken anyway given that we don't enforce OID > >> uniqueness across system catalogs. Also, in the future we could > > > We have a script to detect them and the oid counter it unique. In what > > way do we not enforce it. > > In a running system, once the OID counter wraps around there's no > guarantee that you won't have duplicate OIDs in different system > tables. The only enforcement mechanism we have is the unique indexes, > and those will only check per-table. However, that's fine --- it's > as much as we need. For everything except pg_description, that is. > Since pg_description currently makes an unchecked and uncheckable > assumption of global uniqueness of OIDs, it's broken. If you consider random table creation failures acceptible. In oid wraparound, whether pg_description could point to two rows with the same oid is the smallest part of our problem. I think the whole idea we can run reliably with an oid wraparound is questionable. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Maybe I'm being horribly stupid here, but.... If the thinking is that some tables can escape having an OID, thus meaning OIDs can be controlled by table, how hard would it be to have an OID range on a per table basis? Where each table to have its own notion of an OID, then OID wrap/depletion should be minimal. Tom Lane wrote: > > Given Hiroshi's objections, and the likelihood of compatibility problems > for existing applications, I am now thinking that it's not a good idea to > turn off OID generation by default. (At least not for 7.2 --- maybe in > some future release we could change the default.) > > Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. > > Note: there was some discussion of a GUC variable to control the default. > I'm leaning against this, mainly because having one would mean that > pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE; > else it couldn't be sure that the database schema would be correctly > reconstructed. That would create dump-script portability problems and > negate some of the point of having a GUC variable in the first place. > So I'm thinking a fixed default is better. > > Note: an alternative syntax possibility is to make it look like the "with" > option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". > This is uglier today, but would start to look more attractive if we invent > additional CREATE TABLE options in the future --- there'd be a place to > put 'em. Comments? > > 2. A child table will be forced to have OIDs if any of its parents do, > even if WITHOUT OIDS is specified in the child's CREATE command. This is > on the theory that the OID ought to act like an inherited column. > > 3. For a table without OIDs, no entry will be made in pg_attribute for > the OID column, so an attempt to reference the OID column will draw a > "no such column" error. (An alternative is to allow OID to read as nulls, > but it seemed that people preferred the error to be raised.) > > 4. When inserting into an OID-less table, the INSERT result string will > always show 0 for the OID. > > 5. A "relhasoids" boolean column will be added to pg_class to signal > whether a table has OIDs or not. > > 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the > table has no OIDs. (Alternative possibility: raise an error --- is that > better?) COPY in WITH OIDS will silently drop the incoming OID values. > > 7. Physical tuple headers won't change. If no OIDs are assigned for a > particular table, the OID field in the header will be left zero. > > 8. OID generation will be disabled for those system tables that don't need > it --- pg_listener, pg_largeobject, and pg_attribute being some major > offenders that consume lots of OIDs. > > 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > pg_description will be modified so that its primary key is (object type, > object OID, column number) --- this also solves the problem that comments > break if there are duplicate OIDs in different system tables. The object > type is the OID of the system catalog in which the object OID appears. > The column number field will be zero for all object types except columns. > For a column comment, the object type and OID fields will refer to the > parent table, and column number will be nonzero. > > 10. pg_dump will be modified to do the appropriate things with OIDs. Are > there any other application programs that need to change? > > We had also talked about adding an INSERT ... RETURNING feature to allow > applications to eliminate their dependence on looking at the OID returned > by an INSERT command. I think this is a good idea, but there are still > a number of unsolved issues about how it should interact with rules. > Accordingly, I'm not going to try to include it in this batch of work. > > Comments? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com
Tom Lane wrote: > > Given Hiroshi's objections, and the likelihood of compatibility problems > for existing applications, I am now thinking that it's not a good idea to > turn off OID generation by default. (At least not for 7.2 --- maybe in > some future release we could change the default.) > > Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: Also OIDS should be promoted to 8-byte integers at some future time. > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. > > Note: there was some discussion of a GUC variable to control the default. > I'm leaning against this, mainly because having one would mean that > pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE; > else it couldn't be sure that the database schema would be correctly > reconstructed. That would create dump-script portability problems and > negate some of the point of having a GUC variable in the first place. > So I'm thinking a fixed default is better. > > Note: an alternative syntax possibility is to make it look like the "with" > option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". > This is uglier today, but would start to look more attractive if we invent > additional CREATE TABLE options in the future --- there'd be a place to > put 'em. Comments? > > 2. A child table will be forced to have OIDs if any of its parents do, > even if WITHOUT OIDS is specified in the child's CREATE command. This is > on the theory that the OID ought to act like an inherited column. > > 3. For a table without OIDs, no entry will be made in pg_attribute for > the OID column, so an attempt to reference the OID column will draw a > "no such column" error. (An alternative is to allow OID to read as nulls, > but it seemed that people preferred the error to be raised.) > > 4. When inserting into an OID-less table, the INSERT result string will > always show 0 for the OID. > > 5. A "relhasoids" boolean column will be added to pg_class to signal > whether a table has OIDs or not. > > 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the > table has no OIDs. (Alternative possibility: raise an error --- is that > better?) COPY in WITH OIDS will silently drop the incoming OID values. > > 7. Physical tuple headers won't change. If no OIDs are assigned for a > particular table, the OID field in the header will be left zero. > > 8. OID generation will be disabled for those system tables that don't need > it --- pg_listener, pg_largeobject, and pg_attribute being some major > offenders that consume lots of OIDs. 1-8 sounds good > 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > pg_description will be modified so that its primary key is (object type, > object OID, column number) --- this also solves the problem that comments > break if there are duplicate OIDs in different system tables. Hm.. To me this sounds like allowing duplicates in an unique index in case there happen to be duplicate keys there ;) IMHO duplicate OID's in system tables should be treated as bug's - if they are there they are meant to break stuff. > The object > type is the OID of the system catalog in which the object OID appears. > The column number field will be zero for all object types except columns. > For a column comment, the object type and OID fields will refer to the > parent table, and column number will be nonzero. What happens to columns added to inherited tables ? > 10. pg_dump will be modified to do the appropriate things with OIDs. Are > there any other application programs that need to change? > > We had also talked about adding an INSERT ... RETURNING feature to allow > applications to eliminate their dependence on looking at the OID returned > by an INSERT command. I think this is a good idea, but there are still > a number of unsolved issues about how it should interact with rules. > Accordingly, I'm not going to try to include it in this batch of work. >
Hannu Krosing <hannu@tm.ee> writes: > Tom Lane wrote: >> Based on the discussion so far, here is an attempt to flesh out the >> details of what to do with OIDs for 7.2: > Also OIDS should be promoted to 8-byte integers at some future time. Perhaps, but I'm trying to focus on what to do for 7.2... >> 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, >> pg_description will be modified so that its primary key is (object type, >> object OID, column number) --- this also solves the problem that comments >> break if there are duplicate OIDs in different system tables. > Hm.. To me this sounds like allowing duplicates in an unique index in > case there happen to be duplicate keys there ;) Unless you want to implement a global unique index that can enforce uniqueness of OIDs across all the system tables, I don't think that approach is tenable. pg_description is broken as it stands. Bruce doesn't like the "column number" part of my proposal --- I suppose he'd rather see the pg_description key as just <object type, object OID> with object type referring to pg_attribute if it's a comment on column. That would work too as far as fixing the lack of uniqueness goes, but it still leaves us with pg_attribute as a significant consumer of OIDs. Since the major point of this exercise (in my mind) is cutting the rate of consumption of OIDs to postpone wraparound, I want to suppress OIDs in pg_attribute, and to do that I have to add the column number to pg_description. >> The column number field will be zero for all object types except columns. >> For a column comment, the object type and OID fields will refer to the >> parent table, and column number will be nonzero. > What happens to columns added to inherited tables ? Uh, nothing as far as I can see. We don't presently support auto inheritance of comments-on-columns, if that's what you were asking for. Offhand, making that happen seems about equally easy with either representation of pg_description, so I don't think it's an issue. regards, tom lane
I posted this question earlier, but it looks like it never made it on. If you can control the OIDs on a per table basis, and some tables need not even have any, why not let each table have its own OID range? Essentially, each record will be numbered relative to 0 on its table? That would really cut down the OID wrap around problem, and allow records to have a notion of serialization. Tom Lane wrote: > > Given Hiroshi's objections, and the likelihood of compatibility problems > for existing applications, I am now thinking that it's not a good idea to > turn off OID generation by default. (At least not for 7.2 --- maybe in > some future release we could change the default.) > > Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. > > Note: there was some discussion of a GUC variable to control the default. > I'm leaning against this, mainly because having one would mean that > pg_dump *must* write WITH OIDS or WITHOUT OIDS in every CREATE TABLE; > else it couldn't be sure that the database schema would be correctly > reconstructed. That would create dump-script portability problems and > negate some of the point of having a GUC variable in the first place. > So I'm thinking a fixed default is better. > > Note: an alternative syntax possibility is to make it look like the "with" > option clauses for functions and indexes: "WITH (oids)" or "WITH (noOids)". > This is uglier today, but would start to look more attractive if we invent > additional CREATE TABLE options in the future --- there'd be a place to > put 'em. Comments? > > 2. A child table will be forced to have OIDs if any of its parents do, > even if WITHOUT OIDS is specified in the child's CREATE command. This is > on the theory that the OID ought to act like an inherited column. > > 3. For a table without OIDs, no entry will be made in pg_attribute for > the OID column, so an attempt to reference the OID column will draw a > "no such column" error. (An alternative is to allow OID to read as nulls, > but it seemed that people preferred the error to be raised.) > > 4. When inserting into an OID-less table, the INSERT result string will > always show 0 for the OID. > > 5. A "relhasoids" boolean column will be added to pg_class to signal > whether a table has OIDs or not. > > 6. COPY out WITH OIDS will ignore the "WITH OIDS" specification if the > table has no OIDs. (Alternative possibility: raise an error --- is that > better?) COPY in WITH OIDS will silently drop the incoming OID values. > > 7. Physical tuple headers won't change. If no OIDs are assigned for a > particular table, the OID field in the header will be left zero. > > 8. OID generation will be disabled for those system tables that don't need > it --- pg_listener, pg_largeobject, and pg_attribute being some major > offenders that consume lots of OIDs. > > 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > pg_description will be modified so that its primary key is (object type, > object OID, column number) --- this also solves the problem that comments > break if there are duplicate OIDs in different system tables. The object > type is the OID of the system catalog in which the object OID appears. > The column number field will be zero for all object types except columns. > For a column comment, the object type and OID fields will refer to the > parent table, and column number will be nonzero. > > 10. pg_dump will be modified to do the appropriate things with OIDs. Are > there any other application programs that need to change? > > We had also talked about adding an INSERT ... RETURNING feature to allow > applications to eliminate their dependence on looking at the OID returned > by an INSERT command. I think this is a good idea, but there are still > a number of unsolved issues about how it should interact with rules. > Accordingly, I'm not going to try to include it in this batch of work. > > Comments? > > regards, tom lane > > ---------------------------(end of broadcast)--------------------------- > TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com
mlw <markw@mohawksoft.com> writes: > I posted this question earlier, but it looks like it never made it on. You did post it, and I answered it: no can do with anything close to the current implementation of the OID generator. We have one counter for the whole system, not per-table state. regards, tom lane
Tom Lane wrote: > > Hannu Krosing <hannu@tm.ee> writes: > >> 9. To continue to support COMMENT ON COLUMN when columns have no OIDs, > >> pg_description will be modified so that its primary key is (object type, > >> object OID, column number) --- this also solves the problem that comments > >> break if there are duplicate OIDs in different system tables. > > > Hm.. To me this sounds like allowing duplicates in an unique index in > > case there happen to be duplicate keys there ;) > > Unless you want to implement a global unique index Or insert/update trigger > that can enforce uniqueness of OIDs across all the system tables, I don't > think that approach is tenable. As I wrote in another mail to this list, AFAIK OID is supposed to be Object Identifier - something that can be used to identify any object in a unique fashion. When (and if ;) we will implement SQL3's UNDER, we should, IMHO, make a primary key inherited and *unique* over all tables created UNDER the main table, meaning that we will need a way to have uniqueness constraint spanning multiple tables. ( At least logically multiple tables, as IMHO UNDER with its single inheritance is best implemented in a single table with a bit more flexible column structure. ) At that time we could theoretically inherit all system tables that have OID column from table "pg_system(oid oid primary key);" > pg_description is broken as it stands. Bruce > doesn't like the "column number" part of my proposal --- I suppose he'd > rather see the pg_description key as just <object type, object OID> with > object type referring to pg_attribute if it's a comment on column. > That would work too as far as fixing the lack of uniqueness goes, but it > still leaves us with pg_attribute as a significant consumer of OIDs. That would probably be a problem with 4-byte OIDs, there is an ample supply of 8-byte ones I do like dropping OID from pg_listener, as it is a mostly empty and really rapidly changing table, but I see little value in dropping oid from pg_attribute. BTW, don't indexes, triggers or saved plans use OIDs from pg_attribute ? > Since the major point of this exercise (in my mind) is cutting the rate > of consumption of OIDs to postpone wraparound, I want to suppress OIDs > in pg_attribute, and to do that I have to add the column number to > pg_description. I still think that going to 8-byte OIDs would be the best use of your time ;) If you can make the size of oid's a compile time option, then even better. Postponing the wraparound by the means you describe may be a fools errand anyway, as there are other ways to quickly consume oids that are very likely as common as those involving pg_listener, pg_largeobject, and pg_attribute. Also computers still get faster, and disks still get bigger at the rate I doubt you will be able to match by finding ways to postpone the wraparound. So here I'd like to contradict Vadim's claim that the time of simple solutions is over for PostgreSQL - making OID bigger is at least conceptually simple, it's just "a small matter of programming" ;) -------------- Hannu
mlw wrote: > > I posted this question earlier, but it looks like it never made it on. > > If you can control the OIDs on a per table basis, and some tables need not even > have any, why not let each table have its own OID range? Essentially, each > record will be numbered relative to 0 on its table? > > That would really cut down the OID wrap around problem, and allow records to > have a notion of serialization. What would the meaning of such an "OID" be ? Apart from duplicating the primary key that is ? ------------------ Hannu
Hannu Krosing <hannu@tm.ee> writes: > That would probably be a problem with 4-byte OIDs, there is an ample > supply of 8-byte ones Sure, but I think we are still a few years away from being able to assume that every platform of interest can support 8-byte OIDs (and furthermore, won't see a significant performance degradation --- keep in mind that widening Datum to 8 bytes is a change that affects all datatypes not just Oid). There's also the Oids-are-in-the-wire-protocol problem. In short, that's a long-term solution not a near-term one. > BTW, don't indexes, triggers or saved plans use OIDs from pg_attribute ? Nope. pg_description is the only offender. regards, tom lane
Tom Lane wrote: > > mlw <markw@mohawksoft.com> writes: > > I posted this question earlier, but it looks like it never made it on. > > You did post it, Sorry, it never got to me. > and I answered it: no can do with anything close to the > current implementation of the OID generator. We have one counter for > the whole system, not per-table state. That's a bummer. The concept of a ROWID is really useful, especially for those that come from an Oracle background, or porting Oracle queries. -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com
Hannu Krosing wrote: > > mlw wrote: > > > > I posted this question earlier, but it looks like it never made it on. > > > > If you can control the OIDs on a per table basis, and some tables need not even > > have any, why not let each table have its own OID range? Essentially, each > > record will be numbered relative to 0 on its table? > > > > That would really cut down the OID wrap around problem, and allow records to > > have a notion of serialization. > > What would the meaning of such an "OID" be ? > > Apart from duplicating the primary key that is ? Some other databases have the notion of a ROWID which uniquely identifies a row within a table. OID can be used for that, but it means if you use it, you must limit the size of your whole database system. The other alternative is to make a column called "rowid" and a sequence for it and a default of nextval('table_rowid'). That means more work for those porting. My thinking was that if the "OID" issue was being addressed, maybe it could be changed quite a bit. The problem with the current OID is that it severely limits the capacity of the database AND does not carry with it enough information. For instance, as far as I can see, one can not take an OID and make any sort of determination about what it is. One also needs to know the table and the database from which it was retrieved. So an OID is meaningless without the contextual information. Why should it be a system wide limitation when it needs to be used in the context of a specific table? That way PostgreSQL has a knowable 4B (or 2B signed) record limit per table, not per system. One could create a new virtual OID like thing, called SYSOID, or something, which is a 64 bit value, the upper 4 bytes being the OID of the table from the catalog, and the lower 4 bytes being the OID of the record. The SYSOID would really tell you something! Given a SYSOID you could find the database, the table, and the record. -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com
Tom Lane wrote: > [Snipped] I think making "WITHOUT OIDS" the default for table creation is the right thing to do. Here is my reasoning: An OID is a system wide limitation. 4B or 2B depending on sign-ness. (could there be some bugs still lurking on high OIDs?) Since the OID is a shared system wide limited resource, it should be limited to "system" tables which require it. To new comers to PostgreSQL this limitation will not be obvious until they hit it. Then they will kick themselves for not reading more carefully. An OID does not add any real value to the database developer. Given an OID, one can not determine anything about the record it represents. One also needs the table and database from which it came, and even then one has to create an index on the OID column on the table to get to the record efficiently. It can only indicate the order in which records were entered. If people need something like OID for their tables, the documented "preferred way" could be: create sequence fubar; create table fubar (rowid integer default nextval('fubar_seq'),... ); Then explain that they can use "WITH OID" but there is a system wide limit. On a side note: I know it is probably a lot of work, and it has been shot down once, but the notion of a rowid built into a table would be useful. It would solve wrap around and keep the useful functionality of OID, and be more efficient and robust than using the sequence.
The analog of ROWID in PostgreSQL is TID rather than OID because TID is a physical address of a tuple within a table. However there's a significant difference. Unfortunately TID is transient. It is changed by UPDATE and VACUUM. Though TIDs are unavailable for critical use, OIDs could compensate the drawback. TIDs and OIDs must help each other if PostgreSQL needs the concept like ROWID. regards, Hiroshi Inoue > -----Original Message----- > From: mlw > > Hannu Krosing wrote: > > > > mlw wrote: > > > > > > I posted this question earlier, but it looks like it never made it on. > > > > > > If you can control the OIDs on a per table basis, and some > tables need not even > > > have any, why not let each table have its own OID range? > Essentially, each > > > record will be numbered relative to 0 on its table? > > > > > > That would really cut down the OID wrap around problem, and > allow records to > > > have a notion of serialization. > > > > What would the meaning of such an "OID" be ? > > > > Apart from duplicating the primary key that is ? > > Some other databases have the notion of a ROWID which uniquely > identifies a row > within a table. OID can be used for that, but it means if you use > it, you must > limit the size of your whole database system. The other > alternative is to make > a column called "rowid" and a sequence for it and a default of > nextval('table_rowid'). That means more work for those porting. >
Hiroshi Inoue wrote: > > The analog of ROWID in PostgreSQL is TID rather than OID > because TID is a physical address of a tuple within a table. > However there's a significant difference. Unfortunately TID > is transient. It is changed by UPDATE and VACUUM. > Though TIDs are unavailable for critical use, OIDs could > compensate the drawback. TIDs and OIDs must help each > other if PostgreSQL needs the concept like ROWID. That is true now, but I am saying that it should not be true. Rather than have a single limited global resource, the current OID, if possible, tables should get their own notion of an OID, like a ROWID. The ability to eliminated OID from tables is a great step, but, if one needs a OID behavior on tables, then one has a limit of 2B-4B rows in an entire database system for which all tables compete. You have even said you need the notion of an OID for some ODBC cursor stuff you are doing. Thus eliminating OIDs is not an option for you. The options are: No OID on a table. This breaks any code that assumes an OID must always exist. Use OIDs on a table. This limits the size of the database, I have already had to drop and reload a database once because of OID depletion (3 months). If OIDs can become the equivalent of a ROWID, then code designed that assumes OID are always valid will still work, and Postgres will not run out of OIDs in system wide sense. I know I won't be doing the work to make the changes, so I am sensitive to that issue, but as a PostgreSQL user, I can say that I have hit the OID limit once already and will continue to hit it periodically. Getting rid of OIDs may not be an option for me because I planning to do some replication across several boxes, and that means I would use OID or use a sequence and "default nextval(...)." -- 5-4-3-2-1 Thunderbirds are GO! ------------------------ http://www.mohawksoft.com
mlw wrote: > > Hiroshi Inoue wrote: > > > > The analog of ROWID in PostgreSQL is TID rather than OID > > because TID is a physical address of a tuple within a table. > > However there's a significant difference. Unfortunately TID > > is transient. It is changed by UPDATE and VACUUM. > > Though TIDs are unavailable for critical use, OIDs could > > compensate the drawback. TIDs and OIDs must help each > > other if PostgreSQL needs the concept like ROWID. > > That is true now, but I am saying that it should not be true. Rather than have > a single limited global resource, the current OID, if possible, tables should > get their own notion of an OID, like a ROWID. > I've objected optional OID but never objected OIDs per table. OIDs per table is more important than others IMHO. regards, Hiroshi Inoue
> Based on the discussion so far, here is an attempt to flesh out the > details of what to do with OIDs for 7.2: > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > The default behavior will be WITH OIDS. What about having an additional Oid generator which solely serves for supplying user tables' per row Oids? It seems relatively easy to implement, comparing with 64-bit Oids or Oid-less tables. I assume that the Oid wraparound problem is not so serious with user tables. -- Tatsuo Ishii
> > Based on the discussion so far, here is an attempt to flesh out the > > details of what to do with OIDs for 7.2: > > > > 1. Add an optional clause "WITH OIDS" or "WITHOUT OIDS" to CREATE TABLE. > > The default behavior will be WITH OIDS. > > What about having an additional Oid generator which solely serves for > supplying user tables' per row Oids? It seems relatively easy to > implement, comparing with 64-bit Oids or Oid-less tables. I assume > that the Oid wraparound problem is not so serious with user tables. This is a very interesting idea. Have two oid counters, one for system tables and another for user tables. It isolates problems with oid wraparound caused by large user tables. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian <pgman@candle.pha.pa.us> writes: > This is a very interesting idea. Have two oid counters, one for system > tables and another for user tables. It isolates problems with oid > wraparound caused by large user tables. Well, it'd keep user-space wraparound from affecting the system tables, but given that the system tables have adequate defenses already (ie, unique indexes) I'm not sure that there's any point. It'd not improve the picture for user-table OID uniqueness by any measurable degree. regards, tom lane
> Well, it'd keep user-space wraparound from affecting the system tables, > but given that the system tables have adequate defenses already (ie, > unique indexes) I'm not sure that there's any point. It'd not improve > the picture for user-table OID uniqueness by any measurable degree. But from the point of users' view, it does not prevent "create XXX comand fails due to Oid wraparounding" problems, no? Also I am worried about the performance of the per table Oid generators. Even the system tables going to have that kind of generators? What would happend if there are 5k tables in a database? It's not very rare situation in a large installation. -- Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes: > Also I am worried about the performance of the per table Oid > generators. I think performance would be a big problem if we tried to implement them just like sequences are done now. But a shared hashtable of Oid generators (each one handled roughly like the single Oid generator currently is) would probably work okay. We'd have to work out how to have a backing disk table for this hashtable, since we couldn't expect to have room in shared memory for all generators at all times --- but we could cache all the active generators in shared memory, I'd think. regards, tom lane
Tom Lane wrote: > Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > Also I am worried about the performance of the per table Oid > > generators. > > I think performance would be a big problem if we tried to implement them > just like sequences are done now. But a shared hashtable of Oid > generators (each one handled roughly like the single Oid generator > currently is) would probably work okay. We'd have to work out how to > have a backing disk table for this hashtable, since we couldn't expect > to have room in shared memory for all generators at all times --- but we > could cache all the active generators in shared memory, I'd think. Maybe I'm confused, is there no shared memory for each unique table in use? If so, couldn't the Oid generator be stored there? if not, how does that work? Second, IMHO I think you are a bit too conservative with shared memory. If one has so many active tables that their Oid generators wouldn't fit in shared memory, this would indicate a fairly large database, I think one could be justified in requiring more resources than the minimum. PostgreSQL is already increasing in resource requirements. The introduction of WAL added a lot of disk space for operation. A few K of shared RAM doesn't seem like a lot. (Maybe I am jaded as I have bumped my shared memory to 128M) Lastly, were PostgreSQL to have multiple Oid generators, each of these could have its own spinlock or mutex, thus reducing competition. In an active system with activity on multiple tables, this could improve performance.
Tom Lane wrote: > Tatsuo Ishii <t-ishii@sra.co.jp> writes: > > Also I am worried about the performance of the per table Oid > > generators. > > I think performance would be a big problem if we tried to implement them > just like sequences are done now. But a shared hashtable of Oid > generators (each one handled roughly like the single Oid generator > currently is) would probably work okay. We'd have to work out how to > have a backing disk table for this hashtable, since we couldn't expect > to have room in shared memory for all generators at all times --- but we > could cache all the active generators in shared memory, I'd think. Keep also in mind that actually the uniqueness of Oid's across all tables is used by TOAST to determine that a toasted value found in the new tuple is the same than in the old one on heap_update() or not. If we go for aseparate Oid per table, an UPDATE with a subselect from another table could get misinterpreted in the toaster,not duplicating the value but referencing the external value in another tables toast-shadow table. It's no big deal, some additional checks of the va_toastrelid beeing the same as the target relations toast relationshould do it. Now since toast needs the row Oid allways, I think the idea of making Oid's in user tables optional is dead. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck@Yahoo.com # _________________________________________________________ Do You Yahoo!? Get your free @yahoo.com address at http://mail.yahoo.com
Jan Wieck <JanWieck@yahoo.com> writes: > Keep also in mind that actually the uniqueness of Oid's > across all tables is used by TOAST to determine that a > toasted value found in the new tuple is the same than in the > old one on heap_update() or not. > It's no big deal, some additional checks of the va_toastrelid > beeing the same as the target relations toast relation should > do it. Good point. > Now since toast needs the row Oid allways, I think the idea > of making Oid's in user tables optional is dead. Why? I see where it's looking at the main-row OID and attno to decide if it's the same value or not, but this seems strange and wrong. Why doesn't it just compare va_toastrelid and va_valueid? Considering that the main point of this comparison is to distinguish values associated with different versions of the same row, neither main row OID nor attribute number seem helpful. I don't see why we expend space on storing va_rowid + va_attno at all. BTW, I've already completed implementing optional OIDs, so I'm not going to give up the idea lightly at this point ;-) regards, tom lane