Thread:
In some older mails in the archive I found rumors about difficulcies that might occur when OIDs are used as an integral part of a data model. I am considering the option of placing an index on the already existing oid and using it as the primary key for all tables (saves some space and a sequence lookup). This includes saving the oid in foreign keys (virtual ones, not actually declared references). I read that using OID in keys is generally a bad idea. Is it really? Why exactly? Are there any disadvantages as to reliability or performance apart from accidentally forgetting to use the -o option with pg_dump? If so, please give details. I felt especially worried by a postgres developer's statement in another archived mail: "As far as I know, there is no reason oid's have to be unique, especially if they are in different tables." (http://archives.postgresql.org/pgsql-hackers/1998-12/msg00570.php) How unique are oids as of version 7.3 of postgres ? Is it planned to keep oids semantically the same in future releases of postgres? Will the oid type be extended so that oids can be larger than 4 bytes (if this is still correct for 7.3) and do not rotate in large systems? Thanks for your time and advice. Daniel Alvarez <d-alvarez@gmx.de> -- +++ GMX - Mail, Messaging & more http://www.gmx.net +++ Bitte lächeln! Fotogalerie online mit GMX ohne eigene Homepage!
On Wednesday 26 Feb 2003 1:04 pm, daniel alvarez wrote: > In some older mails in the archive I found rumors about difficulcies that > might > occur when OIDs are used as an integral part of a data model. > > I am considering the option of placing an index on the already existing oid > and using > it as the primary key for all tables (saves some space and a sequence > lookup). This > includes saving the oid in foreign keys (virtual ones, not actually > declared references). > I read that using OID in keys is generally a bad idea. Is it really? Why > exactly? OIDs are not even guaranteed to be there any more - you can create a table WITHOUT OIDs if you want to save some space. If you want a numeric primary key, I'd recommend int4/int8 attached to a sequence - it's much clearer what's going on then. > Are there any disadvantages as to reliability or performance apart from > accidentally > forgetting to use the -o option with pg_dump? If so, please give details. > > I felt especially worried by a postgres developer's statement in another > archived mail: > "As far as I know, there is no reason oid's have to be unique, especially > if they are in different tables." > (http://archives.postgresql.org/pgsql-hackers/1998-12/msg00570.php) > > How unique are oids as of version 7.3 of postgres ? OIDs are unique per object (table) I believe, no more so. See chapter 5.10 of the user guide for details. They are used to identify system objects and so the fact that a function and a table could both have the same OID should cause no problems. > Is it planned to keep oids semantically the same in future releases of > postgres? Couldn't say - don't see why not. > Will the oid type be extended so that oids can be larger than 4 bytes (if > this is still > correct for 7.3) and do not rotate in large systems? Strikes me as unlikely, though I'm not a developer. Look into 8-byte serial/sequences. -- Richard Huxton
> > I am considering the option of placing an index on the already existing oid > > and using it as the primary key for all tables (saves some space and a sequence > > lookup). This includes saving the oid in foreign keys (virtual ones, not actually > > declared references). I read that using OID in keys is generally a bad idea. > > Is it really? Why exactly? > OIDs are not even guaranteed to be there any more - you can create a table > WITHOUT OIDs if you want to save some space. If you want a numeric primary > key, I'd recommend int4/int8 attached to a sequence - it's much clearer what's > going on then. Of course this is a cleaner solution. I did not know that oids can be supressed and was looking for a way to make space usage more efficient. Trying to get rid of user- defined surrogate primary keys and substitute them by the already existing OID is obviously the wrong approch, as postgres already defines a cleaner option. There can also be some problems when using replication, because one needs to make sure that OIDs are the same on all machines in the cluster. Why should user-defined tables have OIDs by default? Other DBMS use ROWIDs as the physical storage location used for pointers in index leafs, but this is equivalent to Postgres TIDs. To the user an OID column is not different than any other column he can define himself. I'd find it more natural if the column wasn't there at all. Daniel Alvarez -- +++ GMX - Mail, Messaging & more http://www.gmx.net +++ Bitte lächeln! Fotogalerie online mit GMX ohne eigene Homepage!
daniel alvarez <d-alvarez@gmx.de> writes: > Why should user-defined tables have OIDs by default? At this point it's just for historical reasons. There actually is a proposal on the table to flip the default to WITHOUT OIDS, but so far it's not been accepted because of worries about compatibility. See the pghackers archives a few weeks back. regards, tom lane
Richard Huxton <dev@archonet.com> writes: >> How unique are oids as of version 7.3 of postgres ? > OIDs are unique per object (table) I believe, no more so. Even then, you should only assume uniqueness if you put a unique index on the table's OID column to enforce it. (The system catalogs that use OID all have such indexes.) Without that, you might have duplicates after the OID counter wraps around. >> Will the oid type be extended so that oids can be larger than 4 bytes (if >> this is still correct for 7.3) and do not rotate in large systems? > Strikes me as unlikely, though I'm not a developer. I tend to agree. At one point that was a live possibility, but now we're more likely to change the default for user tables to WITHOUT OIDS and declare the problem solved. Making OIDs 8 bytes looks like too much of a performance hit for non-64-bit machines. (Not to mention machines that haven't got "long long" at all; right now the only thing that doesn't work for them is type int8, and I'd like it to stay that way, at least for a few more years.) regards, tom lane
On Wednesday 26 Feb 2003 2:59 pm, daniel alvarez wrote: > > Why should user-defined tables have OIDs by default? Other DBMS use ROWIDs > as the physical storage location used for pointers in index leafs, but this > is equivalent > to Postgres TIDs. To the user an OID column is not different than any other > column > he can define himself. I'd find it more natural if the column wasn't there > at all. I believe the plan is to phase them out, but some people are using them, so the default is still to create them. Imagine if you were using OIDs as keys and after a dump/restore they were all gone... -- Richard Huxton
Daniel, > There can also be some problems when using replication, because one needs > to make > sure that OIDs are the same on all machines in the cluster. See the "uniqueidentifier" contrib package if you need a universally unique id for replication. > I'd find it more natural if the column wasn't there > at all. Probably by Postgres 8.0, it won't be. -- Josh Berkus Aglio Database Solutions San Francisco
> daniel alvarez <d-alvarez@gmx.de> writes: > > Why should user-defined tables have OIDs by default? > > At this point it's just for historical reasons. There actually is a > proposal on the table to flip the default to WITHOUT OIDS, but so far > it's not been accepted because of worries about compatibility. See > the pghackers archives a few weeks back. Shall I include a patch to pg_dump that will explicitly set WITH OIDS when I submit this SET STORAGE dumping patch? Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > Shall I include a patch to pg_dump that will explicitly set WITH OIDS when I > submit this SET STORAGE dumping patch? Not if you want it to be accepted ;-) We pretty much agreed we did not want that in the prior thread. regards, tom lane
> "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > > Shall I include a patch to pg_dump that will explicitly set WITH OIDS when I > > submit this SET STORAGE dumping patch? > > Not if you want it to be accepted ;-) > > We pretty much agreed we did not want that in the prior thread. The patch I submitted did not include OID stuff, I decided that it's better to submit orthogonal patches :) Chris
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > The patch I submitted did not include OID stuff, I decided that it's better > to submit orthogonal patches :) Right. But the problem with switching the OID default is not a matter of code --- it's of working out what the compatibility issues are. As I recall, one thing people did not want was for pg_dump to plaster WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would pretty much destroy any shot at loading PG dumps into any other database. What we need is an agreement on the behavior we want (making the best possible compromise between this and other compatibility desires). After that, the actual patch is probably trivial, while in advance of some consensus on the behavior, offering a patch is a waste of time. regards, tom lane
"Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: >> As I recall, one thing people did not want was for pg_dump to plaster >> WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would >> pretty much destroy any shot at loading PG dumps into any other >> database. > Ummm...what about SERIAL columns, ALTER TABLE / SET STATS, SET STORAGE, > custom types, 'btree' in CREATE INDEX, SET SEARCH_PATH, '::" cast operator, > stored procedures, rules, etc. - how is adding WITH OIDS going to change > that?! It's moving in the wrong direction. We've been slowly eliminating unnecessary nonstandardisms in pg_dump output; this puts in a new one in a quite fundamental place. You could perhaps expect another DB to drop commands it didn't understand like SET SEARCH_PATH ... but if it drops all your CREATE TABLEs, you ain't got much dump left to load. I'm not necessarily wedded to the above argument myself, mind you; but it is a valid point that needs to be weighed in the balance of what we're trying to accomplish. The bottom line is that "code first, design later" is no way to approach this problem. regards, tom lane
Tom wrote: > >As I recall, one thing people did not want was for pg_dump to plaster >WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would >pretty much destroy any shot at loading PG dumps into any other >database. Has there been any talk about adding a flag to pg_dump to explicitly ask for a standard format? (sql3? sql99? etc?) Ideally for me, such a flag would produce a "portable" dump file of the subset that did follow the standard, and also produce a separate log file that could contain any constructs that could not be standardly dumped. If such a flag existed, it might be the easiest way to load to other databases, and people might be less opposed to plastering more postgres specific stuff to the default format. If I were going to try to write portable dumps for other databases, I'd want as few postgresisms in the big file as possible. The separate log file would make it easier for me to make hand-ported separate files to set up functions, views, etc. Yes, I know that would restrict the functionality I could depend on, including types, sequences, etc. However if I were in an environment where developers did prototyping on postgres / mssql /etc and migrated functionality to whatever the company's official standard system was, I think developers would want to constrain themselves to standards as much as possible, and this option may help them do so. Ron PS: I'm not sure if I'm volunteering or not, unless someone tells me how easy/hard it would be. Last thing I tried was harder than it first appeared.
> Right. But the problem with switching the OID default is not a matter > of code --- it's of working out what the compatibility issues are. > As I recall, one thing people did not want was for pg_dump to plaster > WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would > pretty much destroy any shot at loading PG dumps into any other > database. Ummm...what about SERIAL columns, ALTER TABLE / SET STATS, SET STORAGE, custom types, 'btree' in CREATE INDEX, SET SEARCH_PATH, '::" cast operator, stored procedures, rules, etc. - how is adding WITH OIDS going to change that?! > What we need is an agreement on the behavior we want (making > the best possible compromise between this and other compatibility > desires). After that, the actual patch is probably trivial, while in > advance of some consensus on the behavior, offering a patch is a waste > of time. Sure. Chris
> Right. But the problem with switching the OID default is not a matter > of code --- it's of working out what the compatibility issues are. > As I recall, one thing people did not want was for pg_dump to plaster > WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would > pretty much destroy any shot at loading PG dumps into any other > database. Ummm...what about SERIAL columns, ALTER TABLE / SET STATS, SET STORAGE, custom types, 'btree' in CREATE INDEX, SET SEARCH_PATH, '::" cast operator, stored procedures, rules, etc. - how is adding WITH OIDS going to change that?! > What we need is an agreement on the behavior we want (making > the best possible compromise between this and other compatibility > desires). After that, the actual patch is probably trivial, while in > advance of some consensus on the behavior, offering a patch is a waste > of time. Sure. Chris
On Thu, 2003-02-27 at 02:30, Tom Lane wrote: > It's moving in the wrong direction. We've been slowly eliminating > unnecessary nonstandardisms in pg_dump output; this puts in a new one > in a quite fundamental place. You could perhaps expect another DB > to drop commands it didn't understand like SET SEARCH_PATH ... but if > it drops all your CREATE TABLEs, you ain't got much dump left to load. Rather than specifying the use of OIDs by WITH OIDS clauses for each CREATE TABLE in a dump, couldn't we do it by adding a SET command that toggles the 'use_oids' GUC option prior to every CREATE TABLE? That way, a user concerned with portability could fairly easily strip out (or just ignore) the SET commands. Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
On Wed, 2003-03-05 at 08:54, Neil Conway wrote: > On Thu, 2003-02-27 at 02:30, Tom Lane wrote: > > It's moving in the wrong direction. We've been slowly eliminating > > unnecessary nonstandardisms in pg_dump output; this puts in a new one > > in a quite fundamental place. You could perhaps expect another DB > > to drop commands it didn't understand like SET SEARCH_PATH ... but if > > it drops all your CREATE TABLEs, you ain't got much dump left to load. > > Rather than specifying the use of OIDs by WITH OIDS clauses for each > CREATE TABLE in a dump, couldn't we do it by adding a SET command that > toggles the 'use_oids' GUC option prior to every CREATE TABLE? That way, > a user concerned with portability could fairly easily strip out (or just > ignore) the SET commands. Toggling the SET command prior to each table creation? Thats an excellent idea. It should also allow us to easily transition to the default being off after a release or two. -- Rod Taylor <rbt@rbt.ca> PGP Key: http://www.rbt.ca/rbtpub.asc
Attachment
Neil Conway <neilc@samurai.com> writes: > Rather than specifying the use of OIDs by WITH OIDS clauses for each > CREATE TABLE in a dump, couldn't we do it by adding a SET command that > toggles the 'use_oids' GUC option prior to every CREATE TABLE? Seems better than cluttering the CREATE TABLE itself with them, I guess. regards, tom lane
Tom Lane wrote: > "Christopher Kings-Lynne" <chriskl@familyhealth.com.au> writes: > >> As I recall, one thing people did not want was for pg_dump to plaster > >> WITH OIDS or WITHOUT OIDS on every single CREATE TABLE, as this would > >> pretty much destroy any shot at loading PG dumps into any other > >> database. > > > Ummm...what about SERIAL columns, ALTER TABLE / SET STATS, SET STORAGE, > > custom types, 'btree' in CREATE INDEX, SET SEARCH_PATH, '::" cast operator, > > stored procedures, rules, etc. - how is adding WITH OIDS going to change > > that?! > > It's moving in the wrong direction. We've been slowly eliminating > unnecessary nonstandardisms in pg_dump output; this puts in a new one > in a quite fundamental place. You could perhaps expect another DB > to drop commands it didn't understand like SET SEARCH_PATH ... but if > it drops all your CREATE TABLEs, you ain't got much dump left to load. Why was the schema path called search_path rather than schema_path? Standards? -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Tom Lane wrote: > Neil Conway <neilc@samurai.com> writes: > > Rather than specifying the use of OIDs by WITH OIDS clauses for each > > CREATE TABLE in a dump, couldn't we do it by adding a SET command that > > toggles the 'use_oids' GUC option prior to every CREATE TABLE? > > Seems better than cluttering the CREATE TABLE itself with them, I guess. It would be good to somehow SET the use_oids GUC value on restore start, and just use SET when the table is different than the default, but then there is no mechanism to do that when you restore a single table. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Bruce Momjian <pgman@candle.pha.pa.us> writes: > Why was the schema path called search_path rather than schema_path? Nobody suggested anything different ... it's a bit late now ... regards, tom lane
Tom Lane wrote: > Bruce Momjian <pgman@candle.pha.pa.us> writes: > > Why was the schema path called search_path rather than schema_path? > > Nobody suggested anything different ... it's a bit late now ... I started to think about it when we were talking about a config_path variable. Search path then looked confusing. :-( -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
On Thu, 2003-03-06 at 16:13, Bruce Momjian wrote: > It would be good to somehow SET the use_oids GUC value on restore start, > and just use SET when the table is different than the default, but then > there is no mechanism to do that when you restore a single table. What if the default value changes? IMHO, running a SET per CREATE TABLE isn't too ugly... Cheers, Neil -- Neil Conway <neilc@samurai.com> || PGP Key ID: DB3C29FC
Neil Conway wrote: > On Thu, 2003-03-06 at 16:13, Bruce Momjian wrote: > > It would be good to somehow SET the use_oids GUC value on restore start, > > and just use SET when the table is different than the default, but then > > there is no mechanism to do that when you restore a single table. > > What if the default value changes? > > IMHO, running a SET per CREATE TABLE isn't too ugly... Not ugly, but a little noisy. However, my idea of having a single SET at the top is never going to work, so I don't have a better idea. The killer for me is that you are never going to know the GUC default when you are loading, so we are _always_ going to have that SET for each table. I suppose we could set the default to off, and set it ON in the dump only when we want OID. If they set GUC to on, they will get oid's from the load, but it will cut down on the cruft and over time, they will only have the SET for cases where they really want an oid. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
> Neil Conway wrote: > > On Thu, 2003-03-06 at 16:13, Bruce Momjian wrote: > > > It would be good to somehow SET the use_oids GUC value on restore > start, > > > and just use SET when the table is different than the default, but > then > > > there is no mechanism to do that when you restore a single table. > > > > What if the default value changes? > > > > IMHO, running a SET per CREATE TABLE isn't too ugly... > > Not ugly, but a little noisy. However, my idea of having a single SET > at the top is never going to work, so I don't have a better idea. Why isn't this done on a per-session basis? Having a session setting for the common case and a CREATE-TABLE clause for the specifics sounds natural. When a single table needs to be restored all one needs to to is changing the session setting before running the CREATE command. The alternative clause in CREATE-TABLE statements would be used as a cleaner way of expressing the same thing without affecting the session, when the statement's text can be entered manually (as opposed to loading it from an existing dumpfile). The default for the session setting could be set in the configuration file then. regards, Daniel Alvarez Arribas <d-alvarez@gmx.de> -- +++ GMX - Mail, Messaging & more http://www.gmx.net +++ Bitte lächeln! Fotogalerie online mit GMX ohne eigene Homepage!
daniel alvarez <d-alvarez@gmx.de> writes: >> Not ugly, but a little noisy. However, my idea of having a single SET >> at the top is never going to work, so I don't have a better idea. > Why isn't this done on a per-session basis? Because pg_dump can't know what the session default will be when the dump is reloaded. The scheme you are proposing will only succeed in making pg_dump unreliable. regards, tom lane
> daniel alvarez <d-alvarez@gmx.de> writes: > >> Not ugly, but a little noisy. However, my idea of having a single SET > >> at the top is never going to work, so I don't have a better idea. > > > Why isn't this done on a per-session basis? > > Because pg_dump can't know what the session default will be when the > dump is reloaded. The scheme you are proposing will only succeed in > making pg_dump unreliable. Ouch. Why is this? Doesn't it read the config because of portability reasons? -- +++ GMX - Mail, Messaging & more http://www.gmx.net +++ Bitte lächeln! Fotogalerie online mit GMX ohne eigene Homepage!
daniel alvarez wrote: > > daniel alvarez <d-alvarez@gmx.de> writes: > > >> Not ugly, but a little noisy. However, my idea of having a single SET > > >> at the top is never going to work, so I don't have a better idea. > > > > > Why isn't this done on a per-session basis? > > > > Because pg_dump can't know what the session default will be when the > > dump is reloaded. The scheme you are proposing will only succeed in > > making pg_dump unreliable. > > Ouch. Why is this? Doesn't it read the config because of portability > reasons? Remember the dump output is just an SQL script, so there is no 'logic' in the script, and it can be loaded right into psql. -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073