Thread: searching oid's
I've been contemplating the idea of replacing my current char(32) identification numbers with oid's. Apart from the significant space gained, I was hoping there would also be a speed gain or some other incentives. First, I tried searching tables by their oid, but explain returned sequential scans. Second, I tried specifing the oid as the primary key in a table, but the oid column wasn't found. Finally, I created an index for oid which worked fine. In the end, I feel I'm back to square one having to use the same index as with my char(32) id's. It seems my only gain would be 28 bytes per row and no speed gain, apart perhaps for building the index which should be a bit slower for a char(32) datatype rather than an integer. If there's something I'm missing, I'd appreciate if someone could share their tips and tricks to using oid's more efficiently. Thanks, Marc
> First, I tried searching tables by their oid, but explain returned > sequential scans. Second, I tried specifing the oid as the primary key in > a table, but the oid column wasn't found. Finally, I created an index for > oid which worked fine. In the end, I feel I'm back to square one having to > use the same index as with my char(32) id's. It seems my only gain would > be 28 bytes per row and no speed gain, apart perhaps for building the > index which should be a bit slower for a char(32) datatype rather than an > integer. Using oid's is not a good idea, as they don't automatically get dumped with pg_dump. And once your referential integrity gets screwed up and you are using oid's you are really in a mess, as you cannot change oids. Use a serial field to generate a key for every row, which generates you a sequence of integers. It is much better than oids at a cost of 4 bytes. As far as i have understood you need an index if you want to avoid a sequential scan as tuples are not stored in a hierarchy in the table. Only in indexes do you get b-trees etc. So define your serial field as a primary key and you are done. And comparing 4-byte ints is much faster than comparing 32-byte text fields, that's for sure. Adriaan
Actually, I think pg_dump with the -o flag keeps oid's, therefore allowing me to keep referential integrity after backup. > > First, I tried searching tables by their oid, but explain returned > > sequential scans. Second, I tried specifing the oid as the primary key in > > a table, but the oid column wasn't found. Finally, I created an index for > > oid which worked fine. In the end, I feel I'm back to square one having to > > use the same index as with my char(32) id's. It seems my only gain would > > be 28 bytes per row and no speed gain, apart perhaps for building the > > index which should be a bit slower for a char(32) datatype rather than an > > integer. > > Using oid's is not a good idea, as they don't automatically get dumped with > pg_dump. And once your referential integrity gets screwed up and you are > using oid's you are really in a mess, as you cannot change oids. Use a serial > field to generate a key for every row, which generates you a sequence of > integers. It is much better than oids at a cost of 4 bytes. > > As far as i have understood you need an index if you want to avoid a > sequential scan as tuples are not stored in a hierarchy in the table. Only in > indexes do you get b-trees etc. So define your serial field as a primary key > and you are done. And comparing 4-byte ints is much faster than comparing > 32-byte text fields, that's for sure. > > Adriaan > >
admin wrote: > Actually, I think pg_dump with the -o flag keeps oid's, therefore allowing > me to keep referential integrity after backup. > > > Using oid's is not a good idea, as they don't automatically get dumped with > > pg_dump. And once your referential integrity gets screwed up and you are > > using oid's you are really in a mess, as you cannot change oids. Use a serial > > field to generate a key for every row, which generates you a sequence of > > integers. It is much better than oids at a cost of 4 bytes. Yes oids get dumped with the -o flag. That is why I said automatically. Fact remains that you cannot manipulate oids. Should you ever want to copy a table into an exisiting system you would have to do a new initdb to make sure that the oids in your table are not in use. And if anything ever gets corrupted it is much harder to recover and fix it, as you have no control over the oid values that the system assigns. I would definitely recommend a separate serial value, and I believe this is also what is recommended in the postgres docs. Adriaan
Adriaan Joubert wrote: > > Yes oids get dumped with the -o flag. That is why I said automatically. Fact > remains that you cannot manipulate oids. Should you ever want to copy a table into > an exisiting system you would have to do a new initdb to make sure that the oids > in your table are not in use. And if anything ever gets corrupted it is much > harder to recover and fix it, as you have no control over the oid values that the > system assigns. I would definitely recommend a separate serial value, and I > believe this is also what is recommended in the postgres docs. You make some good points, but where is this recommended in the docs? I'd like to see more of "serial vs oid" if there is something. Bruce's book doesn't recommend one over the other, though it does mention some oid limitations. It doesn't mention that the sequence for a serial value isn't dropped when its table is, btw. I'd be interested in what others have to say on this topic. --Rick
> Adriaan Joubert wrote: > > > > Yes oids get dumped with the -o flag. That is why I said automatically. Fact > > remains that you cannot manipulate oids. Should you ever want to copy a table into > > an exisiting system you would have to do a new initdb to make sure that the oids > > in your table are not in use. And if anything ever gets corrupted it is much > > harder to recover and fix it, as you have no control over the oid values that the > > system assigns. I would definitely recommend a separate serial value, and I > > believe this is also what is recommended in the postgres docs. > > You make some good points, but where is this recommended in the docs? I'd like > to see more of "serial vs oid" if there is something. > > Bruce's book doesn't recommend one over the other, though it does mention some > oid limitations. It doesn't mention that the sequence for a serial value isn't > dropped when its table is, btw. Is there some issue in the comparison I missed? That is a good point about the sequence not being dropped. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000 + If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania 19026
On 2000-01-14, Rick Delaney mentioned: > You make some good points, but where is this recommended in the docs? I'd like > to see more of "serial vs oid" if there is something. Although I don't represent any official opinion, the tendency among the developers would seem to be not using oids for unique values. The serial type is guaranteed to be supported for this but oids just happen to work. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden