Thread: Unique rows without a key
Hi, Is there any way to force every row in a table to be unique without creating a unique key? My tables are fairly large (growing by a gig a day) and creating a primary key would basically double my disk usage. Thanks for your help, Steve Micallef
On Wed, Jun 13, 2001 at 09:00:33AM +1000, some SMTP stream spewed forth: > Hi, > > Is there any way to force every row in a table to be unique without > creating a unique key? Not exactly what you want, I suppose, but you could create a unique index on the oid column --- the only additional overhead would be that of the index. > My tables are fairly large (growing by a gig a day) and creating a primary > key would basically double my disk usage. Obviously, the system would have to have some way of knowing what other values exist -- you would have to have a unique index on some column. > Thanks for your help, > Steve Micallef gh
You could have one additional column that contains a hash of the entire row, so you'd only need a unique index on this hashcolumn. cheers, thalis On Wed, 13 Jun 2001, Steve Micallef wrote: > Hi, > > Is there any way to force every row in a table to be unique without > creating a unique key? > > My tables are fairly large (growing by a gig a day) and creating a primary > key would basically double my disk usage. > > Thanks for your help, > > Steve Micallef > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
On Thu, 14 Jun 2001, Steve Micallef wrote: > > You could have one additional column that contains a hash of the entire row, so you'd only need a unique index on thishash column. > > > > Thanks for your help. However I was thinking more along the lines of using > the table itself as the index. Creating a hash column with all the other > columns as part of it still significantly increases my table size. It's not that much of an overhead especially if your table has many attributes. There is a function hashname() that willreturn an integer after hashing the text it takes as argument. I didn't find any documentation, but seems to work. I tried the following and it was ok: thalis=# CREATE TABLE xoxo (name char(10),id int4,hashval int4); CREATE thalis=# CREATE FUNCTION mytrig_func() RETURNS OPAQUE AS 'BEGIN NEW.hashval=hashname(NEW.name||NEW.id); RETURN NEW; END;'LANGUAGE 'plpgsql'; CREATE thalis=# CREATE TRIGGER xoxo_trig_ins BEFORE INSERT ON xoxo FOR EACH ROW EXECUTE PROCEDURE mytrig_func(); CREATE lis=# insert into xoxo(name,id) values('thalis',13); INSERT 9347654 1 thalis=# select * from xoxo; name | id | hashval ------------+----+----------- thalis | 13 | 644124107 (1 row) Then you just add a unique index on hashval and you are done :-) cheers, thalis > > Thanks, > > Steve. > >
"Thalis A. Kalfigopoulos" <thalis@cs.pitt.edu> writes: > > Thanks for your help. However I was thinking more along the lines of using > > the table itself as the index. Creating a hash column with all the other > > columns as part of it still significantly increases my table size. > > It's not that much of an overhead especially if your table has many > attributes. There is a function hashname() that will return an > integer after hashing the text it takes as argument. I didn't find > any documentation, but seems to work. If it's not documented I'd worry about it going away in the future. > Then you just add a unique index on hashval and you are done :-) And if you get a hash collision? Going to disallow a perfectly legitimate and unique row because of it? -Doug -- The rain man gave me two cures; he said jump right in, The first was Texas medicine--the second was just railroad gin, And like a fool I mixed them, and it strangled up my mind, Now people just get uglier, and I got no sense of time... --Dylan
On 14 Jun 2001, Doug McNaught wrote: > "Thalis A. Kalfigopoulos" <thalis@cs.pitt.edu> writes: > > > > Thanks for your help. However I was thinking more along the lines of using > > > the table itself as the index. Creating a hash column with all the other > > > columns as part of it still significantly increases my table size. > > > > It's not that much of an overhead especially if your table has many > > attributes. There is a function hashname() that will return an > > integer after hashing the text it takes as argument. I didn't find > > any documentation, but seems to work. > > If it's not documented I'd worry about it going away in the future. > > > Then you just add a unique index on hashval and you are done :-) > > And if you get a hash collision? Going to disallow a perfectly > legitimate and unique row because of it? > > -Doug Right on both points (especially the second one). Is it my imagination or are there plans to add md5 hashing? Anyway, worst case scenario, one will have to write one's own md5 function and extract the hashval from there. If you geta collision there, then you should be very happy with your achievement ;-) cheers, thalis > -- > The rain man gave me two cures; he said jump right in, > The first was Texas medicine--the second was just railroad gin, > And like a fool I mixed them, and it strangled up my mind, > Now people just get uglier, and I got no sense of time... --Dylan >
On Thu, Jun 14, 2001 at 11:14:05AM -0400, Thalis A. Kalfigopoulos wrote: > [snip] > Is it my imagination or are there plans to add md5 hashing? See contrib/pgcrypto. I've been using it for a while and I haven't had any problems. Cheers, Neil