Thread: Unique rows without a key

Unique rows without a key

From
Steve Micallef
Date:
Hi,

Is there any way to force every row in a table to be unique without
creating a unique key?

My tables are fairly large (growing by a gig a day) and creating a primary
key would basically double my disk usage.

Thanks for your help,

Steve Micallef


Re: Unique rows without a key

From
GH
Date:
On Wed, Jun 13, 2001 at 09:00:33AM +1000, some SMTP stream spewed forth:
> Hi,
>
> Is there any way to force every row in a table to be unique without
> creating a unique key?

Not exactly what you want, I suppose, but you could create a unique index
on the oid column --- the only additional overhead would be that of the
index.

> My tables are fairly large (growing by a gig a day) and creating a primary
> key would basically double my disk usage.

Obviously, the system would have to have some way of knowing what other
values exist -- you would have to have a unique index on some column.

> Thanks for your help,
> Steve Micallef

gh


Re: Unique rows without a key

From
"Thalis A. Kalfigopoulos"
Date:
You could have one additional column that contains a hash of the entire row, so you'd only need a unique index on this
hashcolumn. 

cheers,
thalis


On Wed, 13 Jun 2001, Steve Micallef wrote:

> Hi,
>
> Is there any way to force every row in a table to be unique without
> creating a unique key?
>
> My tables are fairly large (growing by a gig a day) and creating a primary
> key would basically double my disk usage.
>
> Thanks for your help,
>
> Steve Micallef
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: you can get off all lists at once with the unregister command
>     (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>


Re: Unique rows without a key

From
"Thalis A. Kalfigopoulos"
Date:
On Thu, 14 Jun 2001, Steve Micallef wrote:

> > You could have one additional column that contains a hash of the entire row, so you'd only need a unique index on
thishash column. 
> >
>
> Thanks for your help. However I was thinking more along the lines of using
> the table itself as the index. Creating a hash column with all the other
> columns as part of it still significantly increases my table size.

It's not that much of an overhead especially if your table has many attributes. There is a function hashname() that
willreturn an integer after hashing the text it takes as argument. I didn't find any documentation, but seems to work. 
I tried the following and it was ok:

thalis=# CREATE TABLE xoxo (name char(10),id int4,hashval int4);
CREATE
thalis=# CREATE FUNCTION mytrig_func() RETURNS OPAQUE AS 'BEGIN NEW.hashval=hashname(NEW.name||NEW.id); RETURN NEW;
END;'LANGUAGE 'plpgsql'; 
CREATE
thalis=# CREATE TRIGGER xoxo_trig_ins BEFORE INSERT ON xoxo FOR EACH ROW EXECUTE PROCEDURE mytrig_func();
CREATE
lis=# insert into xoxo(name,id) values('thalis',13);
INSERT 9347654 1
thalis=# select * from xoxo;
    name    | id |  hashval
------------+----+-----------
 thalis     | 13 | 644124107
(1 row)

Then you just add a unique index on hashval and you are done :-)

cheers,
thalis


>
> Thanks,
>
> Steve.
>
>


Re: Unique rows without a key

From
Doug McNaught
Date:
"Thalis A. Kalfigopoulos" <thalis@cs.pitt.edu> writes:

> > Thanks for your help. However I was thinking more along the lines of using
> > the table itself as the index. Creating a hash column with all the other
> > columns as part of it still significantly increases my table size.
>
> It's not that much of an overhead especially if your table has many
> attributes. There is a function hashname() that will return an
> integer after hashing the text it takes as argument. I didn't find
> any documentation, but seems to work.

If it's not documented I'd worry about it going away in the future.

> Then you just add a unique index on hashval and you are done :-)

And if you get a hash collision?  Going to disallow a perfectly
legitimate and unique row because of it?

-Doug
--
The rain man gave me two cures; he said jump right in,
The first was Texas medicine--the second was just railroad gin,
And like a fool I mixed them, and it strangled up my mind,
Now people just get uglier, and I got no sense of time...          --Dylan

Re: Unique rows without a key

From
"Thalis A. Kalfigopoulos"
Date:
On 14 Jun 2001, Doug McNaught wrote:

> "Thalis A. Kalfigopoulos" <thalis@cs.pitt.edu> writes:
>
> > > Thanks for your help. However I was thinking more along the lines of using
> > > the table itself as the index. Creating a hash column with all the other
> > > columns as part of it still significantly increases my table size.
> >
> > It's not that much of an overhead especially if your table has many
> > attributes. There is a function hashname() that will return an
> > integer after hashing the text it takes as argument. I didn't find
> > any documentation, but seems to work.
>
> If it's not documented I'd worry about it going away in the future.
>
> > Then you just add a unique index on hashval and you are done :-)
>
> And if you get a hash collision?  Going to disallow a perfectly
> legitimate and unique row because of it?
>
> -Doug

Right on both points (especially the second one). Is it my imagination or are there plans to add md5 hashing?
Anyway, worst case scenario, one will have to write one's own md5 function and extract the hashval from there. If you
geta collision there, then you should be very happy with your achievement ;-) 

cheers,
thalis


> --
> The rain man gave me two cures; he said jump right in,
> The first was Texas medicine--the second was just railroad gin,
> And like a fool I mixed them, and it strangled up my mind,
> Now people just get uglier, and I got no sense of time...          --Dylan
>


Re: Unique rows without a key

From
Neil Conway
Date:
On Thu, Jun 14, 2001 at 11:14:05AM -0400, Thalis A. Kalfigopoulos wrote:
> [snip]
> Is it my imagination or are there plans to add md5 hashing?

See contrib/pgcrypto. I've been using it for a while and I haven't had
any problems.

Cheers,

Neil