Re: duplicates - Mailing list pgsql-sql

From Bryan White
Subject Re: duplicates
Date
Msg-id 008701bfa09a$4c09ed70$2dd260d1@bryan
Whole thread Raw
In response to duplicates  (Allan Kelly <allan.kelly@buildstore.co.uk>)
List pgsql-sql
> I have a system bug which means that I have occasional dupicate entries in
> my 'subscribers' table. I'm trying to find a query which performs
something
> like a 'where user_name is not unique' query.
>
> At the moment we use this cludge:
>
> select count(*), user_name from subscribers
> group by user_name order by count;
>
> (I'm surprised I can't add 'where count > 1' but it seems conditionals on
> aggregate fields are not allowed).
>
> This gives me a very long list with the 'not unique' entries at the
bottom, eg
>
> count | user_name
> ------+------------------
> 1     | bill.hicks
>   [ ..cut 9 zillion results.. ]
> 1     | margaret.thatcher
> 4     | linus.torvalds
> 9     | bill.gates
>
> I then have to do
>
> select oid from subscribers where user_name = 'linus.torvalds';
>
> and delete all but one of the records. Is there a better way to do this?
> And yes, we're working on a system fix to avoid the problem in the 1st
place!

I would create a unique index on the user_name field (it sounds like a field
you want indexed anyway).
This way the insert will fail when it is a duplicate.  Of course you may
have to alter the code that does the insert to ckeck for failure and handle
it.  Also you will have to delete your duplicates before creating the unique
index.



pgsql-sql by date:

Previous
From: "Gerhard Dieringer"
Date:
Subject: Antw: duplicates
Next
From: Tom Lane
Date:
Subject: Re: Maxima per row