Re: A DISTINCT problem removing duplicates - Mailing list pgsql-sql

From Richard Huxton
Subject Re: A DISTINCT problem removing duplicates
Date
Msg-id 493E8910.7010007@archonet.com
In response to Re: A DISTINCT problem removing duplicates  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: A DISTINCT problem removing duplicates  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Tom Lane wrote:
> Richard Huxton <dev@archonet.com> writes:
>> Anyone got anything more elegant?
> 
> Seems to me that no document should have an empty dup_set.  If it's not
> a match to any existing document, then immediately assign a new dup_set
> number to it.

That was my initial thought too, but it means that when I actually find a
duplicate I have to decide which "direction" to renumber the two sets in. It
probably also means keeping a summary table with counts to show which
documents are duplicates, since the duplicates table is now the same size as
the documents table.
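
To make the trade-off concrete, here is a minimal sketch of the "every
document gets a dup_set" approach. It rests on several assumptions: the
duplicates table is taken to have columns doc_id and dup_set (hypothetical
names), the renumbering "direction" is arbitrarily fixed as "toward the lower
dup_set number", and it runs against an in-memory SQLite database through
Python rather than PostgreSQL. An aggregate query stands in for the summary
table of counts.

```python
import sqlite3


def merge_dup_sets(conn, doc_a, doc_b):
    """Record doc_a and doc_b as duplicates of each other.

    Assumed rule (not from the thread): the higher-numbered dup_set is
    renumbered into the lower-numbered one.
    """
    cur = conn.cursor()
    (set_a,) = cur.execute(
        "SELECT dup_set FROM duplicates WHERE doc_id = ?", (doc_a,)
    ).fetchone()
    (set_b,) = cur.execute(
        "SELECT dup_set FROM duplicates WHERE doc_id = ?", (doc_b,)
    ).fetchone()
    keep, drop = min(set_a, set_b), max(set_a, set_b)
    if keep != drop:
        # Move every member of the dropped set, not just the one document.
        cur.execute(
            "UPDATE duplicates SET dup_set = ? WHERE dup_set = ?", (keep, drop)
        )
    return keep


def duplicate_sets(conn):
    """Return (dup_set, member_count) for sets with more than one member."""
    return conn.execute(
        "SELECT dup_set, count(*) FROM duplicates "
        "GROUP BY dup_set HAVING count(*) > 1 ORDER BY dup_set"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE duplicates (doc_id INTEGER PRIMARY KEY, dup_set INTEGER NOT NULL)"
)
# Every document starts out alone in its own dup_set.
conn.executemany("INSERT INTO duplicates VALUES (?, ?)", [(i, i) for i in range(1, 6)])

merge_dup_sets(conn, 2, 4)  # document 4 joins set 2
merge_dup_sets(conn, 4, 5)  # set 5 is folded into set 2 as well
print(duplicate_sets(conn))  # [(2, 3)]
```

The HAVING count(*) > 1 filter is what separates real duplicates from the
singleton sets that every other document now occupies. On a large table, that
scan is the cost a maintained summary table would avoid.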


--  Richard Huxton Archonet Ltd

