Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace - Mailing list pgsql-hackers

From Mike Mascari
Subject Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace
Date
Msg-id 388D0F84.FF844DF@mascari.com
Whole thread Raw
In response to DISTINCT ON: speak now or forever hold your peace  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] DISTINCT ON: speak now or forever hold your peace  (Bruce Momjian <pgman@candle.pha.pa.us>)
List pgsql-hackers
Tom Lane wrote:
> 
> If I don't hear loud hollers very soon, I'm going to eliminate the
> DISTINCT ON "feature" for 7.0.  As previously discussed, this feature
> is not standard SQL and has no clear semantic interpretation.
> 

I grepped our source code and found this query:

INSERT INTO temp_sales
SELECT DISTINCT on key supplysources.supplysource, 
incharges.supply, targets.target, incharges.saledate, 
incharges.supplyunit, '', incharges.quantity, incharges.company, 
incharges.costcntr, 'Replenish', incharges.price, '','','', 0, 
text(supplysources.supplysource)||
text(incharges.supply)||
text(targets.target)||
text(incharges.saledate) as key 
FROM supplysources, incharges, targets WHERE 
supplysources.warehouse = incharges.warehouse AND 
(targets.site,targets.area) = (incharges.site,incharges.area);

What happens is that a large charges file which is transferred to
a mainframe ERP application is first brought into PostgreSQL.
Depending upon certain race conditions, duplicate "sales" records
can appear in the data file. We use DISTINCT ON to pick (as you
point out) an arbitrary record when duplicates appear. I suppose
we could do a DELETE ... WHERE NOT EXISTS after the import. Using
DISTINCT ON just saves a step. I don't have any arguments beyond
the grounds that we're using it in existing code as a duplicate
record filter - :-(

Just FYI, 

Mike Mascari


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] Happy column dropping
Next
From: Michael Robinson
Date:
Subject: Re: [HACKERS] fatal copy in/out error (6.5.3)