Re: Median - Mailing list pgsql-sql

From Tom Lane
Subject Re: Median
Date
Msg-id 19147.963956583@sss.pgh.pa.us
Whole thread Raw
In response to Re: Median  (JanWieck@t-online.de (Jan Wieck))
List pgsql-sql
JanWieck@t-online.de (Jan Wieck) writes:
>     I don't see any quick solution how to solve this problem with
>     an aggregate.  Aggregates get all selected values in unsorted
>     order, and don't know ahead how many  items  there  will  be.
>     Even if, all this wouldn't be of any use, because you need to
>     look at the entire sorted list of selected items.

Hmm.  It would be a pretty straightforward extension of the existing
support for DISTINCT aggregates to allow the agg function to receive
all the inputs in sorted order along with a count of how many there
are, whereupon a percentile aggregate would be trivial.

Slow, but trivial.

Probably a better way would be to skip evaluating the agg's transition
function as such, and instead call the agg's final function just once
with a pointer to the tuplesort object that contains the sorted input
data.  Then you reach in and pull out just the items you want, instead
of having to read 'em all.  Or you can scan 'em if you want.  We
might need to add a few features to the tuplesort API to allow access
to the N'th item in the sorted data, but it's surely doable.

Anyone care to work up a detailed proposal for something along this
line?  It seems like more work than it's worth to me, but if someone
else wants to do the legwork...
        regards, tom lane


pgsql-sql by date:

Previous
From: JanWieck@t-online.de (Jan Wieck)
Date:
Subject: Re: Median
Next
From: Peter Eisentraut
Date:
Subject: Re: Database authentication and configuration