Re: median for postgresql 8.3 - Mailing list pgsql-general

From maarten
Subject Re: median for postgresql 8.3
Date
Msg-id 1289980506.4363.107.camel@localhost
Whole thread Raw
In response to Re: median for postgresql 8.3  ("Brent Wood" <b.wood@niwa.co.nz>)
List pgsql-general
Hello Brent,

well I said "using only the db" because this is a one time need and just
wanted to avoid crufting around with calc, and doing 'manual' work.

But that seems to be the fastest approach now.

When it becomes a frequent need I'll probably end up doing what you
suggest, or upgrade to 9.x, for which I've found simple examples due to
dynamic limit and offset.

thanks and regards,
Maarten


On Wed, 2010-11-17 at 09:53 +1300, Brent Wood wrote:
> Hi Maarten,
>
> The best way I know of to do this is not to do statistical queries
> "in" the DB at all, but use a stats capability embedded in your
> database, so they still appear to the user to be done in the db. I
> don't see how you can easily get the functionality you want without
> user defined functions or addons, While PL/R is a "special addon", and
> you created a custom median function to do this, there are very good
> instructions to follow to do this.
>
>  I think it might be worth your while if you are looking to retrieve
> stats from SQL queries.
>
> See PL/R, and the median how-to at:
> http://www.bostongis.com/PrinterFriendly.aspx?content_name=postgresql_plr_tut01
>
> HTH,
>
>   Brent Wood
>
>
>
> Brent Wood
> DBA/GIS consultant
> NIWA, Wellington
> New Zealand
>
> >>> maarten <maarten.foque@edchq.com> 11/17/10 9:15 AM >>>
> Hello everyone,
>
> I was doing some analysis of data to find average delays between some
> timestamp values etc...
> When the number of rows the average is computed over is small, this
> can
> give distorted values. So I've obviously added a count column to see
> if
> the average represents much data.
> However, I would also like to add the median value to give me a pretty
> good idea of whats happening even for smaller counts.
>
> I couldn't find such an aggregate function in the manual (version 8.3)
> and some websearching didn't uncover it either.
>
> I was thinking about
> SELECT max(id) FROM test ORDER BY id ASC LIMIT
>     (SELECT count(*)/2 FROM test)
>
> But two things are wrong with that:
> Limit can't use subqueries :(
> And ORDER BY gives me the error: 'must be used in aggregate function
> etc...) but I can probably work around this by using an ordered
> subquery
> in stead of the table directly.
>
> Furthermore, I need the median for a timestamp column, which would
> probably complicate things more than when it is a number column.
>
> I'd like to be able to do this using only the database. (So no
> programming functions, special addons etc...)
>
> Any ideas anyone?
>
> regards,
> Maarten
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>
> Please consider the environment before printing this email.
>
> NIWA is the trading name of the National Institute of Water &
> Atmospheric Research Ltd.


pgsql-general by date:

Previous
From: Willy-Bas Loos
Date:
Subject: Re: where is pg_stat_activity (and others) in the documentation?
Next
From: Magnus Hagander
Date:
Subject: Re: Postgres forums ... take 2