Re: Tricky math in PSQL - Mailing list pgsql-novice
From | Merlin Moncure |
---|---|
Subject | Re: Tricky math in PSQL |
Date | |
Msg-id | CAHyXU0xDBQfROpb349C128PQ-kf=fMAgMSrndeQNmCVnOmvJmQ@mail.gmail.com Whole thread Raw |
In response to | Re: Tricky math in PSQL (Kip Warner <kip@thevertigo.com>) |
Responses |
Re: Tricky math in PSQL
|
List | pgsql-novice |
On Mon, Apr 25, 2016 at 8:46 PM, Kip Warner <kip@thevertigo.com> wrote: > On Fri, 2016-04-22 at 14:37 -0500, Kevin Grittner wrote: >> Please include such information in the post, either directly or as >> an attachment. Messages on these lists go back over 18 years at >> this point, and while you did flag that paste as "never delete", >> who's to say whether pastebin will be around to keep the promise >> that the URL will be valid 18 years from now? > > Hey Kevin, > > Noted. In general you are right. Because these queries are a bit long > and sometimes wrap in ugly ways in folks' MUAs making them harder to > read, you'll have to forgive me for this one. No, we don't :-). Anyways, the relevant portion is here. -- I would like to know the most efficient way of calculating the following, -- preferably within the DBMS, to avoid having to shift intermediate -- calculations back and forth between the client application. It is easy to -- write and understand mathematically, but translating into PSQL is tricky... --total_diff(id_a, id_b) = -- (statistic_constant * statistic_diff(id_a, id_b)) + -- (statistic_child_constant * -- sigma( -- sort_ordinal=1, -- sort_ordinal= -- min(COUNT(statistics_child.id=id_a), COUNT(statistics_child.id=id_b)), -- statistic_child_diff(id_a, id_b, sort_ordinal)) / constantG)) -- ...the sigma sums all statistic_child_diff() for the two id's to compare -- for sort_ordinal 1,2,3,...to N where N is the lesser of the number of rows in -- either statistics_child with id_a or id_b. I'm not going to work out the whole problem for you, but you probably want to start looking in the direction custom aggregate function. Most people are not aware of this but custom aggregates can be applied over a window. The state can be a composite type and thus hold arbitrarily complex information that you can maintain as you walk through the set (this works great for time series and things like that). Barring that, if you are still looking for solutions to this type of problem in the database, I'd point you in the direction of pl/r. The learning curve is somewhat steep but rewards mastery. CREATE AGGREGATE RateChange(NUMERIC, NUMERIC, INT) ( SFUNC = AggRateChange, STYPE = NUMERIC ); CREATE OR REPLACE FUNCTION AggRateChange( State NUMERIC, Seed NUMERIC, RateChange NUMERIC, NDigits INT) RETURNS NUMERIC AS $$ SELECT ROUND(COALESCE($1, $2) * (1.0 + $3), NDigits); $$ LANGUAGE SQL IMMUTABLE; select ratechange(100, (random() * 0.05)::numeric, 3) over(order by s) from generate_series(1,10) s; ratechange ──────────── 101.524 104.570 104.674 106.599 111.685 114.583 120.174 125.561 130.291 merlin
pgsql-novice by date: