Home > mailing lists

Re: Tricky math in PSQL - Mailing list pgsql-novice

From	Merlin Moncure
Subject	Re: Tricky math in PSQL
Date	April 26, 2016 14:13:48
Msg-id	CAHyXU0xDBQfROpb349C128PQ-kf=fMAgMSrndeQNmCVnOmvJmQ@mail.gmail.com Whole thread
In response to	Re: Tricky math in PSQL (Kip Warner <kip@thevertigo.com>)
Responses	Re: Tricky math in PSQL
List	pgsql-novice

Tree view

On Mon, Apr 25, 2016 at 8:46 PM, Kip Warner <kip@thevertigo.com> wrote:
> On Fri, 2016-04-22 at 14:37 -0500, Kevin Grittner wrote:
>> Please include such information in the post, either directly or as
>> an attachment.  Messages on these lists go back over 18 years at
>> this point, and while you did flag that paste as "never delete",
>> who's to say whether pastebin will be around to keep the promise
>> that the URL will be valid 18 years from now?
>
> Hey Kevin,
>
> Noted. In general you are right. Because these queries are a bit long
> and sometimes wrap in ugly ways in folks' MUAs making them harder to
> read, you'll have to forgive me for this one.

No, we don't :-).    Anyways, the relevant portion is here.

-- I would like to know the most efficient way of calculating the following,
--  preferably within the DBMS, to avoid having to shift intermediate
--  calculations back and forth between the client application. It is easy to
--  write and understand mathematically, but translating into PSQL is tricky...

--total_diff(id_a, id_b) =
--    (statistic_constant * statistic_diff(id_a, id_b)) +
--    (statistic_child_constant *
--        sigma(
--            sort_ordinal=1,
--            sort_ordinal=
--                min(COUNT(statistics_child.id=id_a),
COUNT(statistics_child.id=id_b)),
--            statistic_child_diff(id_a, id_b, sort_ordinal)) / constantG))

-- ...the sigma sums all statistic_child_diff() for the two id's to compare
-- for sort_ordinal 1,2,3,...to N where N is the lesser of the number of rows in
-- either statistics_child with id_a or id_b.

I'm not going to work out the whole problem for you, but you probably
want to start looking in the direction custom aggregate function.
Most people are not aware of this but custom aggregates can be applied
over a window.  The state can be a composite type and thus hold
arbitrarily complex information that you can maintain as you walk
through the set (this works great for time series and things like
that).

Barring that, if you are still looking for solutions to this type of
problem in the database, I'd point you in the direction of pl/r.  The
learning curve is somewhat steep but rewards mastery.

CREATE AGGREGATE RateChange(NUMERIC, NUMERIC, INT)
(
  SFUNC = AggRateChange,
  STYPE = NUMERIC
);

CREATE OR REPLACE FUNCTION AggRateChange(
  State NUMERIC,
  Seed NUMERIC,
  RateChange NUMERIC,
  NDigits INT) RETURNS NUMERIC AS
$$
  SELECT ROUND(COALESCE($1, $2) * (1.0 + $3), NDigits);
$$ LANGUAGE SQL IMMUTABLE;

select ratechange(100, (random() * 0.05)::numeric, 3) over(order by s)
from generate_series(1,10) s;
 ratechange
────────────
    101.524
    104.570
    104.674
    106.599
    111.685
    114.583
    120.174
    125.561
    130.291


merlin

pgsql-novice by date:

From: Kip Warner
Date: 26 April 2016, 01:46:25
Subject: Re: Tricky math in PSQL

From: Kip Warner
Date: 27 April 2016, 07:09:51
Subject: Re: Tricky math in PSQL

Re: Tricky math in PSQL - Mailing list pgsql-novice

Previous

Next