Re: Denormalizing during select - Mailing list pgsql-sql

From Greg Stark
Subject Re: Denormalizing during select
Date
Msg-id 87d6lbkm6y.fsf@stark.dyndns.tv
Whole thread Raw
In response to Re: Denormalizing during select  (Rod Taylor <rbt@rbt.ca>)
List pgsql-sql
Rod Taylor <rbt@rbt.ca> writes:

> I've been trying to figure out how to give a running total (similar
> issue I think).

Running totals are a "hard problem". They certainly cannot be solved using
aggregates. They're similar to the ranking problem of assigning a sequential
number to each item within each group.

The problem is they share certain properties of aggregate functions, namely
that they require persistent state storage and a state transition function.
But they definitely aren't aggregate functions in that they return a value for
every record, not one for the whole group.

I'm not even clear how to write an embedded (plpgsql or perl or python)
function, since I'm not clear how to allocate space for the state that will be
available for each call on each record but independent from other calls in the
same query. You have to be able to handle two running totals at the same time.

Note that running totals are not very sql-ish. Since sql deals in unordered
sets a running total is pretty ill-defined. It would have to be calculated
after the sort operation or else require you to sort the input tables in a
subquery or something. 

To write a proper well-defined sql-ish query for running totals you would have
to do the very inefficient:

select employee_id, salary,       (select count(*)    from employees x where x.salary < employee.salary) as
salary_rank,     (select sum(salary) from employees x where x.salary < employee.salary) as running_total from
employeesorderby salary desc
 

Finding a way to transform that into the single-scan plan that's obviously the
right way to execute it would be really cool but seems pretty far-fetched. I
don't think any database is capable of it today.

-- 
greg



pgsql-sql by date:

Previous
From: Tom Lane
Date:
Subject: Re: Denormalizing during select
Next
From: "Zodiac"
Date:
Subject: PL/pgSQL