Thread: Access a window's frame_end row from a window function

Access a window's frame_end row from a window function

From
Clodoaldo Neto
Date:
Suppose there is the need to get the average of a value v over a 6 hours
time window starting 7 hours before the current row's time.

    create table t (ts timestamp, v integer);
    insert into t (ts, v) values
    ('2013-01-01 00:46', 2),
    ('2013-01-01 03:54', 4),
    ('2013-01-01 06:28', 4),
    ('2013-01-01 11:19', 2),
    ('2013-01-01 14:44', 1),
    ('2013-01-01 15:56', 5),
    ('2013-01-01 18:01', 4),
    ('2013-01-01 19:40', 0),
    ('2013-01-01 20:38', 5),
    ('2013-01-01 21:22', 0);

I can do it with a correlated subquery:

    select ts, v,
        (
            select avg(v)
            from t s
            where ts between
                t.ts - interval '7 hours'
                and t.ts - interval '1 hour'
        ) average
    from t
    order by ts
    ;
             ts          | v |      average
    ---------------------+---+--------------------
     2013-01-01 00:46:00 | 2 |
     2013-01-01 03:54:00 | 4 | 2.0000000000000000
     2013-01-01 06:28:00 | 4 | 3.0000000000000000
     2013-01-01 11:19:00 | 2 | 4.0000000000000000
     2013-01-01 14:44:00 | 1 | 2.0000000000000000
     2013-01-01 15:56:00 | 5 | 1.5000000000000000
     2013-01-01 18:01:00 | 4 | 2.6666666666666667
     2013-01-01 19:40:00 | 0 | 3.3333333333333333
     2013-01-01 20:38:00 | 5 | 3.3333333333333333
     2013-01-01 21:22:00 | 0 | 2.5000000000000000

But if I could access a window's frame_end row as a record from a window
function:

    select ts,
        avg(case when ts between
                frame_end.ts - interval '7 hours'
                and frame_end.ts - interval '1 hour'
            then v else null end
        ) over(order by ts)
    from t
    order by ts

I'm naively posting this as I have no idea how complex would it be to add
this feature. Would it perform better than the correlated subquery?

Regards, Clodoaldo

Re: Access a window's frame_end row from a window function

From
Merlin Moncure
Date:
On Fri, Feb 22, 2013 at 8:26 AM, Clodoaldo Neto
<clodoaldo.pinto.neto@gmail.com> wrote:
> Suppose there is the need to get the average of a value v over a 6 hours
> time window starting 7 hours before the current row's time.
>
>     create table t (ts timestamp, v integer);
>     insert into t (ts, v) values
>     ('2013-01-01 00:46', 2),
>     ('2013-01-01 03:54', 4),
>     ('2013-01-01 06:28', 4),
>     ('2013-01-01 11:19', 2),
>     ('2013-01-01 14:44', 1),
>     ('2013-01-01 15:56', 5),
>     ('2013-01-01 18:01', 4),
>     ('2013-01-01 19:40', 0),
>     ('2013-01-01 20:38', 5),
>     ('2013-01-01 21:22', 0);
>
> I can do it with a correlated subquery:
>
>     select ts, v,
>         (
>             select avg(v)
>             from t s
>             where ts between
>                 t.ts - interval '7 hours'
>                 and t.ts - interval '1 hour'
>         ) average
>     from t
>     order by ts
>     ;
>              ts          | v |      average
>     ---------------------+---+--------------------
>      2013-01-01 00:46:00 | 2 |
>      2013-01-01 03:54:00 | 4 | 2.0000000000000000
>      2013-01-01 06:28:00 | 4 | 3.0000000000000000
>      2013-01-01 11:19:00 | 2 | 4.0000000000000000
>      2013-01-01 14:44:00 | 1 | 2.0000000000000000
>      2013-01-01 15:56:00 | 5 | 1.5000000000000000
>      2013-01-01 18:01:00 | 4 | 2.6666666666666667
>      2013-01-01 19:40:00 | 0 | 3.3333333333333333
>      2013-01-01 20:38:00 | 5 | 3.3333333333333333
>      2013-01-01 21:22:00 | 0 | 2.5000000000000000
>
> But if I could access a window's frame_end row as a record from a window
> function:
>
>     select ts,
>         avg(case when ts between
>                 frame_end.ts - interval '7 hours'
>                 and frame_end.ts - interval '1 hour'
>             then v else null end
>         ) over(order by ts)
>     from t
>     order by ts
>
> I'm naively posting this as I have no idea how complex would it be to add
> this feature. Would it perform better than the correlated subquery?


Well, correlated subquery is about the bottom of the barrel in
performance terms, so anything would be an improvement.

merlin

Re: Access a window's frame_end row from a window function

From
Tom Lane
Date:
Clodoaldo Neto <clodoaldo.pinto.neto@gmail.com> writes:
> Suppose there is the need to get the average of a value v over a 6 hours
> time window starting 7 hours before the current row's time.
> ...
> But if I could access a window's frame_end row as a record from a window
> function:

>     select ts,
>         avg(case when ts between
>                 frame_end.ts - interval '7 hours'
>                 and frame_end.ts - interval '1 hour'
>             then v else null end
>         ) over(order by ts)
>     from t
>     order by ts

> I'm naively posting this as I have no idea how complex would it be to add
> this feature. Would it perform better than the correlated subquery?

Doubt it.  As stated, it'd likely perform worse, since it's not obvious
in this construction that rows outside the desired time window need not
be scanned to compute the avg().  But even if you rearranged the SQL to
avoid that pitfall, I'm not sure how the implementation could look
noticeably different from a correlated subselect.  It'd still end up
scanning all the desired rows for each row of the outer query.

            regards, tom lane