Thread: Window function sort order help

Window function sort order help

From
Dianna Harter
Date:
<div class="WordSection1"><p class="MsoNormal">Having trouble getting a window function to sort correctly. <p
class="MsoNormal"> <pclass="MsoNormal">Given this data<p class="MsoNormal"> <p class="MsoNormal"> consumer_id |
move_date |     history_timestamp<p class="MsoNormal">-------------+------------+----------------------------<p
class="MsoNormal">      12345| 2008-01-05 | 2007-12-11 06:02:26.842171<p class="MsoNormal">       12345| 2008-02-29 |
2008-02-0507:22:38.04067<p class="MsoNormal">       12345| 2008-02-29 | 2008-07-11 09:03:42.44044<p
class="MsoNormal">      23456| 2009-01-01 | 2008-11-12 07:33:32.656658 <--<p class="MsoNormal">       23456|
2009-01-28| 2008-11-14 01:57:40.264335<p class="MsoNormal">       23456| 2009-01-01 | 2008-12-04 17:14:20.279999
<--<pclass="MsoNormal">       23456| 2009-01-01 | 2008-12-31 00:33:37.204968<p class="MsoNormal">       23456|
2009-01-01| 2011-06-08 04:16:41.646521<p class="MsoNormal">       34567| 2010-05-07 | 2010-06-08 05:14:43.842172<p
class="MsoNormal"> <pclass="MsoNormal">I’m trying to get the timestamp when the consumer last changed their move_date. 
(Noticeconsumer_id 23456 set their move_date to 2009-01-01 then changed and then changed it back.  In the end, I want
thetimestamp from when they changed it to 2009-01-01 the second time.)<p class="MsoNormal"> <p class="MsoNormal">My
thoughtwas to do an intermediary step to find the timestamp for each time it switched.  From there I can grab the
max(timestamp)for each consumer.<p class="MsoNormal"> <p class="MsoNormal">Here’s my query.  <p
class="MsoNormal">selectch.consumer_id, ch.move_date,ch.history_timestamp, min(ch.history_timestamp) over w as
start_time<pclass="MsoNormal">       from consumer_hist ch<p class="MsoNormal">window w as (partition by
ch.consumer_id,ch.move_date order by ch.consumer_id,ch.history_timestamp asc)<p class="MsoNormal">       order by
ch.consumer_id, ch.history_timestamp asc<p class="MsoNormal"> <p class="MsoNormal"> <p class="MsoNormal">I expect the
followingresult<p class="MsoNormal">consumer_id | move_date  |     history_timestamp      |         start_time<p
class="MsoNormal">-------------+------------+----------------------------+----------------------------<p
class="MsoNormal">      12345 | 2008-01-05 | 2007-12-11 06:02:26.842171 | 2007-12-11 06:02:26.842171<p
class="MsoNormal">      12345 | 2008-02-29 | 2008-02-05 07:22:38.04067  | 2008-02-05 07:22:38.04067<p
class="MsoNormal">      12345 | 2008-02-29 | 2008-07-11 09:03:42.44044  | 2008-02-05 07:22:38.04067<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-11-12 07:33:32.656658 | 2008-11-12 07:33:32.656658 <--<p
class="MsoNormal">      23456 | 2009-01-28 | 2008-11-14 01:57:40.264335 | 2008-11-14 01:57:40.264335<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-12-04 17:14:20.279999 | 2008-12-04 17:14:20.279999 <--<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-12-31 00:33:37.204968 | 2008-12-04 17:14:20.279999<p
class="MsoNormal">      23456 | 2009-01-01 | 2011-06-08 04:16:41.646521 | 2008-12-04 17:14:20.279999<p
class="MsoNormal">      34567 | 2010-05-07 | 2010-06-08 05:14:43.842172 | 2010-06-08 05:14:43.842172<p
class="MsoNormal"> <pclass="MsoNormal">But the windowing function appears to be doing the partitioning first then the
orderby. <p class="MsoNormal"> <p class="MsoNormal">So instead, I’m getting<p class="MsoNormal"> <p
class="MsoNormal">consumer_id| move_date  |     history_timestamp      |         start_time<p
class="MsoNormal">-------------+------------+----------------------------+----------------------------<p
class="MsoNormal">      12345 | 2008-01-05 | 2007-12-11 06:02:26.842171 | 2007-12-11 06:02:26.842171<p
class="MsoNormal">      12345 | 2008-02-29 | 2008-02-05 07:22:38.04067  | 2008-02-05 07:22:38.04067<p
class="MsoNormal">      12345 | 2008-02-29 | 2008-07-11 09:03:42.44044  | 2008-02-05 07:22:38.04067<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-11-12 07:33:32.656658 | 2008-11-12 07:33:32.656658 <--<p
class="MsoNormal">      23456 | 2009-01-28 | 2008-11-14 01:57:40.264335 | 2008-11-14 01:57:40.264335<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-12-04 17:14:20.279999 | 2008-11-12 07:33:32.656658 <--<p
class="MsoNormal">      23456 | 2009-01-01 | 2008-12-31 00:33:37.204968 | 2008-11-12 07:33:32.656658<p
class="MsoNormal">      23456 | 2009-01-01 | 2011-06-08 04:16:41.646521 | 2008-11-12 07:33:32.656658<p
class="MsoNormal">      34567 | 2010-05-07 | 2010-06-08 05:14:43.842172 | 2010-06-08 05:14:43.842172<p
class="MsoNormal"> <pclass="MsoNormal">Any suggestions to get the order by to occur first then the partition by or
maybethere another approach that I could use?<p class="MsoNormal"> <p class="MsoNormal">Dianna</div> 

Re: Window function sort order help

From
Nicoletta Maia
Date:
2011/9/13 Dianna Harter <dharter@mynewplace.com>:

Hi,

> [snip]
> Any suggestions to get the order by to occur first then the partition by or
> maybe there another approach that I could use?

I tried to write the query without using the window:

SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` ,       MIN(  `Y`.`history_timestamp` ) AS
`start_time`
FROM      `Table` AS `X`              JOIN      `Table` AS `Y`              ON  `X`.`consumer_id` = `Y`.`consumer_id`
                  AND `X`.`move_date` =  `Y`.`move_date`                      AND  `X`.`history_timestamp` >=
`Y`.`history_timestamp`             LEFT JOIN      `Table` AS `Z`              ON  `X`.`consumer_id` =
`Z`.`consumer_id`             AND `X`.`move_date` <>  `Z`.`move_date`              AND  `X`.`history_timestamp` >=
`Z`.`history_timestamp`             AND  `Y`.`history_timestamp` <=  `Z`.`history_timestamp`
 
WHERE  `Z`.`consumer_id` IS NULL
GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC

To avoid the LEFT JOIN, you can move the control in the sub-query:

SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` ,      MIN(  `Y`.`history_timestamp` ) AS
`start_time`
FROM      `Table` AS `X`              JOIN      `Table` AS `Y`              ON  `X`.`consumer_id` = `Y`.`consumer_id`
                  AND `X`.`move_date` =  `Y`.`move_date`                      AND  `X`.`history_timestamp` >=
`Y`.`history_timestamp`
WHERE  NOT EXISTS (      SELECT *      FROM `Table` AS `Z`      WHERE `X`.`consumer_id` = `Z`.`consumer_id`
AND `X`.`move_date` <>  `Z`.`move_date`              AND  `X`.`history_timestamp` >=  `Z`.`history_timestamp`
  AND  `Y`.`history_timestamp` <=  `Z`.`history_timestamp`      )
 
GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC

With Y I select history_timestamp preceding the current row with the
same move_date.
With Z I verify that no changes have occurred to move_date between
X.history_timestamp and Y.history_timestamp.

Sorry, I responded with the wrong email address..

Nicoletta


Re: Window function sort order help

From
Thomas Kellerer
Date:
Nicoletta Maia, 14.09.2011 10:30:
> SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` ,
>         MIN(  `Y`.`history_timestamp` ) AS `start_time`
> FROM
>         `Table` AS `X`
>                 JOIN
>         `Table` AS `Y`
>                 ON  `X`.`consumer_id` = `Y`.`consumer_id`
>                         AND `X`.`move_date` =  `Y`.`move_date`
>                         AND  `X`.`history_timestamp`>=  `Y`.`history_timestamp`
> WHERE  NOT EXISTS (
>         SELECT *
>         FROM `Table` AS `Z`
>         WHERE `X`.`consumer_id` = `Z`.`consumer_id`
>                 AND `X`.`move_date`<>   `Z`.`move_date`
>                 AND  `X`.`history_timestamp`>=  `Z`.`history_timestamp`
>                 AND  `Y`.`history_timestamp`<=  `Z`.`history_timestamp`
>         )
> GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
> ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC
>
> With Y I select history_timestamp preceding the current row with the
> same move_date.
> With Z I verify that no changes have occurred to move_date between
> X.history_timestamp and Y.history_timestamp.
>
That is not a valid PostgreSQL SQL statement.

Postgres does not use "backticks" for quoting, it uses the standard double quotes.

Thomas




Re: Window function sort order help

From
Dianna Harter
Date:
Thank you.  Interesting solution.  Unfortunately, it's performance is not very good, since it involves joining a large
table3 times.  I do have a solution that uses a temp table, but I was trying to rework it into a single query to
improveperformance.   

Thank you again for your help.
Dianna

2011/9/13 nicoletta maia :

[snip]
> I tried to write the query without using the window:
>
> SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` ,
>     MIN(  `Y`.`history_timestamp` ) AS `start_time`
> FROM
>    `Table` AS `X`
>        JOIN
>    `Table` AS `Y`
>        ON  `X`.`consumer_id` = `Y`.`consumer_id`
>            AND `X`.`move_date` =  `Y`.`move_date`
>            AND  `X`.`history_timestamp` >=  `Y`.`history_timestamp`
>        LEFT JOIN
>    `Table` AS `Z`
>        ON  `X`.`consumer_id` = `Z`.`consumer_id`
>        AND `X`.`move_date` <>  `Z`.`move_date`
>        AND  `X`.`history_timestamp` >=  `Z`.`history_timestamp`
>        AND  `Y`.`history_timestamp` <=  `Z`.`history_timestamp`
> WHERE  `Z`.`consumer_id` IS NULL
> GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
> ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC




Re: Window function sort order help

From
Nicoletta Maia
Date:
2011/9/14 Dianna Harter <dharter@mynewplace.com>:

> [snip]
> Thank you again for your help.
You're welcome. Sorry about the syntax, but I could only try on
MySQL... and I was curious about the problem :)

Nicoletta


Re: Window function sort order help

From
Tim Landscheidt
Date:
Dianna Harter <dharter@mynewplace.com> wrote:

> Having trouble getting a window function to sort correctly.

> Given this data

> consumer_id | move_date  |     history_timestamp
> -------------+------------+----------------------------
>        12345| 2008-01-05 | 2007-12-11 06:02:26.842171
>        12345| 2008-02-29 | 2008-02-05 07:22:38.04067
>        12345| 2008-02-29 | 2008-07-11 09:03:42.44044
>        23456| 2009-01-01 | 2008-11-12 07:33:32.656658 <--
>        23456| 2009-01-28 | 2008-11-14 01:57:40.264335
>        23456| 2009-01-01 | 2008-12-04 17:14:20.279999 <--
>        23456| 2009-01-01 | 2008-12-31 00:33:37.204968
>        23456| 2009-01-01 | 2011-06-08 04:16:41.646521
>        34567| 2010-05-07 | 2010-06-08 05:14:43.842172

> I'm trying to get the timestamp when the consumer last changed their move_date.  (Notice consumer_id 23456 set their
move_dateto 2009-01-01 then changed and then changed it back.  In the end, I want the timestamp from when they changed
itto 2009-01-01 the second time.)
 

> My thought was to do an intermediary step to find the timestamp for each time it switched.  From there I can grab the
max(timestamp)for each consumer.
 

> [...]

> Any suggestions to get the order by to occur first then the partition by or maybe there another approach that I could
use?

If I understand the question correctly, try:

| SELECT DISTINCT ON (consumer_id) consumer_id, move_date, history_timestamp
|        FROM (SELECT consumer_id, move_date, history_timestamp,
|                     LAG(move_date) OVER (PARTITION BY consumer_id ORDER BY consumer_id, history_timestamp) AS
previous_move_date
|                     FROM consumer_hist) AS SubQuery
|        WHERE move_date IS DISTINCT FROM previous_move_date
|        ORDER BY consumer_id, history_timestamp DESC;

Tim



Re: Window function sort order help

From
nicoletta maia
Date:
2011/9/13 Dianna Harter <dharter@mynewplace.com>:

Hi,

> [snip]
> Any suggestions to get the order by to occur first then the partition by or
> maybe there another approach that I could use?

I tried to write the query without using the window:

SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` , MIN(  `Y`.`history_timestamp` ) AS
`start_time`
FROM`Table` AS `X`    JOIN`Table` AS `Y`    ON  `X`.`consumer_id` = `Y`.`consumer_id`        AND `X`.`move_date` =
`Y`.`move_date`       AND  `X`.`history_timestamp` >=  `Y`.`history_timestamp`    LEFT JOIN`Table` AS `Z`    ON
`X`.`consumer_id`= `Z`.`consumer_id`    AND `X`.`move_date` <>  `Z`.`move_date`    AND  `X`.`history_timestamp` >=
`Z`.`history_timestamp`   AND  `Y`.`history_timestamp` <=  `Z`.`history_timestamp`
 
WHERE  `Z`.`consumer_id` IS NULL
GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC

To avoid the LEFT JOIN, you can move the control in the sub-query:

SELECT  `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp` ,MIN(  `Y`.`history_timestamp` ) AS `start_time`
FROM`Table` AS `X`    JOIN`Table` AS `Y`    ON  `X`.`consumer_id` = `Y`.`consumer_id`        AND `X`.`move_date` =
`Y`.`move_date`       AND  `X`.`history_timestamp` >=  `Y`.`history_timestamp`
 
WHERE  NOT EXISTS (SELECT *FROM `Table` AS `Z`WHERE `X`.`consumer_id` = `Z`.`consumer_id`    AND `X`.`move_date` <>
`Z`.`move_date`   AND  `X`.`history_timestamp` >=  `Z`.`history_timestamp`    AND  `Y`.`history_timestamp` <=
`Z`.`history_timestamp`)
GROUP BY   `X`.`consumer_id`, `X`.`move_date` ,  `X`.`history_timestamp`
ORDER BY  `X`.`consumer_id`, `X`.`history_timestamp` ASC

With Y I select history_timestamp preceding the current row with the
same move_date.
With Z I verify that no changes have occurred to move_date between
X.history_timestamp and Y.history_timestamp.

Ciao!

Nicoletta


Re: Window function sort order help

From
Dianna Harter
Date:
Thank you.  It works perfect.

Dianna


Tim Landscheidt < tim@tim-landscheidt.de > wrote:


> If I understand the question correctly, try:
>
> | SELECT DISTINCT ON (consumer_id) consumer_id, move_date, history_timestamp
> |        FROM (SELECT consumer_id, move_date, history_timestamp,
> |                     LAG(move_date) OVER (PARTITION BY consumer_id ORDER BY consumer_id, history_timestamp) AS
previous_move_date
> |                     FROM consumer_hist) AS SubQuery
> |        WHERE move_date IS DISTINCT FROM previous_move_date
> |        ORDER BY consumer_id, history_timestamp DESC;
>
> Tim