Thread: updating rows which have a common value forconsecutive dates

updating rows which have a common value forconsecutive dates

From
Lonni J Friedman
Date:
Greetings,
I have a table full of automated test data, which continuously has new
unique data inserted:

    Column     |            Type             |
 Modifiers
----------------+-----------------------------+-------------------------------------------------------------
 id             | integer                     | not null default
nextval('dbltests_id_seq'::regclass)
 testname       | text                        | not null
 last_update    | timestamp without time zone | not null default now()
 current_status | text                        | not null
 os             | text                        | not null
 arch           | text                        | not null
 build_type     | text                        | not null
 branch         | text                        | not null

The 'testname' column contains many different tests, and each unique
'testname' has numerous different associated os,arch,build_type &
branch values.  For example, testname='foo' will run on
os='Linux',arch='i686',build_type='debug',branch='t38', and also on
os='Windows7',arch='x86_64',build_type='release',branch='r40', etc,
and there will be many other different testname's with similar
permutations of the os,arch,build_type & branch columns.  So for
example, there will also be testname='bar' or testname='omega' or
testname='sigma' for the other assorted os,arch,build_type & branch
permutations.

The current_status column is either 'PASSED' or 'FAILED'.

What I'm struggling with is how to put together a SQL query which
shows me all instances of a given testname permutation
(os,arch,build_type,branch) which has a current_status value of
'FAILED' for two or more most recent consecutive 'last_update' values.

Suggestions welcome.

thanks in advance!

Re: updating rows which have a common value forconsecutive dates

From
"David Johnston"
Date:
If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use rank()
tonumber each test run sequentially.  Then you can limit the results to  ( rank() <= 2 AND current_status = 'FAILED' ). 

David J.

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman
Sent: Wednesday, April 13, 2011 3:34 AM
To: pgsql-general
Subject: [GENERAL] updating rows which have a common value forconsecutive dates

Greetings,
I have a table full of automated test data, which continuously has new unique data inserted:

    Column     |            Type             |
 Modifiers
----------------+-----------------------------+-------------------------
----------------+-----------------------------+-------------------------
----------------+-----------------------------+-----------
 id             | integer                     | not null default
nextval('dbltests_id_seq'::regclass)
 testname       | text                        | not null
 last_update    | timestamp without time zone | not null default now()
 current_status | text                        | not null
 os             | text                        | not null
 arch           | text                        | not null
 build_type     | text                        | not null
 branch         | text                        | not null

The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated
os,arch,build_type& branch values.  For example, testname='foo' will run on
os='Linux',arch='i686',build_type='debug',branch='t38',and also on
os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's
withsimilar permutations of the os,arch,build_type & branch columns.  So for example, there will also be testname='bar'
ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. 

The current_status column is either 'PASSED' or 'FAILED'.

What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname
permutation
(os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive
'last_update'values. 

Suggestions welcome.

thanks in advance!

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


Re: updating rows which have a common value forconsecutive dates

From
Lonni J Friedman
Date:
Hi David,
Thanks for your reply.  I'm using 8.4.7, so window functions are
certainly an option, although I've admittedly never used them before.
I've spent the past few hours reading the dox, and I now have a
rudimentary understanding of window functions.  I tried to compose a
query based on your suggestion, but I think i'm running up against my
lack of experience.  This query seems to give me all failures but not
neccesarily when there are two in a row for a unique group (although
I'm still not 100% certain its actually returning only last_update
consecutive rows):
SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
last_update DESC) FROM mytable WHERE current_status='FAILED' AND
age(now(),last_update) <= INTERVAL '15 days'

However, if I'm understanding how this works, what I really care about
is when a rank=2 exists, as that's truly when something failed for two
consecutive last_update's.  I thought this might do it, but apparently
I'm doing it wrong:

SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
last_update DESC) FROM mytable WHERE current_status='FAILED' AND
age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;
ERROR:  window function call requires an OVER clause
LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;

How do I restrict the results to only show when rank=2 ?

thanks!


On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote:
> If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use
rank()to number each test run sequentially.  Then you can limit the results to  ( rank() <= 2 AND current_status =
'FAILED'). 
>
> David J.
>
> -----Original Message-----
> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman
> Sent: Wednesday, April 13, 2011 3:34 AM
> To: pgsql-general
> Subject: [GENERAL] updating rows which have a common value forconsecutive dates
>
> Greetings,
> I have a table full of automated test data, which continuously has new unique data inserted:
>
>    Column     |            Type             |
>  Modifiers
> ----------------+-----------------------------+-------------------------
> ----------------+-----------------------------+-------------------------
> ----------------+-----------------------------+-----------
>  id             | integer                     | not null default
> nextval('dbltests_id_seq'::regclass)
>  testname       | text                        | not null
>  last_update    | timestamp without time zone | not null default now()
>  current_status | text                        | not null
>  os             | text                        | not null
>  arch           | text                        | not null
>  build_type     | text                        | not null
>  branch         | text                        | not null
>
> The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated
os,arch,build_type& branch values.  For example, testname='foo' will run on
os='Linux',arch='i686',build_type='debug',branch='t38',and also on
os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's
withsimilar permutations of the os,arch,build_type & branch columns.  So for example, there will also be testname='bar'
ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. 
>
> The current_status column is either 'PASSED' or 'FAILED'.
>
> What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname
permutation
> (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive
'last_update'values. 
>
> Suggestions welcome.
>
> thanks in advance!

Re: updating rows which have a common value forconsecutive dates

From
David Johnston
Date:
You need to turn the query with the window function into a sub-query and then in the outer query you can refer to the
inner-query'srank() column.  The inner query should effectively get you the last two test results for each context and
thenyou can check to see if any of those failed. 

I have a hunch you might need a third layer of sub-queries to handle the failure aspect of the requirement properly;
possiblyas part of a "WITH" CTE.  You for sure need to in order to. Properly utilize the rank() function limiting. 

Dave

On Apr 14, 2011, at 0:52, Lonni J Friedman <netllama@gmail.com> wrote:

> Hi David,
> Thanks for your reply.  I'm using 8.4.7, so window functions are
> certainly an option, although I've admittedly never used them before.
> I've spent the past few hours reading the dox, and I now have a
> rudimentary understanding of window functions.  I tried to compose a
> query based on your suggestion, but I think i'm running up against my
> lack of experience.  This query seems to give me all failures but not
> neccesarily when there are two in a row for a unique group (although
> I'm still not 100% certain its actually returning only last_update
> consecutive rows):
> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
> last_update DESC) FROM mytable WHERE current_status='FAILED' AND
> age(now(),last_update) <= INTERVAL '15 days'
>
> However, if I'm understanding how this works, what I really care about
> is when a rank=2 exists, as that's truly when something failed for two
> consecutive last_update's.  I thought this might do it, but apparently
> I'm doing it wrong:
>
> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
> last_update DESC) FROM mytable WHERE current_status='FAILED' AND
> age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;
> ERROR:  window function call requires an OVER clause
> LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;
>
> How do I restrict the results to only show when rank=2 ?
>
> thanks!
>
>
> On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote:
>> If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use
rank()to number each test run sequentially.  Then you can limit the results to  ( rank() <= 2 AND current_status =
'FAILED'). 
>>
>> David J.
>>
>> -----Original Message-----
>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman
>> Sent: Wednesday, April 13, 2011 3:34 AM
>> To: pgsql-general
>> Subject: [GENERAL] updating rows which have a common value forconsecutive dates
>>
>> Greetings,
>> I have a table full of automated test data, which continuously has new unique data inserted:
>>
>>    Column     |            Type             |
>>  Modifiers
>> ----------------+-----------------------------+-------------------------
>> ----------------+-----------------------------+-------------------------
>> ----------------+-----------------------------+-----------
>>  id             | integer                     | not null default
>> nextval('dbltests_id_seq'::regclass)
>>  testname       | text                        | not null
>>  last_update    | timestamp without time zone | not null default now()
>>  current_status | text                        | not null
>>  os             | text                        | not null
>>  arch           | text                        | not null
>>  build_type     | text                        | not null
>>  branch         | text                        | not null
>>
>> The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated
os,arch,build_type& branch values.  For example, testname='foo' will run on
os='Linux',arch='i686',build_type='debug',branch='t38',and also on
os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's
withsimilar permutations of the os,arch,build_type & branch columns.  So for example, there will also be testname='bar'
ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. 
>>
>> The current_status column is either 'PASSED' or 'FAILED'.
>>
>> What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname
permutation
>> (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive
'last_update'values. 
>>
>> Suggestions welcome.
>>
>> thanks in advance!
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general

Re: updating rows which have a common value forconsecutive dates

From
Lonni J Friedman
Date:
Hi David,
I had just figured out the sub-query requirement when you replied.  So
now I've got this working:
SELECT * FROM (
SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
OVER
(PARTITION BY testname,os,arch,branch,build_type ORDER BY last_update
DESC) AS myrank
FROM mytable
WHERE current_status='FAILED' AND age(now(),last_update) <= INTERVAL
'15 days') blah
where myrank=2 ;

However, I'm getting a bit lost at this point as to what I need to do
next.  I've been googling on "WITH CTE" and its mostly going over my
head.  Could you give me more of a hint about this 3rd layer of
sub-query?

thanks


On Wed, Apr 13, 2011 at 10:17 PM, David Johnston <polobo@yahoo.com> wrote:
> You need to turn the query with the window function into a sub-query and then in the outer query you can refer to the
inner-query'srank() column.  The inner query should effectively get you the last two test results for each context and
thenyou can check to see if any of those failed. 
>
> I have a hunch you might need a third layer of sub-queries to handle the failure aspect of the requirement properly;
possiblyas part of a "WITH" CTE.  You for sure need to in order to. Properly utilize the rank() function limiting. 
>
> Dave
>
> On Apr 14, 2011, at 0:52, Lonni J Friedman <netllama@gmail.com> wrote:
>
>> Hi David,
>> Thanks for your reply.  I'm using 8.4.7, so window functions are
>> certainly an option, although I've admittedly never used them before.
>> I've spent the past few hours reading the dox, and I now have a
>> rudimentary understanding of window functions.  I tried to compose a
>> query based on your suggestion, but I think i'm running up against my
>> lack of experience.  This query seems to give me all failures but not
>> neccesarily when there are two in a row for a unique group (although
>> I'm still not 100% certain its actually returning only last_update
>> consecutive rows):
>> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
>> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
>> last_update DESC) FROM mytable WHERE current_status='FAILED' AND
>> age(now(),last_update) <= INTERVAL '15 days'
>>
>> However, if I'm understanding how this works, what I really care about
>> is when a rank=2 exists, as that's truly when something failed for two
>> consecutive last_update's.  I thought this might do it, but apparently
>> I'm doing it wrong:
>>
>> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank()
>> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY
>> last_update DESC) FROM mytable WHERE current_status='FAILED' AND
>> age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;
>> ERROR:  window function call requires an OVER clause
>> LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ;
>>
>> How do I restrict the results to only show when rank=2 ?
>>
>> thanks!
>>
>>
>> On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote:
>>> If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use
rank()to number each test run sequentially.  Then you can limit the results to  ( rank() <= 2 AND current_status =
'FAILED'). 
>>>
>>> David J.
>>>
>>> -----Original Message-----
>>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman
>>> Sent: Wednesday, April 13, 2011 3:34 AM
>>> To: pgsql-general
>>> Subject: [GENERAL] updating rows which have a common value forconsecutive dates
>>>
>>> Greetings,
>>> I have a table full of automated test data, which continuously has new unique data inserted:
>>>
>>>    Column     |            Type             |
>>>  Modifiers
>>> ----------------+-----------------------------+-------------------------
>>> ----------------+-----------------------------+-------------------------
>>> ----------------+-----------------------------+-----------
>>>  id             | integer                     | not null default
>>> nextval('dbltests_id_seq'::regclass)
>>>  testname       | text                        | not null
>>>  last_update    | timestamp without time zone | not null default now()
>>>  current_status | text                        | not null
>>>  os             | text                        | not null
>>>  arch           | text                        | not null
>>>  build_type     | text                        | not null
>>>  branch         | text                        | not null
>>>
>>> The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated
os,arch,build_type& branch values.  For example, testname='foo' will run on
os='Linux',arch='i686',build_type='debug',branch='t38',and also on
os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's
withsimilar permutations of the os,arch,build_type & branch columns.  So for example, there will also be testname='bar'
ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. 
>>>
>>> The current_status column is either 'PASSED' or 'FAILED'.
>>>
>>> What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname
permutation
>>> (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive
'last_update'values. 
>>>
>>> Suggestions welcome.
>>>
>>> thanks in advance!