Thread: updating rows which have a common value forconsecutive dates
Greetings, I have a table full of automated test data, which continuously has new unique data inserted: Column | Type | Modifiers ----------------+-----------------------------+------------------------------------------------------------- id | integer | not null default nextval('dbltests_id_seq'::regclass) testname | text | not null last_update | timestamp without time zone | not null default now() current_status | text | not null os | text | not null arch | text | not null build_type | text | not null branch | text | not null The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated os,arch,build_type & branch values. For example, testname='foo' will run on os='Linux',arch='i686',build_type='debug',branch='t38', and also on os='Windows7',arch='x86_64',build_type='release',branch='r40', etc, and there will be many other different testname's with similar permutations of the os,arch,build_type & branch columns. So for example, there will also be testname='bar' or testname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. The current_status column is either 'PASSED' or 'FAILED'. What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname permutation (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive 'last_update' values. Suggestions welcome. thanks in advance!
If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use rank() tonumber each test run sequentially. Then you can limit the results to ( rank() <= 2 AND current_status = 'FAILED' ). David J. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman Sent: Wednesday, April 13, 2011 3:34 AM To: pgsql-general Subject: [GENERAL] updating rows which have a common value forconsecutive dates Greetings, I have a table full of automated test data, which continuously has new unique data inserted: Column | Type | Modifiers ----------------+-----------------------------+------------------------- ----------------+-----------------------------+------------------------- ----------------+-----------------------------+----------- id | integer | not null default nextval('dbltests_id_seq'::regclass) testname | text | not null last_update | timestamp without time zone | not null default now() current_status | text | not null os | text | not null arch | text | not null build_type | text | not null branch | text | not null The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated os,arch,build_type& branch values. For example, testname='foo' will run on os='Linux',arch='i686',build_type='debug',branch='t38',and also on os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's withsimilar permutations of the os,arch,build_type & branch columns. So for example, there will also be testname='bar' ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. The current_status column is either 'PASSED' or 'FAILED'. What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname permutation (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive 'last_update'values. Suggestions welcome. thanks in advance! -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
Hi David, Thanks for your reply. I'm using 8.4.7, so window functions are certainly an option, although I've admittedly never used them before. I've spent the past few hours reading the dox, and I now have a rudimentary understanding of window functions. I tried to compose a query based on your suggestion, but I think i'm running up against my lack of experience. This query seems to give me all failures but not neccesarily when there are two in a row for a unique group (although I'm still not 100% certain its actually returning only last_update consecutive rows): SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY last_update DESC) FROM mytable WHERE current_status='FAILED' AND age(now(),last_update) <= INTERVAL '15 days' However, if I'm understanding how this works, what I really care about is when a rank=2 exists, as that's truly when something failed for two consecutive last_update's. I thought this might do it, but apparently I'm doing it wrong: SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY last_update DESC) FROM mytable WHERE current_status='FAILED' AND age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; ERROR: window function call requires an OVER clause LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; How do I restrict the results to only show when rank=2 ? thanks! On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote: > If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use rank()to number each test run sequentially. Then you can limit the results to ( rank() <= 2 AND current_status = 'FAILED'). > > David J. > > -----Original Message----- > From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman > Sent: Wednesday, April 13, 2011 3:34 AM > To: pgsql-general > Subject: [GENERAL] updating rows which have a common value forconsecutive dates > > Greetings, > I have a table full of automated test data, which continuously has new unique data inserted: > > Column | Type | > Modifiers > ----------------+-----------------------------+------------------------- > ----------------+-----------------------------+------------------------- > ----------------+-----------------------------+----------- > id | integer | not null default > nextval('dbltests_id_seq'::regclass) > testname | text | not null > last_update | timestamp without time zone | not null default now() > current_status | text | not null > os | text | not null > arch | text | not null > build_type | text | not null > branch | text | not null > > The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated os,arch,build_type& branch values. For example, testname='foo' will run on os='Linux',arch='i686',build_type='debug',branch='t38',and also on os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's withsimilar permutations of the os,arch,build_type & branch columns. So for example, there will also be testname='bar' ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. > > The current_status column is either 'PASSED' or 'FAILED'. > > What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname permutation > (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive 'last_update'values. > > Suggestions welcome. > > thanks in advance!
You need to turn the query with the window function into a sub-query and then in the outer query you can refer to the inner-query'srank() column. The inner query should effectively get you the last two test results for each context and thenyou can check to see if any of those failed. I have a hunch you might need a third layer of sub-queries to handle the failure aspect of the requirement properly; possiblyas part of a "WITH" CTE. You for sure need to in order to. Properly utilize the rank() function limiting. Dave On Apr 14, 2011, at 0:52, Lonni J Friedman <netllama@gmail.com> wrote: > Hi David, > Thanks for your reply. I'm using 8.4.7, so window functions are > certainly an option, although I've admittedly never used them before. > I've spent the past few hours reading the dox, and I now have a > rudimentary understanding of window functions. I tried to compose a > query based on your suggestion, but I think i'm running up against my > lack of experience. This query seems to give me all failures but not > neccesarily when there are two in a row for a unique group (although > I'm still not 100% certain its actually returning only last_update > consecutive rows): > SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() > OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY > last_update DESC) FROM mytable WHERE current_status='FAILED' AND > age(now(),last_update) <= INTERVAL '15 days' > > However, if I'm understanding how this works, what I really care about > is when a rank=2 exists, as that's truly when something failed for two > consecutive last_update's. I thought this might do it, but apparently > I'm doing it wrong: > > SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() > OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY > last_update DESC) FROM mytable WHERE current_status='FAILED' AND > age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; > ERROR: window function call requires an OVER clause > LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; > > How do I restrict the results to only show when rank=2 ? > > thanks! > > > On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote: >> If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use rank()to number each test run sequentially. Then you can limit the results to ( rank() <= 2 AND current_status = 'FAILED'). >> >> David J. >> >> -----Original Message----- >> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman >> Sent: Wednesday, April 13, 2011 3:34 AM >> To: pgsql-general >> Subject: [GENERAL] updating rows which have a common value forconsecutive dates >> >> Greetings, >> I have a table full of automated test data, which continuously has new unique data inserted: >> >> Column | Type | >> Modifiers >> ----------------+-----------------------------+------------------------- >> ----------------+-----------------------------+------------------------- >> ----------------+-----------------------------+----------- >> id | integer | not null default >> nextval('dbltests_id_seq'::regclass) >> testname | text | not null >> last_update | timestamp without time zone | not null default now() >> current_status | text | not null >> os | text | not null >> arch | text | not null >> build_type | text | not null >> branch | text | not null >> >> The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated os,arch,build_type& branch values. For example, testname='foo' will run on os='Linux',arch='i686',build_type='debug',branch='t38',and also on os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's withsimilar permutations of the os,arch,build_type & branch columns. So for example, there will also be testname='bar' ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. >> >> The current_status column is either 'PASSED' or 'FAILED'. >> >> What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname permutation >> (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive 'last_update'values. >> >> Suggestions welcome. >> >> thanks in advance! > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Hi David, I had just figured out the sub-query requirement when you replied. So now I've got this working: SELECT * FROM ( SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY last_update DESC) AS myrank FROM mytable WHERE current_status='FAILED' AND age(now(),last_update) <= INTERVAL '15 days') blah where myrank=2 ; However, I'm getting a bit lost at this point as to what I need to do next. I've been googling on "WITH CTE" and its mostly going over my head. Could you give me more of a hint about this 3rd layer of sub-query? thanks On Wed, Apr 13, 2011 at 10:17 PM, David Johnston <polobo@yahoo.com> wrote: > You need to turn the query with the window function into a sub-query and then in the outer query you can refer to the inner-query'srank() column. The inner query should effectively get you the last two test results for each context and thenyou can check to see if any of those failed. > > I have a hunch you might need a third layer of sub-queries to handle the failure aspect of the requirement properly; possiblyas part of a "WITH" CTE. You for sure need to in order to. Properly utilize the rank() function limiting. > > Dave > > On Apr 14, 2011, at 0:52, Lonni J Friedman <netllama@gmail.com> wrote: > >> Hi David, >> Thanks for your reply. I'm using 8.4.7, so window functions are >> certainly an option, although I've admittedly never used them before. >> I've spent the past few hours reading the dox, and I now have a >> rudimentary understanding of window functions. I tried to compose a >> query based on your suggestion, but I think i'm running up against my >> lack of experience. This query seems to give me all failures but not >> neccesarily when there are two in a row for a unique group (although >> I'm still not 100% certain its actually returning only last_update >> consecutive rows): >> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() >> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY >> last_update DESC) FROM mytable WHERE current_status='FAILED' AND >> age(now(),last_update) <= INTERVAL '15 days' >> >> However, if I'm understanding how this works, what I really care about >> is when a rank=2 exists, as that's truly when something failed for two >> consecutive last_update's. I thought this might do it, but apparently >> I'm doing it wrong: >> >> SELECT testname,os,arch,build_type,branch,current_status,last_update,rank() >> OVER (PARTITION BY testname,os,arch,branch,build_type ORDER BY >> last_update DESC) FROM mytable WHERE current_status='FAILED' AND >> age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; >> ERROR: window function call requires an OVER clause >> LINE 1: ... age(now(),last_update) <= INTERVAL '15 days' AND rank()=2 ; >> >> How do I restrict the results to only show when rank=2 ? >> >> thanks! >> >> >> On Wed, Apr 13, 2011 at 9:57 AM, David Johnston <polobo@yahoo.com> wrote: >>> If you have the ability to use Window functions you can group (as necessary), order by last_update, and then use rank()to number each test run sequentially. Then you can limit the results to ( rank() <= 2 AND current_status = 'FAILED'). >>> >>> David J. >>> >>> -----Original Message----- >>> From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Lonni J Friedman >>> Sent: Wednesday, April 13, 2011 3:34 AM >>> To: pgsql-general >>> Subject: [GENERAL] updating rows which have a common value forconsecutive dates >>> >>> Greetings, >>> I have a table full of automated test data, which continuously has new unique data inserted: >>> >>> Column | Type | >>> Modifiers >>> ----------------+-----------------------------+------------------------- >>> ----------------+-----------------------------+------------------------- >>> ----------------+-----------------------------+----------- >>> id | integer | not null default >>> nextval('dbltests_id_seq'::regclass) >>> testname | text | not null >>> last_update | timestamp without time zone | not null default now() >>> current_status | text | not null >>> os | text | not null >>> arch | text | not null >>> build_type | text | not null >>> branch | text | not null >>> >>> The 'testname' column contains many different tests, and each unique 'testname' has numerous different associated os,arch,build_type& branch values. For example, testname='foo' will run on os='Linux',arch='i686',build_type='debug',branch='t38',and also on os='Windows7',arch='x86_64',build_type='release',branch='r40',etc, and there will be many other different testname's withsimilar permutations of the os,arch,build_type & branch columns. So for example, there will also be testname='bar' ortestname='omega' or testname='sigma' for the other assorted os,arch,build_type & branch permutations. >>> >>> The current_status column is either 'PASSED' or 'FAILED'. >>> >>> What I'm struggling with is how to put together a SQL query which shows me all instances of a given testname permutation >>> (os,arch,build_type,branch) which has a current_status value of 'FAILED' for two or more most recent consecutive 'last_update'values. >>> >>> Suggestions welcome. >>> >>> thanks in advance!