Re: Time differences between rows, not columns? - Mailing list pgsql-sql

From
Subject Re: Time differences between rows, not columns?
Date
Msg-id 4538E1203665624F97A8494BCF2716E5C07E2B@blrx2kmbgl102.blr.amer.dell.com
Whole thread Raw
In response to Time differences between rows, not columns?  (<Amit_Wadhwa@Dell.com>)
List pgsql-sql
Thanks!

<snip>Amit,You say "I want to know how much time it took for a bill to
be accepted after it was submitted" So, do you want between 10 and 40,
not 10 and 20? I assume you meant 10 and 40... </snip>
...Could be any status to any status, I wanted to generalize the concept
for future usage.


I tried the below:
1.

select bill_id,(max(case when statusid = 40 then statustime end) -
max(case when statusid = 10 then statustime end)) from
ap.vits_statusupdate where statusid in ( 10,40 ) group by bill_id;
Took 16 ms...

But meanwhile I also tried:
select a.bill_id, b.statustime-a.statustime from ap.vits_statusupdate a,
ap.vits_statusupdate b where a.statusid = 10 and b.statusid = 40 and
a.bill_id = b.bill_id

Took 15 ms

The second query is taking approximately 1 ms lesser time.. (have tried
about 30 times in pgadmin)

Got about 10 records in my test resultset.

Surprisingly, if I add the order by bill_id cluase at the end of both
queries, then your query performs 1 ms faster than mine, Don't know why,
Is there any performance issues if this is run over 10000+ rows?
Which (yours or mine?) do you think will be faster, and more efficient?

2.

select sum(case when diff < 7 then 1 else 0 end) as lt_7,sum(case when
diff between 7 and 15 then 1 else 0 end) as btw_7_15,sum(case when diff
> 15 then 1 else 0 end) as gt_15 from (select bill_id,(max(case when statusid = 40 then statustime end)
- max(case when statusid = 10 then statustime end)) as diff  from
ap.vits_statusupdate where statusid in ( 10,40 ) group by bill_id) x;

Tried this one too with both the queries in the sub-select (yours and
mine), and works perfectly well for me, with accurate answers, and
exactly what I'm looking for! shows 0 ms (??)..

Next question.. Which one should I use? The joined query? Or the max()
query?

Thanks a lot for your time!

Regards,
Amit



-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Anthony Molinaro
Sent: Tuesday, August 30, 2005 4:59 PM
To: Wadhwa, Amit; pgsql-sql@postgresql.org
Subject: Re: [SQL] Time differences between rows, not columns?

Amit, You say "I want to know how much time it took for a bill to be
accepted after it was submitted"

So, do you want between 10 and 40, not 10 and 20?
I assume you meant 10 and 40.
Ok, there's a few approaches to your questions, first
To get "how much time it took for a bill to be accepted after
it was submitted":

select bill_id,      (max(case when status_id = 40                then statustime end) -        max(case when status_id
=10                then statustime end))     from bill_status where status_id in ( 10,40 )group by bill_id; 

that will give you the amt of time (simple date arithmetic, so, in days)
from submission to being accepted.

You also, want a simple distribution, you can use the technique above
But with SUM:

select sum(case when diff < 7 then 1 else 0 end) as lt_7,      sum(case when diff between 7 and 15 then 1 else 0 end)
as
btw_7_15,      sum(case when diff > 15 then 1 else 0 end) as gt_15 from (
select bill_id,      (max(case when status_id = 40                then statustime end) -        max(case when status_id
=10                then statustime end)) as diff     from bill_status where status_id in ( 10,40 )group by bill_id
)x; 


I have a recipe in my upcoming book ('The SQL Cookbook' by O'Reilly)
So, I'd like to know if what I suggested works out for you.
Obviously, you'll need whatever tweaks to make it perfect for your
system, the feedback I am concerned about is the technique.

hope that helps, Anthony

-----Original Message-----
From: pgsql-sql-owner@postgresql.org
[mailto:pgsql-sql-owner@postgresql.org] On Behalf Of
Amit_Wadhwa@Dell.com
Sent: Tuesday, August 30, 2005 5:45 AM
To: pgsql-sql@postgresql.org
Subject: [SQL] Time differences between rows, not columns?
Importance: High

Using postgresSQL 8.0 for windows, running on windows 2003 server 16gb
RAM, 3Ghz dual p4.
Language: Java - JDBC postgres driver: postgresql-8.0-310.jdbc3.jar


I have an application with 3 tables (in this context that is)


Table bills
 bill_id NOT NULL serial ... And other columns


Table bill_status
 bill_id (references bills.bill_id) statusid int4 (references bill_statuslookup.statusid) statustime datetime


Table bill_statuslookup
 statusid serial not null statusname varchar(255)




The application basically tracks a workflow of bills flowing fromone
department to another.
Everytime the bill moves one step, an entry is made into the bill_status
table.

Eg.
Bills table
-------------
Bill_id         otherfield1

1            ........


Bill_status table:
---------------------
Bill_id     statusid    statustime

1             10         2005-04-04 00:34:31
1             20         2005-04-05 00:55:00


Bill_statuslookup table:
-------------------------
Statusid          Statusname

10                submitted
20                received
30                rejected
40                accepted
..
..



Now my problem is this:

1. Find the time taken for each bill to reach from status 10 to status
20 , given the time of status 10 should be between t1 and t2.

Eg I want to know how much time it took for a bill to be accepted after
it was submitted (criteria: submitted between yesterday and today)

2. I want to know how many bills took <7 days, how many tok 7-15 days,
how many took >15 days etc.


The status is a lookup table because the workflow constantly changes,
and I can't have submitted_on, recd_on accepted_on times etc in the main
bills table as columns because there are way too many statuses in the
life of a bill (read invoice).


Hope its clear as to what I'm looking for.
Any help is greatly appreciated!!

Regards,
Amit

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to      choose an index scan if your joining column's
datatypesdo not      match 

---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings



pgsql-sql by date:

Previous
From: Achilleus Mantzios
Date:
Subject: Re: REINDEX DATABASE
Next
From: "Anthony Molinaro"
Date:
Subject: Re: Time differences between rows, not columns?