Thread: "next"
Suppose I've got a table containing train departure and arrival times, like so: SELECT id,departure,arrival,arrival-departure AS duration FROM timetable; id | departure | arrival | duration ----+------------------------+------------------------+---------- 1 | 2002-12-02 14:00:00+00 | 2002-12-02 15:00:00+00 | 01:00 7 | 2002-12-02 17:00:00+00 | 2002-12-03 14:00:00+00 | 21:00 2 | 2002-12-03 17:00:00+00 | 2002-12-03 18:00:00+00 | 01:00 6 | 2002-12-03 21:00:00+00 | 2002-12-04 04:00:00+00 | 07:00 4 | 2002-12-04 10:00:00+00 | 2002-12-04 18:00:00+00 | 08:00 5 | 2002-12-05 08:00:00+00 | 2002-12-05 10:00:00+00 | 02:00 Can anyone advise as to the best way to express the following question in SQL: "What's the earliest arrival time where there isn't a subsequent departure time within (e.g.) 7 hours?". The table will contain many rows. I feel foolish - it _looks_ simple - but I've been banging my head against the wall trying to implement this as either a JOIN or a subselect, and it's starting to hurt. Malcolm.
On Mon, Dec 02, 2002 at 04:54:43PM +0000, Malcolm Hutty wrote: > > Suppose I've got a table containing train departure and arrival times, > like so: > > SELECT id,departure,arrival,arrival-departure AS duration FROM timetable; > > id | departure | arrival | duration > ----+------------------------+------------------------+---------- > 1 | 2002-12-02 14:00:00+00 | 2002-12-02 15:00:00+00 | 01:00 > 7 | 2002-12-02 17:00:00+00 | 2002-12-03 14:00:00+00 | 21:00 > 2 | 2002-12-03 17:00:00+00 | 2002-12-03 18:00:00+00 | 01:00 > 6 | 2002-12-03 21:00:00+00 | 2002-12-04 04:00:00+00 | 07:00 > 4 | 2002-12-04 10:00:00+00 | 2002-12-04 18:00:00+00 | 08:00 > 5 | 2002-12-05 08:00:00+00 | 2002-12-05 10:00:00+00 | 02:00 > > > Can anyone advise as to the best way to express the following question > in SQL: "What's the earliest arrival time where there isn't a > subsequent departure time within (e.g.) 7 hours?". The table will > contain many rows. > > I feel foolish - it _looks_ simple - but I've been banging my head > against the wall trying to implement this as either a JOIN or a > subselect, and it's starting to hurt. create table trains (id serial primary key, depart timestamp not null, arrive timestamp not null check (arrive > depart)); insert into train (depart,arrive) values ('2002-01-01 1:00 PM','2002-01-01 2:00 PM'); insert into trains (depart,arrive) values ('2002-01-01 1:00 PM','2002-01-01 2:00 PM'); insert into trains (depart,arrive) values ('2002-01-01 4:00 PM','2002-01-01 5:00 PM'); insert into trains (depart,arrive) values ('2002-01-02 4:00 PM','2002-01-01 5:00 PM'); insert into trains (depart,arrive) values ('2002-01-02 4:00 PM','2002-02-01 5:00 PM'); insert into trains (depart,arrive) values ('2002-01-09 4:00 PM','2002-02-09 5:00 PM'); joel@joel=# select * from trains; id | depart | arrive ----+---------------------+--------------------- 1 | 2002-01-01 13:00:00 | 2002-01-01 14:00:00 2 | 2002-01-01 16:00:00 | 2002-01-01 17:00:00 4 | 2002-01-02 16:00:00 | 2002-02-01 17:00:00 5 | 2002-01-09 16:00:00 | 2002-02-09 17:00:00 Ok, so trains 2 and 4 have arrivals where the last departure is more than 7 hours away. And train 5 will also appear, since there is no departure after it. We can find these with: select id, arrive from trains t1 where t1.arrive + '7 hours' < all ( select depart from trains t2 where t2.depart > t1.arrive ); to find these earliest of these, just order them by arrive, and limit 1 on the outer query, as such: select id,arrive from trains t1 where t1.arrive + '7 hours' < all (select depart from trains t2 where t2.depart > t1.arrive ) order by arrive limit 1; And we get train #1, which is the earliest train arrival that meets the requirements. -- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant
On Mon, Dec 02, 2002 at 12:18:37PM -0500, Joel Burton wrote: > insert into train (depart,arrive) values ('2002-01-01 1:00 > PM','2002-01-01 2:00 PM'); > > insert into trains (depart,arrive) values ('2002-01-01 1:00 > PM','2002-01-01 2:00 PM'); > > insert into trains (depart,arrive) values ('2002-01-01 4:00 > PM','2002-01-01 5:00 PM'); > > insert into trains (depart,arrive) values ('2002-01-02 4:00 > PM','2002-01-01 5:00 PM'); > > insert into trains (depart,arrive) values ('2002-01-02 4:00 > PM','2002-02-01 5:00 PM'); > > insert into trains (depart,arrive) values ('2002-01-09 4:00 > PM','2002-02-09 5:00 PM'); > > > joel@joel=# select * from trains; > id | depart | arrive > ----+---------------------+--------------------- > 1 | 2002-01-01 13:00:00 | 2002-01-01 14:00:00 > 2 | 2002-01-01 16:00:00 | 2002-01-01 17:00:00 > 4 | 2002-01-02 16:00:00 | 2002-02-01 17:00:00 > 5 | 2002-01-09 16:00:00 | 2002-02-09 17:00:00 > > > Ok, so trains 2 and 4 have arrivals where the last departure is more > than 7 hours away. And train 5 will also appear, since there is no > departure after it. ... > And we get train #1, which is the earliest train arrival that meets the > requirements. Eek! Need to get lunch before brain falls asleep. Train #3 disappeared into a derailed insert statement, so if you cut & paste my data, your trains 3 & 4 will be my 4 & 5. And train #2, not #1, is the ultimate answer. -- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant
On Mon, Dec 02, 2002 at 06:10:39PM +0000, Malcolm Hutty wrote: > Joel Burton wrote: > > >We can find these with: > > > >select id, > > arrive > >from trains t1 > >where t1.arrive + '7 hours' < all ( select depart > > from trains t2 > > where t2.depart > t1.arrive ); > > Thanks, that really helped. It was the "all" that did it; I'd been > messing with IN and EXISTS and generally making a mess of it. Glad to help. Re: EXISTS, I think that this would be equivalent: select id, arrive from trains t1 where not exists ( select * from trains t2 where t2.depart > t1.arrive and t2.depart - t1.arrive <= '7 hours' ) the t2.depart > t1.arrive is to get rid of most matches, rather than relying on the mucher slower subtraction. This might perform faster or slower than the the < ALL, depending on your data, indexes, etc. I'd think it would be slower, but benchmark if it's important. I think the first is definitely clearer, though. BTW, for SQL novices, there's also ANY, similar to ALL, which finds cases where there's any match. This can be easily switched for EXISTS. If you'd like some help using these, I'd highly recommend Joe Celko's _SQL_For_Smarties_. -- Joel BURTON | joel@joelburton.com | joelburton.com | aim: wjoelburton Independent Knowledge Management Consultant