Thread: "next"

"next"

From
Malcolm Hutty
Date:
Suppose I've got a table containing train departure and arrival times,
like so:

SELECT id,departure,arrival,arrival-departure AS duration FROM timetable;

  id |       departure        |        arrival         | duration
----+------------------------+------------------------+----------
   1 | 2002-12-02 14:00:00+00 | 2002-12-02 15:00:00+00 | 01:00
   7 | 2002-12-02 17:00:00+00 | 2002-12-03 14:00:00+00 | 21:00
   2 | 2002-12-03 17:00:00+00 | 2002-12-03 18:00:00+00 | 01:00
   6 | 2002-12-03 21:00:00+00 | 2002-12-04 04:00:00+00 | 07:00
   4 | 2002-12-04 10:00:00+00 | 2002-12-04 18:00:00+00 | 08:00
   5 | 2002-12-05 08:00:00+00 | 2002-12-05 10:00:00+00 | 02:00


Can anyone advise as to the best way to express the following question
in SQL: "What's the earliest arrival time where there isn't a
subsequent departure time within (e.g.) 7 hours?". The table will
contain many rows.

I feel foolish - it _looks_ simple - but I've been banging my head
against the wall trying to implement this as either a JOIN or a
subselect, and it's starting to hurt.

Malcolm.


Re: "next"

From
Joel Burton
Date:
On Mon, Dec 02, 2002 at 04:54:43PM +0000, Malcolm Hutty wrote:
>
> Suppose I've got a table containing train departure and arrival times,
> like so:
>
> SELECT id,departure,arrival,arrival-departure AS duration FROM timetable;
>
>  id |       departure        |        arrival         | duration
> ----+------------------------+------------------------+----------
>   1 | 2002-12-02 14:00:00+00 | 2002-12-02 15:00:00+00 | 01:00
>   7 | 2002-12-02 17:00:00+00 | 2002-12-03 14:00:00+00 | 21:00
>   2 | 2002-12-03 17:00:00+00 | 2002-12-03 18:00:00+00 | 01:00
>   6 | 2002-12-03 21:00:00+00 | 2002-12-04 04:00:00+00 | 07:00
>   4 | 2002-12-04 10:00:00+00 | 2002-12-04 18:00:00+00 | 08:00
>   5 | 2002-12-05 08:00:00+00 | 2002-12-05 10:00:00+00 | 02:00
>
>
> Can anyone advise as to the best way to express the following question
> in SQL: "What's the earliest arrival time where there isn't a
> subsequent departure time within (e.g.) 7 hours?". The table will
> contain many rows.
>
> I feel foolish - it _looks_ simple - but I've been banging my head
> against the wall trying to implement this as either a JOIN or a
> subselect, and it's starting to hurt.

create table trains (id serial primary key, depart timestamp not null,
arrive timestamp not null check (arrive > depart));

insert into train (depart,arrive) values ('2002-01-01 1:00
PM','2002-01-01 2:00 PM');

insert into trains (depart,arrive) values ('2002-01-01 1:00
PM','2002-01-01 2:00 PM');

insert into trains (depart,arrive) values ('2002-01-01 4:00
PM','2002-01-01 5:00 PM');

insert into trains (depart,arrive) values ('2002-01-02 4:00
PM','2002-01-01 5:00 PM');

insert into trains (depart,arrive) values ('2002-01-02 4:00
PM','2002-02-01 5:00 PM');

insert into trains (depart,arrive) values ('2002-01-09 4:00
PM','2002-02-09 5:00 PM');


joel@joel=# select * from trains;
 id |       depart        |       arrive
 ----+---------------------+---------------------
 1 | 2002-01-01 13:00:00 | 2002-01-01 14:00:00
 2 | 2002-01-01 16:00:00 | 2002-01-01 17:00:00
 4 | 2002-01-02 16:00:00 | 2002-02-01 17:00:00
 5 | 2002-01-09 16:00:00 | 2002-02-09 17:00:00


Ok, so trains 2 and 4 have arrivals where the last departure is more
than 7 hours away. And train 5 will also appear, since there is no
departure after it.

We can find these with:

select id,
       arrive
from   trains t1
where  t1.arrive + '7 hours' < all ( select depart
                                     from   trains t2
                     where  t2.depart > t1.arrive );

to find these earliest of these, just order them by arrive, and limit 1
on the outer query, as such:

select id,arrive from trains t1 where t1.arrive + '7 hours' < all
(select depart from trains t2 where t2.depart > t1.arrive ) order by
arrive limit 1;

And we get train #1, which is the earliest train arrival that meets the
requirements.

--

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant

Re: "next"

From
Joel Burton
Date:
On Mon, Dec 02, 2002 at 12:18:37PM -0500, Joel Burton wrote:
> insert into train (depart,arrive) values ('2002-01-01 1:00
> PM','2002-01-01 2:00 PM');
>
> insert into trains (depart,arrive) values ('2002-01-01 1:00
> PM','2002-01-01 2:00 PM');
>
> insert into trains (depart,arrive) values ('2002-01-01 4:00
> PM','2002-01-01 5:00 PM');
>
> insert into trains (depart,arrive) values ('2002-01-02 4:00
> PM','2002-01-01 5:00 PM');
>
> insert into trains (depart,arrive) values ('2002-01-02 4:00
> PM','2002-02-01 5:00 PM');
>
> insert into trains (depart,arrive) values ('2002-01-09 4:00
> PM','2002-02-09 5:00 PM');
>
>
> joel@joel=# select * from trains;
>  id |       depart        |       arrive
>  ----+---------------------+---------------------
>  1 | 2002-01-01 13:00:00 | 2002-01-01 14:00:00
>  2 | 2002-01-01 16:00:00 | 2002-01-01 17:00:00
>  4 | 2002-01-02 16:00:00 | 2002-02-01 17:00:00
>  5 | 2002-01-09 16:00:00 | 2002-02-09 17:00:00
>
>
> Ok, so trains 2 and 4 have arrivals where the last departure is more
> than 7 hours away. And train 5 will also appear, since there is no
> departure after it.

...

> And we get train #1, which is the earliest train arrival that meets the
> requirements.

Eek! Need to get lunch before brain falls asleep. Train #3 disappeared
into a derailed insert statement, so if you cut & paste my data, your
trains 3 & 4 will be my 4 & 5.


And train #2, not #1, is the ultimate answer.

--

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant

Re: "next"

From
Joel Burton
Date:
On Mon, Dec 02, 2002 at 06:10:39PM +0000, Malcolm Hutty wrote:
> Joel Burton wrote:
>
> >We can find these with:
> >
> >select id,
> >       arrive
> >from   trains t1
> >where  t1.arrive + '7 hours' < all ( select depart
> >                                     from   trains t2
> >                     where  t2.depart > t1.arrive );
>
> Thanks, that really helped. It was the "all" that did it; I'd been
> messing with IN and EXISTS and generally making a mess of it.

Glad to help.

Re: EXISTS, I think that this would be equivalent:

select id,
       arrive
from   trains t1
where  not exists ( select *
                    from   trains t2
            where  t2.depart > t1.arrive
            and    t2.depart - t1.arrive <= '7 hours' )

the t2.depart > t1.arrive is to get rid of most matches, rather than
relying on the mucher slower subtraction.

This might perform faster or slower than the the < ALL, depending on
your data, indexes, etc. I'd think it would be slower, but benchmark if
it's important. I think the first is definitely clearer, though.

BTW, for SQL novices, there's also ANY, similar to ALL, which finds
cases where there's any match. This can be easily switched for EXISTS.
If you'd like some help using these, I'd highly recommend Joe Celko's
_SQL_For_Smarties_.

--

Joel BURTON  |  joel@joelburton.com  |  joelburton.com  |  aim: wjoelburton
Independent Knowledge Management Consultant