Thread: operating on data from multiple rows?

operating on data from multiple rows?

From
Joshua Daniel Franklin
Date:
Here is a problem I've run into with an old IMHO poorly designed database:

There is a table ("log") that has fields

username, sessionid, loggedtime, loggeddate, accntstatus

A SELECT might return this data, for example:

bob    1035208    2002-10-11    11:32:00    Start
bob    1035208    2002-10-11    11:38:00    Stop
bob    1052072    2002-10-12    10:05:00    Start
bob    1052072    2002-10-12    10:15:00    Stop

I'm trying to get my head around a SELECT that will return
only one entry per sessionid with a duration instead of two entries for
each. If I had two separate tables for Start and Stop it would
be trivial with a join, but all I can think of is doing a
"SELECT ... WHERE accntstatus = 'Start'" and then grabbing the
sessionid and doing a separate SELECT for every record (and then the
math to get the duration). This seems like a bad idea since thousands
of records are retrived at a time.
Am I missing a better way?

--
Joshua Daniel Franklin
Network Administrator
IOCC.COM



Re: operating on data from multiple rows?

From
"Michael Paesold"
Date:
Joshua Daniel Franklin <joshua@iocc.com> wrote:

> Here is a problem I've run into with an old IMHO poorly designed database:
>
> There is a table ("log") that has fields
>
> username, sessionid, loggedtime, loggeddate, accntstatus
>
> A SELECT might return this data, for example:
>
> bob 1035208 2002-10-11 11:32:00 Start
> bob 1035208 2002-10-11 11:38:00 Stop
> bob 1052072 2002-10-12 10:05:00 Start
> bob 1052072 2002-10-12 10:15:00 Stop
>
> I'm trying to get my head around a SELECT that will return
> only one entry per sessionid with a duration instead of two entries for
> each. If I had two separate tables for Start and Stop it would
> be trivial with a join, but all I can think of is doing a
> "SELECT ... WHERE accntstatus = 'Start'" and then grabbing the
> sessionid and doing a separate SELECT for every record (and then the
> math to get the duration). This seems like a bad idea since thousands
> of records are retrived at a time.
> Am I missing a better way?

A self-join would help...

SELECT start.username, start.sessionid,
    ((stop.loggeddate + stop.loggedtime)
        - (start.loggeddate + start.loggedtime)) as duration
  FROM log AS start, log AS stop
  WHERE start.accntstatus = 'Start'
    AND stop.accntstatus = 'Stop'
    AND start.sessionid = stop.sessionid;

(not tested, but try like this)
You probably have to cast the value of the duration.

Best Regards,
Michael Paesold


Re: operating on data from multiple rows?

From
Andrew McMillan
Date:
On Wed, 2002-10-23 at 08:06, Joshua Daniel Franklin wrote:
> Here is a problem I've run into with an old IMHO poorly designed database:
>
> There is a table ("log") that has fields
>
> username, sessionid, loggedtime, loggeddate, accntstatus
>
> A SELECT might return this data, for example:
>
> bob    1035208    2002-10-11    11:32:00    Start
> bob    1035208    2002-10-11    11:38:00    Stop
> bob    1052072    2002-10-12    10:05:00    Start
> bob    1052072    2002-10-12    10:15:00    Stop
>
> I'm trying to get my head around a SELECT that will return
> only one entry per sessionid with a duration instead of two entries for
> each. If I had two separate tables for Start and Stop it would
> be trivial with a join, but all I can think of is doing a
> "SELECT ... WHERE accntstatus = 'Start'" and then grabbing the
> sessionid and doing a separate SELECT for every record (and then the
> math to get the duration). This seems like a bad idea since thousands
> of records are retrived at a time.
> Am I missing a better way?

Some alternatives that spring to mind:

(a) Use another table to reconstruct the data into a better form.

(b) Add a column for stop_time, or duration and maintain it separately
with a regular script.

(c) Write a function which returns the duration.

You could combine (b) and (c) as well.

Here's an example function that might give you ideas:

CREATE OR REPLACE FUNCTION duration_to ( DATE, TIME, INT ) RETURNS
TIMESPAN AS 'SELECT ($1::timestamp + $2) - (loggeddate::timestamp +
loggedtime) FROM log WHERE sessionid = $3 AND accntstatus = ''Start'' '
LANGUAGE 'SQL';

Then you should be able to do something like:

SELECT *, duration_to( loggeddate, loggedtime, sessionid FROM log WHERE
accntstatus = 'Stop';

Regards,
                    Andrew.

--
---------------------------------------------------------------------
Andrew @ Catalyst .Net.NZ Ltd, PO Box 11-053, Manners St,  Wellington
WEB: http://catalyst.net.nz/         PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201     MOB: +64(21)635-694    OFFICE: +64(4)499-2267
           Survey for free with http://survey.net.nz/
---------------------------------------------------------------------


Re: operating on data from multiple rows?

From
Doug Gorley
Date:
On Tue, 2002-10-22 at 12:52, Michael Paesold wrote:
> Joshua Daniel Franklin <joshua@iocc.com> wrote:
>
> > Here is a problem I've run into with an old IMHO poorly designed database:
> >
> > There is a table ("log") that has fields
> >
> > username, sessionid, loggedtime, loggeddate, accntstatus
> >
> > A SELECT might return this data, for example:
> >
> > bob 1035208 2002-10-11 11:32:00 Start
> > bob 1035208 2002-10-11 11:38:00 Stop
> > bob 1052072 2002-10-12 10:05:00 Start
> > bob 1052072 2002-10-12 10:15:00 Stop
> >
> > I'm trying to get my head around a SELECT that will return
> > only one entry per sessionid with a duration instead of two entries for
> > each. If I had two separate tables for Start and Stop it would
> > be trivial with a join, but all I can think of is doing a
> > "SELECT ... WHERE accntstatus = 'Start'" and then grabbing the
> > sessionid and doing a separate SELECT for every record (and then the
> > math to get the duration). This seems like a bad idea since thousands
> > of records are retrived at a time.
> > Am I missing a better way?
>
> A self-join would help...
>
> SELECT start.username, start.sessionid,
>     ((stop.loggeddate + stop.loggedtime)
>         - (start.loggeddate + start.loggedtime)) as duration
>   FROM log AS start, log AS stop
>   WHERE start.accntstatus = 'Start'
>     AND stop.accntstatus = 'Stop'
>     AND start.sessionid = stop.sessionid;
>
> (not tested, but try like this)
> You probably have to cast the value of the duration.
>
> Best Regards,
> Michael Paesold
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster

I second this idea; in fact, given the difficulty in getting useful
information from the current table, you might even want to consider
building a view:

-----------------------

create view log_info as
  select l1.username,
         l1.sessionid,
         l1.loggeddate + l1.loggedtime as start,
         l2.loggeddate + l2.loggedtime as stop,
         case
           when l2.sessionid is not null then
             (
               ( l2.loggeddate + l2.loggedtime )
               - ( l1.loggeddate + l1.loggedtime )
             )
           else
             date_trunc( 'second',
                         now() - ( l1.loggeddate + l1.loggedtime ) )
         end as duration
  from ( select * from log where accntstatus = 'Start' ) l1
    left outer join ( select * from log where accntstatus = 'Stop' ) l2
    on ( l1.sessionid = l2.sessionid )
;

--
Doug Gorley | douggorley@shaw.ca     OpenPGP Key ID: 0xA221559B
Fingerprint: D707 DB92 E64B 69DA B8C7  2F65 C5A9 5415 A221 559B
Interested in public-key cryptography?    http://www.gnupg.org/


Attachment