Thread: difficult JOIN

difficult JOIN

From
"Thomas Chille"
Date:
Hi,

i have the following SQL-Problem:

We are using 2 tables. The first, called plan, is holding planned working times for employees per
tour:

plan.id_tour
plan.id_employee
plan.begin_time
plan.end_time

The second table 'work' stores the actual worked times for employees per tour:

work.id_tour
work.id_employee
work.begin_time
work.end_time

Employees can be multiple times assigned to one tour. One record will be created for every
assignment. They can also work multiple times in one tour.

Now i wanna merge this infos into one report. I wanna join the first plan entry for one employee in
one tour with the first work entry for one employee in one tour and so on.

How can i obtain that? A simply USING(id_tour, id_employee) -JOIN will not doit.

Thanks for any hints,
Thomas


Re: difficult JOIN

From
"Jim C. Nasby"
Date:
On Tue, Jan 25, 2005 at 05:26:50PM +0100, Thomas Chille wrote:
> Hi,
>
> i have the following SQL-Problem:
>
> We are using 2 tables. The first, called plan, is holding planned working
> times for employees per
> tour:
>
> plan.id_tour
> plan.id_employee
> plan.begin_time
> plan.end_time
>
> The second table 'work' stores the actual worked times for employees per
> tour:
>
> work.id_tour
> work.id_employee
> work.begin_time
> work.end_time
>
> Employees can be multiple times assigned to one tour. One record will be
> created for every
> assignment. They can also work multiple times in one tour.
>
> Now i wanna merge this infos into one report. I wanna join the first plan
> entry for one employee in
> one tour with the first work entry for one employee in one tour and so on.
>
> How can i obtain that? A simply USING(id_tour, id_employee) -JOIN will not
> doit.

Hrm. So for a given tour, employee, you want to pair the first record in
plan with the first record in work, and the second record in plan with
the second record in work?

Doing that will be pretty tricky. I'm not sure you can even do it in a
single SELECT.

More important, does it even make sense? What if an employee ends up not
working at all for one of his/her planned times? Every record after that
would be completely skewed. Wouldn't it make much more sense to either
assign an ID to each record in the plan table, and correlate records in
the work table using that ID, or correlate based on begin and end time?

BTW, I've never seen the convention id_employee; people generally use
employee_id. Is it more important to know that you're talking about an
ID or that you're talking about an employee? Just food for thought.
--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

Re: difficult JOIN

From
Thomas Chille
Date:
Hi Jim,

thanks for your answer!

> Hrm. So for a given tour, employee, you want to pair the first record in
> plan with the first record in work, and the second record in plan with
> the second record in work?

Yes you understand me well, thats what i'm trying to achieve.

> Doing that will be pretty tricky. I'm not sure you can even do it in a
> single SELECT.

I'm not sure anymore too.

> More important, does it even make sense? What if an employee ends up not
> working at all for one of his/her planned times? Every record after that
> would be completely skewed. Wouldn't it make much more sense to either
> assign an ID to each record in the plan table, and correlate records in
> the work table using that ID, or correlate based on begin and end time?

This report should not be the final thing. It schould only present the
matched times on an easy way and the tourleader can than edit the
worktimes manually. He have to do this after every tour because the
employees are often not using the timerecording unit correct (i'm
talking about restaurants).

In the meanwhile i gave every record an position counter (per tour and
employee), derrived from begin_time, per trigger, and merged them in
this way:

SELECT * FROM work LEFT JOIN plan USING(id_tour, id_employee, counter)
UNION
SELECT * FROM work RIGHT JOIN plan USING(id_tour, id_employee, counter)

Thats works for me but if the tourleader change one worktime record
the counters have to be recalculate and the order of the records will
change.

This is hard to handle and i think i will do all merging
programmatically in the app or with an ppgsql function.

> BTW, I've never seen the convention id_employee; people generally use
> employee_id. Is it more important to know that you're talking about an
> ID or that you're talking about an employee? Just food for thought.

I'm not the father of this strange naming convention :)

Thank you again,
Thomas

Re: difficult JOIN

From
Martijn van Oosterhout
Date:
On Wed, Jan 26, 2005 at 11:22:48AM +0100, Thomas Chille wrote:
> > More important, does it even make sense? What if an employee ends up not
> > working at all for one of his/her planned times? Every record after that
> > would be completely skewed. Wouldn't it make much more sense to either
> > assign an ID to each record in the plan table, and correlate records in
> > the work table using that ID, or correlate based on begin and end time?
>
> This report should not be the final thing. It schould only present the
> matched times on an easy way and the tourleader can than edit the
> worktimes manually. He have to do this after every tour because the
> employees are often not using the timerecording unit correct (i'm
> talking about restaurants).

One thing that may help is to do an inner join on (id_tour,
id_employee) and then match the following:

work.starttime < plan.endtime and plan.starttime < work.endtime

This will join then and give a row whenever there is an overlap between
the two records. Ofcourse, if there is no overlap the record doesn't
appear at all, but an outer join might work here...

Ah no, IIRC postgresql only supports outer joins on merge join able
conditions so that wont work. But the above should be a start...

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

Attachment

Re: difficult JOIN

From
"Jim C. Nasby"
Date:
On Wed, Jan 26, 2005 at 11:22:48AM +0100, Thomas Chille wrote:
> In the meanwhile i gave every record an position counter (per tour and
> employee), derrived from begin_time, per trigger, and merged them in
> this way:

How are you generating the position counter? A sequence? If you multiply
by some factor it might eliminate the need to re-number everything.

> SELECT * FROM work LEFT JOIN plan USING(id_tour, id_employee, counter)
> UNION
> SELECT * FROM work RIGHT JOIN plan USING(id_tour, id_employee, counter)
>
> Thats works for me but if the tourleader change one worktime record
> the counters have to be recalculate and the order of the records will
> change.
>
> This is hard to handle and i think i will do all merging
> programmatically in the app or with an ppgsql function.
>
> > BTW, I've never seen the convention id_employee; people generally use
> > employee_id. Is it more important to know that you're talking about an
> > ID or that you're talking about an employee? Just food for thought.
>
> I'm not the father of this strange naming convention :)

I hate getting stuck with other people's stupidity. :)
--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"