difficult JOIN
Hi,
i have the following SQL-Problem:
We are using 2 tables. The first, called plan, is holding planned working times for employees per
tour:
plan.id_tour
plan.id_employee
plan.begin_time
plan.end_time
The second table 'work' stores the actual worked times for employees per tour:
work.id_tour
work.id_employee
work.begin_time
work.end_time
Employees can be multiple times assigned to one tour. One record will be created for every
assignment. They can also work multiple times in one tour.
Now i wanna merge this infos into one report. I wanna join the first plan entry for one employee in
one tour with the first work entry for one employee in one tour and so on.
How can i obtain that? A simply USING(id_tour, id_employee) -JOIN will not doit.
Thanks for any hints,
Thomas
On Tue, Jan 25, 2005 at 05:26:50PM +0100, Thomas Chille wrote:
Hi,
i have the following SQL-Problem:
We are using 2 tables. The first, called plan, is holding planned working
times for employees per
tour:plan.id_tour
plan.id_employee
plan.begin_time
plan.end_timeThe second table 'work' stores the actual worked times for employees per
tour:work.id_tour
work.id_employee
work.begin_time
work.end_timeEmployees can be multiple times assigned to one tour. One record will be
created for every
assignment. They can also work multiple times in one tour.Now i wanna merge this infos into one report. I wanna join the first plan
entry for one employee in
one tour with the first work entry for one employee in one tour and so on.How can i obtain that? A simply USING(id_tour, id_employee) -JOIN will not
doit.
Hrm. So for a given tour, employee, you want to pair the first record in
plan with the first record in work, and the second record in plan with
the second record in work?
Doing that will be pretty tricky. I'm not sure you can even do it in a
single SELECT.
More important, does it even make sense? What if an employee ends up not
working at all for one of his/her planned times? Every record after that
would be completely skewed. Wouldn't it make much more sense to either
assign an ID to each record in the plan table, and correlate records in
the work table using that ID, or correlate based on begin and end time?
BTW, I've never seen the convention id_employee; people generally use
employee_id. Is it more important to know that you're talking about an
ID or that you're talking about an employee? Just food for thought.
--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
Hi Jim,
thanks for your answer!
Hrm. So for a given tour, employee, you want to pair the first record in
plan with the first record in work, and the second record in plan with
the second record in work?
Yes you understand me well, thats what i'm trying to achieve.
Doing that will be pretty tricky. I'm not sure you can even do it in a
single SELECT.
I'm not sure anymore too.
More important, does it even make sense? What if an employee ends up not
working at all for one of his/her planned times? Every record after that
would be completely skewed. Wouldn't it make much more sense to either
assign an ID to each record in the plan table, and correlate records in
the work table using that ID, or correlate based on begin and end time?
This report should not be the final thing. It schould only present the
matched times on an easy way and the tourleader can than edit the
worktimes manually. He have to do this after every tour because the
employees are often not using the timerecording unit correct (i'm
talking about restaurants).
In the meanwhile i gave every record an position counter (per tour and
employee), derrived from begin_time, per trigger, and merged them in
this way:
SELECT * FROM work LEFT JOIN plan USING(id_tour, id_employee, counter)
UNION
SELECT * FROM work RIGHT JOIN plan USING(id_tour, id_employee, counter)
Thats works for me but if the tourleader change one worktime record
the counters have to be recalculate and the order of the records will
change.
This is hard to handle and i think i will do all merging
programmatically in the app or with an ppgsql function.
BTW, I've never seen the convention id_employee; people generally use
employee_id. Is it more important to know that you're talking about an
ID or that you're talking about an employee? Just food for thought.
I'm not the father of this strange naming convention :)
Thank you again,
Thomas
On Wed, Jan 26, 2005 at 11:22:48AM +0100, Thomas Chille wrote:
More important, does it even make sense? What if an employee ends up not
working at all for one of his/her planned times? Every record after that
would be completely skewed. Wouldn't it make much more sense to either
assign an ID to each record in the plan table, and correlate records in
the work table using that ID, or correlate based on begin and end time?This report should not be the final thing. It schould only present the
matched times on an easy way and the tourleader can than edit the
worktimes manually. He have to do this after every tour because the
employees are often not using the timerecording unit correct (i'm
talking about restaurants).
One thing that may help is to do an inner join on (id_tour,
id_employee) and then match the following:
work.starttime < plan.endtime and plan.starttime < work.endtime
This will join then and give a row whenever there is an overlap between
the two records. Ofcourse, if there is no overlap the record doesn't
appear at all, but an outer join might work here...
Ah no, IIRC postgresql only supports outer joins on merge join able
conditions so that wont work. But the above should be a start...
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Show quoted text
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
On Wed, Jan 26, 2005 at 11:22:48AM +0100, Thomas Chille wrote:
In the meanwhile i gave every record an position counter (per tour and
employee), derrived from begin_time, per trigger, and merged them in
this way:
How are you generating the position counter? A sequence? If you multiply
by some factor it might eliminate the need to re-number everything.
SELECT * FROM work LEFT JOIN plan USING(id_tour, id_employee, counter)
UNION
SELECT * FROM work RIGHT JOIN plan USING(id_tour, id_employee, counter)Thats works for me but if the tourleader change one worktime record
the counters have to be recalculate and the order of the records will
change.This is hard to handle and i think i will do all merging
programmatically in the app or with an ppgsql function.BTW, I've never seen the convention id_employee; people generally use
employee_id. Is it more important to know that you're talking about an
ID or that you're talking about an employee? Just food for thought.I'm not the father of this strange naming convention :)
I hate getting stuck with other people's stupidity. :)
--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"