Thread: I need help creating a query

I need help creating a query

From
"Sergio Duran"
Date:
Hello,

I need a little help creating a query, I have two tables, worker and position, for simplicity sake worker only has its ID and its name, position has the ID of the worker, the name of his position, a date, and his salary/

worker:   worker_id, name
position: position_id, worker_id, position, startdate, salary

If I perfom a query joining both tables, I can obtain all the workers and the positions the've had.

SELECT name, startdate, position,  salary FROM worker JOIN position USING(worker_id);
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2001-01-01 | cleaning        | 100
worker2 | 2006-04-01 | programmer   | 20000
worker2 | 2006-07-04 | management | 25000

so far so good, now I need to obtain all the workers only with the position they had on a given date.
if I wanted to know the positions on '2006-05-01' it would return
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2006-04-01 | programmer   | 20000

Maybe I only need some insight here, maybe it should be done with 2 queries, maybe it should be done with functions, I don't know.

Thanks

Re: I need help creating a query

From
"Dann Corbit"
Date:

The query date supplied should be applied against start date and then take the minimum record from that set.

 


From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Sergio Duran
Sent: Thursday, July 13, 2006 12:20 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] I need help creating a query

 

Hello,

I need a little help creating a query, I have two tables, worker and position, for simplicity sake worker only has its ID and its name, position has the ID of the worker, the name of his position, a date, and his salary/

worker:   worker_id, name
position: position_id, worker_id, position, startdate, salary

If I perfom a query joining both tables, I can obtain all the workers and the positions the've had.

SELECT name, startdate, position,  salary FROM worker JOIN position USING(worker_id);
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2001-01-01 | cleaning        | 100
worker2 | 2006-04-01 | programmer   | 20000
worker2 | 2006-07-04 | management | 25000

so far so good, now I need to obtain all the workers only with the position they had on a given date.
if I wanted to know the positions on '2006-05-01' it would return
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2006-04-01 | programmer   | 20000

Maybe I only need some insight here, maybe it should be done with 2 queries, maybe it should be done with functions, I don't know.

Thanks

Re: I need help creating a query

From
Richard Broersma Jr
Date:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From
"Sergio Duran"
Date:
Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
  (select salary
     FROM position where worker_id=worker.worker_id
     and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?


On 7/13/06, Richard Broersma Jr <rabroersma@yahoo.com> wrote:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From
"Sergio Duran"
Date:
How about if we make it simpler, only 1 table

create table worker(
    name varchar(50),
    position varchar(50),
    startdate date,
    salary numeric(9,2));
insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
select * from worker;
 name  |  position   | startdate  | salary
-------+-------------+------------+---------
 Jon   | boss        | 2001-01-01 | 1000.00
 Peter | cleaning    | 2002-01-01 |  100.00
 Peter | programming | 2004-01-01 |  300.00
 Peter | management  | 2006-01-01 |  500.00

I want to group by name, order by date desc and show the first grouped salary, maybe I should write an aggregate function that saves the first value and ignores the next ones. Is there already an aggregate function that does this? I havent written any aggregate functions yet, can anybody spare some pointers?

On 7/14/06, Sergio Duran <sergioduran@gmail.com> wrote:
Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
  (select salary
     FROM position where worker_id=worker.worker_id
     and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?



On 7/13/06, Richard Broersma Jr < rabroersma@yahoo.com> wrote:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.


Re: I need help creating a query

From
Q
Date:
On 15/07/2006, at 2:07 AM, Sergio Duran wrote:

How about if we make it simpler, only 1 table

create table worker(
    name varchar(50),
    position varchar(50),
    startdate date,
    salary numeric(9,2));
insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
select * from worker;
 name  |  position   | startdate  | salary
-------+-------------+------------+---------
 Jon   | boss        | 2001-01-01 | 1000.00
 Peter | cleaning    | 2002-01-01 |  100.00
 Peter | programming | 2004-01-01 |  300.00
 Peter | management  | 2006-01-01 |  500.00

I want to group by name, order by date desc and show the first grouped salary, maybe I should write an aggregate function that saves the first value and ignores the next ones. Is there already an aggregate function that does this? I havent written any aggregate functions yet, can anybody spare some pointers? 

Try this:

SELECT w2.*
FROM (  SELECT name,
            MAX(startdate) AS startdate
        FROM worker
        GROUP BY name
    )
    AS w1
        JOIN worker AS w2
        ON (w1.name = w2.name
        AND w1.startdate = w2.startdate);

Obviously you would use a real primary key instead of 'name' for the join constraint but you get the idea 



-- 

Seeya...Q


               -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


                          _____  /  Quinton Dolan - qdolan@gmail.com

  __  __/  /   /   __/   /      /      

     /    __  /   _/    /      /        Gold Coast, QLD, Australia

  __/  __/ __/ ____/   /   -  /            Ph: +61 419 729 806

                    _______  /

                            _\




Re: I need help creating a query

From
Richard Broersma Jr
Date:
> create table worker(
>     name varchar(50),
>     position varchar(50),
>     startdate date,
>     salary numeric(9,2));
> insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
> insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
> insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
> insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
> select * from worker;
>  name  |  position   | startdate  | salary
> -------+-------------+------------+---------
>  Jon   | boss        | 2001-01-01 | 1000.00
>  Peter | cleaning    | 2002-01-01 |  100.00
>  Peter | programming | 2004-01-01 |  300.00
>  Peter | management  | 2006-01-01 |  500.00
>
> I want to group by name, order by date desc and show the first grouped
> salary, maybe I should write an aggregate function that saves the first
> value and ignores the next ones. Is there already an aggregate function that
> does this? I havent written any aggregate functions yet, can anybody spare
> some pointers?

This query didn't give you the max salary.  First, the subselect give your maximum start date for
each employee the occurred before your given date '2006-05-01', regardless if they get a raise or
a cut.

Then we join the result of the sub-select to the main table to get the specific records that meet
the criteria of the sub-select.

select W2.name, W1.position, W2.pdate, w1.salary
from worker as W1
join
     (select name, max(startdate) as pdate
             from worker
             where startdate <= '2005-01-01'
             group by name
     ) as W2
on (W1.name = W2.name) and (W1.startdate = W2.pdate)
;

 name  |  position   |   pdate    | salary
-------+-------------+------------+---------
 Jon   | boss        | 2001-01-01 | 1000.00
 Peter | programming | 2004-01-01 |  300.00

So with this query, we get what everyones salary would be on the date of '2005-01-01' regardless
of raises or cuts.

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From
"Sergio Duran"
Date:
This is what I did, I used plpgsql,

create or replace function first_accum(anyelement, anyelement) returns anyelement as $$
BEGIN
  IF $1 IS NOT NULL THEN return $1;
  ELSE return $2;
  END IF;
END' language plpgsql;

then I created the aggregate
CREATE AGGREGATE first(sfunc = first_accum, basetype = anyelement, stype = anyelement);

first_accum is basically the same thing as coalesce, but CREATE AGGREGATE wasn't allowing me to use coalesce.

now I can get the salaries and positions each worker had on a given date.

SELECT name, first(startdate), first(salary)
FROM worker
JOIN position ON position.worker_id=worker.worker_id
WHERE fecha<='2006-05-01'
ORDER BY fecha DESC;

I'd appreciate some feedback, I hope there's a better way to do this. (maybe without creating the plpgsql function but using an internal function)

On 7/14/06, Sergio Duran <sergioduran@gmail.com> wrote:
Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
  (select salary
     FROM position where worker_id=worker.worker_id
     and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?



On 7/13/06, Richard Broersma Jr < rabroersma@yahoo.com> wrote:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.


Re: I need help creating a query

From
"Marcin Mank"
Date:
----- Original Message -----
From: "Sergio Duran" <sergioduran@gmail.com>
To: <pgsql-general@postgresql.org>
Sent: Thursday, July 13, 2006 9:20 PM
Subject: [GENERAL] I need help creating a query


> Hello,
>
> I need a little help creating a query, I have two tables, worker and
> position, for simplicity sake worker only has its ID and its name,
position
> has the ID of the worker, the name of his position, a date, and his
salary/
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the
position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This should work:

select distinct on(W.worker_id) W.name,P.position,P.salary
from worker W,position P
where P.worker_id=W.worker_id
and 'SOME DATE' >= P.startdate
order by W.worker_id,P.startdate

Cheers
Marcin


Re: I need help creating a query

From
"Sergio Duran"
Date:
Ok, all the suggestions were good.

I think I'll stick with Marcin Mank's query for now, I'll also try to work further with Richard Broersma's query later.

Thank you guys, you were really helpful.


On 7/14/06, Marcin Mank <marcin.mank@gmail.com> wrote:

----- Original Message -----
From: "Sergio Duran" <sergioduran@gmail.com>
To: <pgsql-general@postgresql.org >
Sent: Thursday, July 13, 2006 9:20 PM
Subject: [GENERAL] I need help creating a query


> Hello,
>
> I need a little help creating a query, I have two tables, worker and
> position, for simplicity sake worker only has its ID and its name,
position
> has the ID of the worker, the name of his position, a date, and his
salary/
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the
position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This should work:

select distinct on(W.worker_id) W.name,P.position,P.salary
from worker W,position P
where P.worker_id=W.worker_id
and 'SOME DATE' >= P.startdate
order by W.worker_id,P.startdate

Cheers
Marcin