Thread: I need help creating a query

I need help creating a query

From

"Sergio Duran"

Date:

13 July 2006, 16:20:26

Hello,

I need a little help creating a query, I have two tables, worker and position, for simplicity sake worker only has its ID and its name, position has the ID of the worker, the name of his position, a date, and his salary/

worker:   worker_id, name
position: position_id, worker_id, position, startdate, salary

If I perfom a query joining both tables, I can obtain all the workers and the positions the've had.

SELECT name, startdate, position, salary FROM worker JOIN position USING(worker_id);
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2001-01-01 | cleaning        | 100
worker2 | 2006-04-01 | programmer   | 20000
worker2 | 2006-07-04 | management | 25000

so far so good, now I need to obtain all the workers only with the position they had on a given date.
if I wanted to know the positions on '2006-05-01' it would return
worker1 | 2001-01-01 | boss             | 999999
worker2 | 2006-04-01 | programmer   | 20000

Maybe I only need some insight here, maybe it should be done with 2 queries, maybe it should be done with functions, I don't know.

Thanks

Re: I need help creating a query

From

"Dann Corbit"

Date:

13 July 2006, 16:25:25

The query date supplied should be applied against start date and then take the minimum record from that set.

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Sergio Duran
Sent: Thursday, July 13, 2006 12:20 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] I need help creating a query

Re: I need help creating a query

From

Richard Broersma Jr

Date:

13 July 2006, 16:44:59

>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From

"Sergio Duran"

Date:

14 July 2006, 12:49:02

Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
(select salary
FROM position where worker_id=worker.worker_id
and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?

On 7/13/06, Richard Broersma Jr <rabroersma@yahoo.com> wrote:

>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From

"Sergio Duran"

Date:

14 July 2006, 13:07:50

How about if we make it simpler, only 1 table

create table worker(
    name varchar(50),
    position varchar(50),
    startdate date,
    salary numeric(9,2));
insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
select * from worker;
name | position   | startdate | salary
-------+-------------+------------+---------
Jon   | boss        | 2001-01-01 | 1000.00
Peter | cleaning    | 2002-01-01 | 100.00
Peter | programming | 2004-01-01 | 300.00
Peter | management | 2006-01-01 | 500.00

I want to group by name, order by date desc and show the first grouped salary, maybe I should write an aggregate function that saves the first value and ignores the next ones. Is there already an aggregate function that does this? I havent written any aggregate functions yet, can anybody spare some pointers?

On 7/14/06, Sergio Duran <sergioduran@gmail.com> wrote:

Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
(select salary
     FROM position where worker_id=worker.worker_id
     and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?

On 7/13/06, Richard Broersma Jr < rabroersma@yahoo.com> wrote:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From

Date:

14 July 2006, 13:30:21

On 15/07/2006, at 2:07 AM, Sergio Duran wrote:

How about if we make it simpler, only 1 table

create table worker(
    name varchar(50),
    position varchar(50),
    startdate date,
    salary numeric(9,2));
insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
select * from worker;
name | position   | startdate | salary
-------+-------------+------------+---------
Jon   | boss        | 2001-01-01 | 1000.00
Peter | cleaning    | 2002-01-01 | 100.00
Peter | programming | 2004-01-01 | 300.00
Peter | management | 2006-01-01 | 500.00

I want to group by name, order by date desc and show the first grouped salary, maybe I should write an aggregate function that saves the first value and ignores the next ones. Is there already an aggregate function that does this? I havent written any aggregate functions yet, can anybody spare some pointers?

Try this:

SELECT w2.*

FROM ( SELECT name,

MAX(startdate) AS startdate

FROM worker

GROUP BY name

)

AS w1

JOIN worker AS w2

ON (w1.name = w2.name

AND w1.startdate = w2.startdate);

Obviously you would use a real primary key instead of 'name' for the join constraint but you get the idea

Seeya...Q

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

_____ / Quinton Dolan - qdolan@gmail.com

__ __/ / / __/ / /

/ __ / _/ / / Gold Coast, QLD, Australia

__/ __/ __/ ____/ / - / Ph: +61 419 729 806

_______ /

Re: I need help creating a query

From

Richard Broersma Jr

Date:

14 July 2006, 13:40:45

> create table worker(
>     name varchar(50),
>     position varchar(50),
>     startdate date,
>     salary numeric(9,2));
> insert into worker values ('Jon', 'boss', '2001-01-1', 1000.00);
> insert into worker values ('Peter', 'cleaning', '2002-01-01', 100.00);
> insert into worker values ('Peter', 'programming', '2004-01-01', 300.00);
> insert into worker values ('Peter', 'management', '2006-01-01', 500.00);
> select * from worker;
>  name  |  position   | startdate  | salary
> -------+-------------+------------+---------
>  Jon   | boss        | 2001-01-01 | 1000.00
>  Peter | cleaning    | 2002-01-01 |  100.00
>  Peter | programming | 2004-01-01 |  300.00
>  Peter | management  | 2006-01-01 |  500.00
>
> I want to group by name, order by date desc and show the first grouped
> salary, maybe I should write an aggregate function that saves the first
> value and ignores the next ones. Is there already an aggregate function that
> does this? I havent written any aggregate functions yet, can anybody spare
> some pointers?

This query didn't give you the max salary.  First, the subselect give your maximum start date for
each employee the occurred before your given date '2006-05-01', regardless if they get a raise or
a cut.

Then we join the result of the sub-select to the main table to get the specific records that meet
the criteria of the sub-select.

select W2.name, W1.position, W2.pdate, w1.salary
from worker as W1
join
     (select name, max(startdate) as pdate
             from worker
             where startdate <= '2005-01-01'
             group by name
     ) as W2
on (W1.name = W2.name) and (W1.startdate = W2.pdate)
;

 name  |  position   |   pdate    | salary
-------+-------------+------------+---------
 Jon   | boss        | 2001-01-01 | 1000.00
 Peter | programming | 2004-01-01 |  300.00

So with this query, we get what everyones salary would be on the date of '2005-01-01' regardless
of raises or cuts.

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From

"Sergio Duran"

Date:

14 July 2006, 13:44:18

This is what I did, I used plpgsql,

create or replace function first_accum(anyelement, anyelement) returns anyelement as $$
BEGIN
IF $1 IS NOT NULL THEN return $1;
ELSE return $2;
END IF;
END' language plpgsql;

then I created the aggregate

CREATE AGGREGATE first(sfunc = first_accum, basetype = anyelement, stype = anyelement);

first_accum is basically the same thing as coalesce, but CREATE AGGREGATE wasn't allowing me to use coalesce.

now I can get the salaries and positions each worker had on a given date.

SELECT name, first(startdate), first(salary)
FROM worker
JOIN position ON position.worker_id=worker.worker_id
WHERE fecha<='2006-05-01'
ORDER BY fecha DESC;

I'd appreciate some feedback, I hope there's a better way to do this. (maybe without creating the plpgsql function but using an internal function)

On 7/14/06, Sergio Duran <sergioduran@gmail.com> wrote:

Nice, Richard, but you use max(startdate), how about the salary? i cant use max(salary) how about if he got a pay cut?

My current solution is to write the nested query on the field list, like
SELECT worker.*,
(select salary
     FROM position where worker_id=worker.worker_id
     and fecha<='2006-04-01' LIMIT 1) as salary
FROM worker;

but I can only return 1 column from that subquery and repeating the same subquery for each column needed (position, date and salary) seems a little too much, if I write a procedure would postgres would optimize the access?

On 7/13/06, Richard Broersma Jr < rabroersma@yahoo.com> wrote:
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This is just a quick guess. I am not sure if the logic is correct but it could be a starting
point.

select P2.worker_id, P2.pdate, P1.position, P1.salary
from position as P1
  join
  (select worker_id, max(startdate) as pdate
   from position
   where startdate <= '2006-05-01'
   group by worker_id, position_id) as P2
on (P1.worker_id = P2.worker_id) and (P1.startdate = P2.pdate)
;

Regards,

Richard Broersma Jr.

Re: I need help creating a query

From

"Marcin Mank"

Date:

14 July 2006, 14:18:47

----- Original Message -----
From: "Sergio Duran" <sergioduran@gmail.com>
To: <pgsql-general@postgresql.org>
Sent: Thursday, July 13, 2006 9:20 PM
Subject: [GENERAL] I need help creating a query


> Hello,
>
> I need a little help creating a query, I have two tables, worker and
> position, for simplicity sake worker only has its ID and its name,
position
> has the ID of the worker, the name of his position, a date, and his
salary/
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the
position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This should work:

select distinct on(W.worker_id) W.name,P.position,P.salary
from worker W,position P
where P.worker_id=W.worker_id
and 'SOME DATE' >= P.startdate
order by W.worker_id,P.startdate

Cheers
Marcin

Re: I need help creating a query

From

"Sergio Duran"

Date:

14 July 2006, 14:32:59

Ok, all the suggestions were good.

I think I'll stick with Marcin Mank's query for now, I'll also try to work further with Richard Broersma's query later.

Thank you guys, you were really helpful.

On 7/14/06, Marcin Mank <marcin.mank@gmail.com> wrote:

----- Original Message -----
From: "Sergio Duran" <sergioduran@gmail.com>
To: <pgsql-general@postgresql.org >
Sent: Thursday, July 13, 2006 9:20 PM
Subject: [GENERAL] I need help creating a query

> Hello,
>
> I need a little help creating a query, I have two tables, worker and
> position, for simplicity sake worker only has its ID and its name,
position
> has the ID of the worker, the name of his position, a date, and his
salary/
>
> worker:   worker_id, name
> position: position_id, worker_id, position, startdate, salary
>
> If I perfom a query joining both tables, I can obtain all the workers and
> the positions the've had.
>
> SELECT name, startdate, position,  salary FROM worker JOIN position
> USING(worker_id);
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2001-01-01 | cleaning        | 100
> worker2 | 2006-04-01 | programmer   | 20000
> worker2 | 2006-07-04 | management | 25000
>
> so far so good, now I need to obtain all the workers only with the
position
> they had on a given date.
> if I wanted to know the positions on '2006-05-01' it would return
> worker1 | 2001-01-01 | boss             | 999999
> worker2 | 2006-04-01 | programmer   | 20000
>

This should work:

select distinct on(W.worker_id) W.name,P.position,P.salary
from worker W,position P
where P.worker_id=W.worker_id
and 'SOME DATE' >= P.startdate
order by W.worker_id,P.startdate

Cheers
Marcin