Thread: A complex SQL query

A complex SQL query

From
"jj08"
Date:

 I hope someone can give me some pointers.

Here is my table.

+--------+----------+------------+-----------+

| usr_id | employer | start_date | end_date  |

+--------+----------+------------+-----------+

| A      | Goo      | 201904     | -         |

| A      | Micro    | 201704     | 201903    |

 

| B      | Micro    | 201706     | -         |

| B      | Goo      | 201012     | 201705    |

| B      | Micro    | 201001     | 201011    |

+--------+----------+------------+-----------+

 

I am trying to list up people working for a company called "Micro".

Some people work for a company multiple times, like user B.

I only need one line per user, displaying only the latest affiliation date.

 

For user_id "B", I could do Select user_id, MAX(start_date), end_date where employer = 'Micro',

but that would fail to get record for user_id "A".

 

If multiple records exist, I want to do MAX, but if only a single record exists, I don't need MAX.

How do I do that? 

From the above data, I would like to see only two lines:

| A      | Micro    | 201704     | 201903    |

| B      | Micro    | 201706     | -         |

Thank you.


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

Re: A complex SQL query

From
Jānis Pūris
Date:
Would something like this work for you ? http://www.sqlfiddle.com/#!17/2e45eb/9

select
    user_id,
    max(start_date) as start_date,
    case when max(end_date) < max(start_date) then null else max(end_date) end as end_date
from
    employment
where
    employer = 'Micro'
group by
    user_id
;
On 6 Sep 2019, 14:31 +0200, jj08 <jj08@drivehq.com>, wrote:

 I hope someone can give me some pointers.

Here is my table.

+--------+----------+------------+-----------+

| usr_id | employer | start_date | end_date  |

+--------+----------+------------+-----------+

| A      | Goo      | 201904     | -         |

| A      | Micro    | 201704     | 201903    |

 

| B      | Micro    | 201706     | -         |

| B      | Goo      | 201012     | 201705    |

| B      | Micro    | 201001     | 201011    |

+--------+----------+------------+-----------+

 

I am trying to list up people working for a company called "Micro".

Some people work for a company multiple times, like user B.

I only need one line per user, displaying only the latest affiliation date.

 

For user_id "B", I could do Select user_id, MAX(start_date), end_date where employer = 'Micro',

but that would fail to get record for user_id "A".

 

If multiple records exist, I want to do MAX, but if only a single record exists, I don't need MAX.

How do I do that? 

From the above data, I would like to see only two lines:

| A      | Micro    | 201704     | 201903    |

| B      | Micro    | 201706     | -         |

Thank you.


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

RE:Re: A complex SQL query

From
"jj08"
Date:

Yes, that's exactly what works for me.

Thanks a bunch!!!

 

--From: janis@puris.lv
--To: pgsql-sql@lists.postgresql.org,jj08@drivehq.com
--Date: 9/6/2019 5:56:10 AM
--Subject: Re: A complex SQL query

Would something like this work for you ? http://www.sqlfiddle.com/#!17/2e45eb/9
 
select
    user_id,
    max(start_date) as start_date,
    case when max(end_date) < max(start_date) then null else max(end_date) end as end_date
from
    employment
where
    employer = 'Micro'
group by
    user_id
;
On 6 Sep 2019, 14:31 +0200, jj08 , wrote:

 I hope someone can give me some pointers.

Here is my table.

+--------+----------+------------+-----------+

| usr_id | employer | start_date | end_date  |

+--------+----------+------------+-----------+

| A      | Goo      | 201904     | -         |

| A      | Micro    | 201704     | 201903    |

 

| B      | Micro    | 201706     | -         |

| B      | Goo      | 201012     | 201705    |

| B      | Micro    | 201001     | 201011    |

+--------+----------+------------+-----------+

 

I am trying to list up people working for a company called "Micro".

Some people work for a company multiple times, like user B.

I only need one line per user, displaying only the latest affiliation date.

 

For user_id "B", I could do Select user_id, MAX(start_date), end_date where employer = 'Micro',

but that would fail to get record for user_id "A".

 

If multiple records exist, I want to do MAX, but if only a single record exists, I don't need MAX.

How do I do that? 

From the above data, I would like to see only two lines:

| A      | Micro    | 201704     | 201903    |

| B      | Micro    | 201706     | -         |

Thank you.


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

RE:RE:Re: A complex SQL query

From
"jj08"
Date:

To Janis or other members of the forum:

My sample table contains other columns as well.

One of the columns that I did not include in my first message is "position".
The last table has been modified as follows:

+--------+----------+------------+------------------------+
| usr_id | employer | position   + start_date | end_date  |
+--------+----------+------------+------------------------+
| A      | Goo      | CTO        |            | 201904    |        
| A      | Micro    | Mgr        | 201704     | 201903    |

| B      | Micro    | Engg_Mgr   | 201706     | -         |
| B      | Goo      | Researcher | 201012     | 201705    |
| B      | Micro    | Postdoc    | 201001     | 201011    |
+--------+----------+------------+------------------------+

Explanation:
B joined Micro as a postdoc in 201001. After 11 months (201011), he quit and moved to Goo as a researcher.
After about 6.5 years, he came back to Micro, this time, as an engineering manager.

Problem:
I want to retrieve position as well, but I cannot add 'position' to the select part.
//----------------------------------------------------------------------------------
select
   user_id,position,
   max(position_start) as position_start,
case when max(position_end) < max(position_start) then null else max(position_end) end as position_end
            from
                employer
            where
                employer= 'Micro'
            group by
                user_id;
           
//-------------------------------------

This results in error:

"ERROR:  column "employer.position" must appear in the GROUP BY clause or be used in an aggregate function


When I add position to the group by clause, like this:
//----------------------------------------------------------------------------------
select
   user_id,position,
   max(position_start) as position_start,
case when max(position_end) < max(position_start) then null else max(position_end) end as position_end
            from
                employer
            where
                employer= 'Micro'
            group by
                user_id, position;
           
//-------------------------------------

I get two records for B, both when he was a postdoc back in 201001, and Engg_Mgr (201706).
I only need the latest/single entry (the whole point I wanted to do MAX(start_date).

How could I do that?

Thank you.

 

 

 

--From: pgsql-sql-owner+M68182-176663@lists.postgresql.org
--To: jj08@drivehq.com,pgsql-sql@lists.postgresql.org,janis@puris.lv
--Date: 9/6/2019 7:35:30 AM
--Subject: RE:Re: A complex SQL query

Yes, that's exactly what works for me.

Thanks a bunch!!!

 

--From: janis@puris.lv
--To: pgsql-sql@lists.postgresql.org,jj08@drivehq.com
--Date: 9/6/2019 5:56:10 AM
--Subject: Re: A complex SQL query

Would something like this work for you ? http://www.sqlfiddle.com/#!17/2e45eb/9
 
select
    user_id,
    max(start_date) as start_date,
    case when max(end_date) < max(start_date) then null else max(end_date) end as end_date
from
    employment
where
    employer = 'Micro'
group by
    user_id
;
On 6 Sep 2019, 14:31 +0200, jj08


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

Re: RE:Re: A complex SQL query

From
Victor Yegorov
Date:
сб, 7 сент. 2019 г. в 21:23, jj08 <jj08@drivehq.com>:

To Janis or other members of the forum:

My sample table contains other columns as well.

One of the columns that I did not include in my first message is "position".
The last table has been modified as follows:

+--------+----------+------------+------------------------+
| usr_id | employer | position   + start_date | end_date  |
+--------+----------+------------+------------------------+
| A      | Goo      | CTO        |            | 201904    |        
| A      | Micro    | Mgr        | 201704     | 201903    |

| B      | Micro    | Engg_Mgr   | 201706     | -         |
| B      | Goo      | Researcher | 201012     | 201705    |
| B      | Micro    | Postdoc    | 201001     | 201011    |
+--------+----------+------------+------------------------+

SELECT DISTINCT ON (usr_id)
       *
  FROM employment
 WHERE employer='Micro'
 ORDER BY usr_id, start_date DESC;

--
Victor Yegorov

RE:Re: RE:Re: A complex SQL query

From
"jj08"
Date:

Hello Vicor!

It works!!!

Thank you very much.





--From: vyegorov@gmail.com
--To: jj08@drivehq.com
--CC: pgsql-sql@lists.postgresql.org
--Date: 9/7/2019 2:33:40 PM
--Subject: Re: RE:Re: A complex SQL query

сб, 7 сент. 2019 г. в 21:23, jj08 <jj08@drivehq.com>:

To Janis or other members of the forum:

My sample table contains other columns as well.

One of the columns that I did not include in my first message is "position".
The last table has been modified as follows:

+--------+----------+------------+------------------------+
| usr_id | employer | position   + start_date | end_date  |
+--------+----------+------------+------------------------+
| A      | Goo      | CTO        |            | 201904    |        
| A      | Micro    | Mgr        | 201704     | 201903    |

| B      | Micro    | Engg_Mgr   | 201706     | -         |
| B      | Goo      | Researcher | 201012     | 201705    |
| B      | Micro    | Postdoc    | 201001     | 201011    |
+--------+----------+------------+------------------------+

SELECT DISTINCT ON (usr_id)
       *
  FROM employment
 WHERE employer='Micro'
 ORDER BY usr_id, start_date DESC;
 
--
Victor Yegorov

-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

Re: RE:Re: RE:Re: A complex SQL query

From
v_kalees
Date:
This is much cleaner way

select * from (
select usr_id,
       employer,
       start_date,
       end_date,
       rank() OVER (PARTITION BY employer ORDER BY start_date DESC) AS rnk
from <tablename> ) t
where t.rnk=1;

Thanks
On Sunday, September 8, 2019, 7:40:05 AM GMT+5:30, jj08 <jj08@drivehq.com> wrote:


Hello Vicor!

It works!!!

Thank you very much.





--From: vyegorov@gmail.com
--To: jj08@drivehq.com
--CC: pgsql-sql@lists.postgresql.org
--Date: 9/7/2019 2:33:40 PM
--Subject: Re: RE:Re: A complex SQL query

сб, 7 сент. 2019 г. в 21:23, jj08 <jj08@drivehq.com>:

To Janis or other members of the forum:

My sample table contains other columns as well.

One of the columns that I did not include in my first message is "position".
The last table has been modified as follows:

+--------+----------+------------+------------------------+
| usr_id | employer | position   + start_date | end_date  |
+--------+----------+------------+------------------------+
| A      | Goo      | CTO        |            | 201904    |        
| A      | Micro    | Mgr        | 201704     | 201903    |

| B      | Micro    | Engg_Mgr   | 201706     | -         |
| B      | Goo      | Researcher | 201012     | 201705    |
| B      | Micro    | Postdoc    | 201001     | 201011    |
+--------+----------+------------+------------------------+

SELECT DISTINCT ON (usr_id)
       *
  FROM employment
 WHERE employer='Micro'
 ORDER BY usr_id, start_date DESC;
 
--
Victor Yegorov

-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

RE:Re: RE:Re: RE:Re: A complex SQL query

From
"jj08"
Date:

v_kalees:

Thank you for your suggestion.

I tried your code at http://www.sqlfiddle.com/#!17/4e386/4/1

but couldn't make it work. (I wanted to see two rows, but your query returns only one row.)

My original question was to find people working for company "Micro", so I took liberty to add "where employer='Micro'" condition to your solution.

May be I put the where clause at a wrong place?

Anyway, for me it is getting more complex :)

Victor's solution is producing 2 rows as I wanted, so for now a crisis has been averted!

 

From: v_kalees@yahoo.com
--To: vyegorov@gmail.com,jj08@drivehq.com
--CC: pgsql-sql@lists.postgresql.org
--Date: 9/7/2019 8:20:57 PM
--Subject: Re: RE:Re: RE:Re: A complex SQL query

 
This is much cleaner way
 
select * from (
select usr_id,
       employer,
       start_date,
       end_date,
       rank() OVER (PARTITION BY employer ORDER BY start_date DESC) AS rnk
from ) t
where t.rnk=1;
 
Thanks
On Sunday, September 8, 2019, 7:40:05 AM GMT+5:30, jj08 wrote:
 
 

Hello Vicor!

It works!!!

Thank you very much.





--From: vyegorov@gmail.com
--To: jj08@drivehq.com
--CC: pgsql-sql@lists.postgresql.org
--Date: 9/7/2019 2:33:40 PM
--Subject: Re: RE:Re: A complex SQL query

сб, 7 сент. 2019 г. в 21:23, jj08 <jj08@drivehq.com>:

To Janis or other members of the forum:

My sample table contains other columns as well.

One of the columns that I did not include in my first message is "position".
The last table has been modified as follows:

+--------+----------+------------+------------------------+
| usr_id | employer | position   + start_date | end_date  |
+--------+----------+------------+------------------------+
| A      | Goo      | CTO        |            | 201904    |        
| A      | Micro    | Mgr        | 201704     | 201903    |

| B      | Micro    | Engg_Mgr   | 201706     | -         |
| B      | Goo      | Researcher | 201012     | 201705    |
| B      | Micro    | Postdoc    | 201001     | 201011    |
+--------+----------+------------+------------------------+

SELECT DISTINCT ON (usr_id)
       *
  FROM employment
 WHERE employer='Micro'
 ORDER BY usr_id, start_date DESC;
 
--
Victor Yegorov

-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.

 


-------------------------
Online Storage & Sharing, Online Backup, FTP / Email Server Hosting and More.
Drive Headquarters. Top quality services designed for business!
Sign up free at: www.DriveHQ.com.