Re: [SQL] Please advice on query optimization - Mailing list pgsql-sql

From Samed YILDIRIM
Subject Re: [SQL] Please advice on query optimization
Date
Msg-id 485421492716587@web5m.yandex.ru
Whole thread Raw
In response to Re: [SQL] Please advice on query optimization  (Muhannad Shubita <muhannadshubita@gmail.com>)
List pgsql-sql
Hi Muhannad,
 
first_value and last_value functions can be used for these purpose. But query should be rewrited as below. Also for this case I'm not sure which one is better, sub query or window functions.
 
SELECT DISTINCT employee_id, day, sum(total_hours) over w1, min(sign_in) OVER w1 as first_sign_in, max(sign_in) OVER w1 as last_sign_in, first_value(device_id) OVER w2 as first_sign_in_device, first_value(device_id) OVER w3 as last_sign_in_device
FROM bioemployee
WHERE sign_in BETWEEN X and Y
WINDOW w1 as (PARTITION BY employee_id,day),w2 as (PARTITION BY employee_id,day ORDER BY sign_in ASC),w3 as (PARTITION BY employee_id,day ORDER BY sign_in DESC)
ORDER BY day;
 
Best regards.
 
İyi çalışmalar.
Samed YILDIRIM
 
20.04.2017, 20:35, "Muhannad Shubita" <muhannadshubita@gmail.com>:
Thanks Samed,
 
one more question, what if I had other columns that cannot be used with an aggregate function (TEXT-based for example) but still needed to be paired with the first & last sign in? for instance, a randomly generated md5-ed ID by the sign in device that needs to be displayed as a reference:
 
Day  Employee Total-Hours  First-Sign-In   device-id-of-first-sign-in Last-Sign-In    device-id-of-last-sign-in
 
20/4 emp1 4 8:22 202cb962ac5.. 3:25                     152d234b70..
 
 
On Thu, Apr 20, 2017 at 9:18 PM, Samed YILDIRIM <samed@reddoc.net> wrote:
Hi Muhannad,
 
Did you try using MIN and MAX function? I guess that following query solves your problem.
 
select employee_id, day, sum(total_hours) as total_hours, (select name from employee where id = employee_id) as emp_name, min(sign_in) as first_sign_in, max(sign_in) as last_sign_in
from bioemployee where sign_in BETWEEN X and Y GROUP BY employee_id, day ORDER BY day;
 
Best regards.
 
 
 
 
İyi çalışmalar.
Samed YILDIRIM
 
 
 
20.04.2017, 19:56, "Muhannad Shubita" <muhannadshubita@gmail.com>:
Good day,
 
I have a table with biometric info about employees (let's call it bioemployee):
 
id serial not null -- primary key
employee_id  -- foreign key references employee table
day char(8) -- YY-MM-DD
sign_in TIMESTAMP
total_hours INTEGER
 
I want to display details grouped by day & employee Id, for example:
 
 
Day Employee Total-Hours First-Sign-In Last-Sign-In
 
20/4 emp1 4 8:22 3:25
21/4 emp1 7 9:00 4:11
21/4 emp2 2 11:00 01:11
 
 
I created a pgsql function to get the details through the below query:
 
select employee_id, day, sum(total_hours) as total_hours, (select name from employee where id = employee_id) as emp_name
from bioemployee where sign_in BETWEEN X and Y GROUP BY employee_id, day ORDER BY day;
 
now the problem is with getting First-Sign-In  &  Last-Sign-In per group (employee & day), I have currently implemented it in a FOR loop:
 
for RECORD in query LOOP
--First-Sign-In
select sign_in from bioemployee where employee_id = -- and day = -- and sign_in BETWEEN X and Y ORDER BY sign_in LIMIT 1;
--Last-Sign-In
select sign_in from bioemployee where employee_id = -- and day = -- and sign_in BETWEEN X and Y ORDER BY sign_in DESC LIMIT 1;
 
return next json_build_object(first_sign_in, last_sign_in, ..rest of details);
END LOOP
 
but If I had 100 employees over a span of 30 days, this would be 6000 queries inside the loop! which I am sure you would agree is an overkill
 
 
is there a better way to do this? 
 
 
Thanks.
 
 
--
Regards,
Muhannad

pgsql-sql by date:

Previous
From: Muhannad Shubita
Date:
Subject: Re: [SQL] Please advice on query optimization
Next
From: "David G. Johnston"
Date:
Subject: Re: [SQL] Please advice on query optimization