Thread: retrieving primary key for row with MIN function

retrieving primary key for row with MIN function

From
Marcin Krol
Date:
Hello everyone,

I need to retrieve PK (r.id in the query) for row with
MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id
ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id    reservation_start_date    reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




Re: retrieving primary key for row with MIN function

From
Sean Davis
Date:


On Wed, Apr 29, 2009 at 8:37 AM, Marcin Krol <mrkafk@gmail.com> wrote:
Hello everyone,

I need to retrieve PK (r.id in the query) for row with
MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id

HAVING r.start_date=MIN(r.start_date);

Does that do what you need?
 

ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id reservation_start_date  reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: retrieving primary key for row with MIN function

From
Marcin Krol
Date:
Sean Davis wrote:
>     I need to retrieve PK (r.id <http://r.id> in the query) for row with
>     MIN(r.start_date), but with a twist: I need to select only one record,
>     the one with minimum date.
>
>     Doing it like this does not solve the problem:
>
>     SELECT h.id <http://h.id> AS host_id, MIN(r.start_date) AS
>     reservation_start_date,
>     r.id <http://r.id> AS reservation_id
>     FROM hosts h
>     LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id <http://h.id>
>     LEFT OUTER JOIN reservation r ON r.id <http://r.id> =
>     rh.reservation_id AND
>     (r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date,
>     '2010-04-29'::date)
>     GROUP BY h.id <http://h.id>, r.id <http://r.id>
>
>
> HAVING r.start_date=MIN(r.start_date);
>
> Does that do what you need?

Unfortunately, it doesn't: if I add HAVING
r.start_date=MIN(r.start_date), Postgres complains:

ERROR:  column "r.start_date" must appear in the GROUP BY clause or be
used in an aggregate function

Which means I have to add it to GROUP BY clause. Which means that
separate records for r.start_date (for many reservations of a given
host) appear:

     SELECT h.id AS host_id, MIN(r.start_date) AS
reservation_start_date, r.id AS reservation_id
     FROM hosts h
     LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
     LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
     (r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date,
'2010-04-29'::date)
     GROUP BY h.id, r.id, r.start_date
     HAVING r.start_date = MIN(r.start_date)
   ORDER BY h.id


host_id    reservation_start_date    reservation_id
344     2009-05-11            5
344     2009-05-18              6
344     2009-05-29              7


Regards,
mk

Re: retrieving primary key for row with MIN function

From
Adam Ruth
Date:
Without windowing functions, I'm not sure of a built in method. But you could create your own aggregate.

------------------------
create type top_id_type as (id int, date timestamp);

create or replace function top_id_state ( state top_id_type, id int, date timestamp) returns top_id_type as $$
declare
result top_id_type;
begin
if state is null or date < state.date then
result.id = id;
result.date = date;
else
result = state;
end if;
return result;
end
$$ language plpgsql;

create or replace function top_id_final (state top_id_type) returns int as $$
begin
return state.id;
end
$$ language plpgsql;
create aggregate top_id (int, timestamp) (
sfunc = top_id_state,
stype = top_id_type,
finalfunc = top_id_final
);
----------------------- 

This is my first ever user defined aggregate, so someone may be able to improve it.

------------------------- Usage

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
top_id(r.id, r.start_date) AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id
ORDER BY reservation_start_date ASC

-------------------------

On 29/04/2009, at 10:37 PM, Marcin Krol wrote:

Hello everyone,

I need to retrieve PK (r.id in the query) for row with
MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id
ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id reservation_start_date reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: retrieving primary key for row with MIN function

From
Adam Ruth
Date:
Out of curiosity I ran some quick and dirty tests. It looks like either can be as much as twice as fast as the other, depending on the data and query, so you'd want to runs some tests before picking a method.



On 30/04/2009, at 12:28 PM, Sean Davis wrote:



On Wed, Apr 29, 2009 at 6:20 PM, Adam Ruth <adamruth@mac.com> wrote:
Without windowing functions, I'm not sure of a built in method. But you could create your own aggregate.

Seems like you should be able to do this with a correlated subquery that does an order and limit 1 and not have to use a custom aggregate?

Sean
 

------------------------
create type top_id_type as (id int, date timestamp);

create or replace function top_id_state ( state top_id_type, id int, date timestamp) returns top_id_type as $$
declare
result top_id_type;
begin
if state is null or date < state.date then
result.id = id;
result.date = date;
else
result = state;
end if;
return result;
end
$$ language plpgsql;

create or replace function top_id_final (state top_id_type) returns int as $$
begin
return state.id;
end
$$ language plpgsql;
create aggregate top_id (int, timestamp) (
sfunc = top_id_state,
stype = top_id_type,
finalfunc = top_id_final
);
----------------------- 

This is my first ever user defined aggregate, so someone may be able to improve it.

------------------------- Usage

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
top_id(r.id, r.start_date) AS reservation_id

FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id
ORDER BY reservation_start_date ASC

-------------------------

On 29/04/2009, at 10:37 PM, Marcin Krol wrote:

Hello everyone,

I need to retrieve PK (r.id in the query) for row with

MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id
ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id reservation_start_date reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice



Re: retrieving primary key for row with MIN function

From
Jasen Betts
Date:
On 2009-04-29, Marcin Krol <mrkafk@gmail.com> wrote:
> Hello everyone,
>
> I need to retrieve PK (r.id in the query) for row with
> MIN(r.start_date), but with a twist: I need to select only one record,
> the one with minimum date.

use LIMIT.

eg:

SELECT
   h.id  AS host_id,
   r.start_date AS reservation_start_date,
   r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date,'2010-04-29'::date)
ORDER BY reservation_start_date ASC
LIMIT 1



Re: retrieving primary key for row with MIN function

From
daq
Date:
MK> Hello everyone,

MK> I need to retrieve PK (r.id in the query) for row with
MK> MIN(r.start_date), but with a twist: I need to select only one record,
MK> the one with minimum date.

MK> Doing it like this does not solve the problem:

MK> SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
MK> r.id AS reservation_id
MK> FROM hosts h
MK> LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
MK> LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
MK> (r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
MK> GROUP BY h.id, r.id
MK> ORDER BY reservation_start_date ASC

MK> I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
MK> select the id from the row with corresponding MIN(r.start_date), so it's
MK> useless, while GROUP BY r.id produces more than one row:

MK> host_id reservation_start_date  reservation_id
MK> 361     2009-05-11              38
MK> 361     2009-05-17              21

MK> I need to select only row with reservation_id = 38.

MK> I would rather not do subquery for every 'host' record, since there can
MK> be a lot of them...

MK> Regards,
MK> mk

I think, the key is:
SELECT id FROM reservation WHERE start_date=(íselect min(start_date)
FROM reservation);

You got the id. Use it well! :)

DAQ


Re: retrieving primary key for row with MIN function

From
Adam Ruth
Date:
He said he'd rather not do a subquery because of performance. Though, it may actually be faster than the aggregate.

On 30/04/2009, at 12:28 PM, Sean Davis wrote:



On Wed, Apr 29, 2009 at 6:20 PM, Adam Ruth <adamruth@mac.com> wrote:
Without windowing functions, I'm not sure of a built in method. But you could create your own aggregate.

Seems like you should be able to do this with a correlated subquery that does an order and limit 1 and not have to use a custom aggregate?

Sean
 

------------------------
create type top_id_type as (id int, date timestamp);

create or replace function top_id_state ( state top_id_type, id int, date timestamp) returns top_id_type as $$
declare
result top_id_type;
begin
if state is null or date < state.date then
result.id = id;
result.date = date;
else
result = state;
end if;
return result;
end
$$ language plpgsql;

create or replace function top_id_final (state top_id_type) returns int as $$
begin
return state.id;
end
$$ language plpgsql;
create aggregate top_id (int, timestamp) (
sfunc = top_id_state,
stype = top_id_type,
finalfunc = top_id_final
);
----------------------- 

This is my first ever user defined aggregate, so someone may be able to improve it.

------------------------- Usage

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
top_id(r.id, r.start_date) AS reservation_id

FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id
ORDER BY reservation_start_date ASC

-------------------------

On 29/04/2009, at 10:37 PM, Marcin Krol wrote:

Hello everyone,

I need to retrieve PK (r.id in the query) for row with

MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id
ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id reservation_start_date reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice



Re: retrieving primary key for row with MIN function

From
Sean Davis
Date:


On Wed, Apr 29, 2009 at 6:20 PM, Adam Ruth <adamruth@mac.com> wrote:
Without windowing functions, I'm not sure of a built in method. But you could create your own aggregate.

Seems like you should be able to do this with a correlated subquery that does an order and limit 1 and not have to use a custom aggregate?

Sean
 

------------------------
create type top_id_type as (id int, date timestamp);

create or replace function top_id_state ( state top_id_type, id int, date timestamp) returns top_id_type as $$
declare
result top_id_type;
begin
if state is null or date < state.date then
result.id = id;
result.date = date;
else
result = state;
end if;
return result;
end
$$ language plpgsql;

create or replace function top_id_final (state top_id_type) returns int as $$
begin
return state.id;
end
$$ language plpgsql;
create aggregate top_id (int, timestamp) (
sfunc = top_id_state,
stype = top_id_type,
finalfunc = top_id_final
);
----------------------- 

This is my first ever user defined aggregate, so someone may be able to improve it.

------------------------- Usage

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
top_id(r.id, r.start_date) AS reservation_id

FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id
ORDER BY reservation_start_date ASC

-------------------------

On 29/04/2009, at 10:37 PM, Marcin Krol wrote:

Hello everyone,

I need to retrieve PK (r.id in the query) for row with

MIN(r.start_date), but with a twist: I need to select only one record,
the one with minimum date.

Doing it like this does not solve the problem:

SELECT h.id AS host_id, MIN(r.start_date) AS reservation_start_date,
r.id AS reservation_id
FROM hosts h
LEFT OUTER JOIN reservation_hosts rh ON rh.host_id = h.id
LEFT OUTER JOIN reservation r ON r.id = rh.reservation_id AND
(r.start_date, r.end_date) OVERLAPS ('2009-04-29'::date, '2010-04-29'::date)
GROUP BY h.id, r.id
ORDER BY reservation_start_date ASC

I have to use either GROUP BY r.id or use MIN(r.id). MIN(r.id) doesn't
select the id from the row with corresponding MIN(r.start_date), so it's
useless, while GROUP BY r.id produces more than one row:

host_id reservation_start_date reservation_id
361     2009-05-11              38
361     2009-05-17              21

I need to select only row with reservation_id = 38.

I would rather not do subquery for every 'host' record, since there can
be a lot of them...

Regards,
mk




--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice