Re: first_value/last_value - Mailing list pgsql-general

From Alexey Bashtanov
Subject Re: first_value/last_value
Date
Msg-id 573D9F39.4040608@imap.cc
Whole thread Raw
In response to Re: first_value/last_value  (Tom Smith <tomsmith1989sk@gmail.com>)
Responses Re: first_value/last_value
List pgsql-general
Hello Tom,

I agree such functions are very useful, as they allow you to use ordinary aggregation functions such as sum/max/avg
along with first/last ones (traditionally served by DISTINCT ON or LIMIT) in the same group-by node
which improves performance and readability.

The first/last extension you mentioned http://pgxn.org/dist/first_last_agg/ has a slight disadvantage:
it relies on postgres providing it sorted data if user asks for defined ordering.
This makes first/last functions not supporting hashAgg, only groupAgg which usually requires explicit sort.

If you find first_last_agg performance poor take a look at argm extension http://pgxn.org/dist/argm/ .
It provides similar functionality but works faster as it does not sort or make postgres core sort,
but only chooses the first row within each group.

As for the original question unfortunately the way from an extension to postgres core is not too easy
and normally requires an extension to become popular and to be included in postgres distribution as a contrib module first.

Regards,
  Alexey Bashtanov

On 19/05/16 04:04, Tom Smith wrote:
It would really save all the troubles for many people if postgresql has a built-in first/last function  along with sum/avg.
There is already a C extension and a wiki sample  and  implemented for window function.
I am curious why these two functions were not added along  their window implementation counter part,
for completness and consistency


On Wed, May 18, 2016 at 10:42 PM, Melvin Davidson <melvin6925@gmail.com> wrote:


On Wed, May 18, 2016 at 10:36 PM, Adam Brusselback <adambrusselback@gmail.com> wrote:
Here is an example that works in a single query.  Since you have two different orders you want the data back in, you need to use subqueries to get the proper data back, but it works, and is very fast.

CREATE TEMPORARY TABLE foo AS 
SELECT generate_series as bar
FROM generate_series(1, 1000000);

CREATE INDEX idx_foo_bar ON foo (bar);


SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar asc
LIMIT 1
) x
UNION ALL 
SELECT *
FROM (
SELECT bar
FROM foo
ORDER BY bar desc
LIMIT 1
) y;

DROP TABLE foo;

Seems to me SELECT min(<column>),  max(<column>) FROM deja.vu ; would also work.


--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.



pgsql-general by date:

Previous
From: Matija Lesar
Date:
Subject: Re: first_value/last_value
Next
From: Leonardo M. Ramé
Date:
Subject: Re: Londiste 3 pgq events_1_1 table huge