Thread: Select the max on a field

Select the max on a field

From

"Gaetano Mendola"

Date:

12 September 2002, 09:09:51

Hi all,

Suppose that I have a table like this:


att_1  |    att_2 |    att_3 |  att_4
--------------------------------
1       |      a      |    y       |    y1
2       |      b      |    y       |    y2
3       |      a      |    xx     |    y3
4       |      c      |    zz      |    y4
5       |      a      |    t        |    y5
6       |      c      |    x       |    y6



I want obtain all row that have for each att_2
the max of att_1

some like

SELECT *
FROM  MY_TABLE
GROUP BY att_2
HAVING att_1 = max ( id_user_log)


for obtain:

att_1  |    att_2 |    att_3 |  att_4
--------------------------------
2       |      b      |    y       |    y2
5       |      a      |    t        |    y5
6       |      c      |    x       |    y6

Ciao
Gaetano

Re: Select the max on a field

From

"Gaetano Mendola"

Date:

12 September 2002, 09:15:55

"Gaetano Mendola" <mendola@bigfoot.com> wrote in message
news:alq3mr$2s7o$1@news.hub.org...
> Hi all,
>
> Suppose that I have a table like this:
>
>
> att_1  |    att_2 |    att_3 |  att_4
> --------------------------------
> 1       |      a      |    y       |    y1
> 2       |      b      |    y       |    y2
> 3       |      a      |    xx     |    y3
> 4       |      c      |    zz      |    y4
> 5       |      a      |    t        |    y5
> 6       |      c      |    x       |    y6
>
>
>
> I want obtain all row that have for each att_2
> the max of att_1

I forgot to say: "with a single total query without sub select "


Ciao
Gaetano

Re: Select the max on a field

From

Stephan Szabo

Date:

12 September 2002, 10:02:08

On Thu, 12 Sep 2002, Gaetano Mendola wrote:

> Hi all,
>
> Suppose that I have a table like this:
>
>
> att_1  |    att_2 |    att_3 |  att_4
> --------------------------------
> 1       |      a      |    y       |    y1
> 2       |      b      |    y       |    y2
> 3       |      a      |    xx     |    y3
> 4       |      c      |    zz      |    y4
> 5       |      a      |    t        |    y5
> 6       |      c      |    x       |    y6
>
>
>
> I want obtain all row that have for each att_2
> the max of att_1
>
> some like
>
> SELECT *
> FROM  MY_TABLE
> GROUP BY att_2
> HAVING att_1 = max ( id_user_log)

If you don't mind using postgres extensions and
don't need a particular att_1 ordering:

select distinct on (att_2) * from test
order by att_2, att_1 desc;

Otherwise I'm not sure you can do it without
a subselect somewhere.

Re: Select the max on a field

From

Jeff Eckermann

Date:

12 September 2002, 10:03:43

--- Gaetano Mendola <mendola@bigfoot.com> wrote:
> 
> "Gaetano Mendola" <mendola@bigfoot.com> wrote in
> message
> news:alq3mr$2s7o$1@news.hub.org...
> > Hi all,
> >
> > Suppose that I have a table like this:
> >
> >
> > att_1  |    att_2 |    att_3 |  att_4
> > --------------------------------
> > 1       |      a      |    y       |    y1
> > 2       |      b      |    y       |    y2
> > 3       |      a      |    xx     |    y3
> > 4       |      c      |    zz      |    y4
> > 5       |      a      |    t        |    y5
> > 6       |      c      |    x       |    y6
> >
> >
> >
> > I want obtain all row that have for each att_2
> > the max of att_1
> 
> I forgot to say: "with a single total query without
> sub select "
> 
If you don't mind a non-portable feature, DISTINCT ON
should do what you want.  Something like:

SELECT DISTINCT ON (att_2) att_1, att_2, att_3, att_4
FROM table
ORDER BY att_2, att_1 DESC;

__________________________________________________
Do you Yahoo!?
Yahoo! News - Today's headlines
http://news.yahoo.com

Performance inside and outside view ( WAS Re: Select the max on a field )

From

"Gaetano Mendola"

Date:

12 September 2002, 10:31:05

"Stephan Szabo" <sszabo@megazone23.bigpanda.com> wrote in message
news:20020912065516.L45340-100000@megazone23.bigpanda.com...
> select distinct on (att_2) * from test
> order by att_2, att_1 desc;

Yes that's right it's help me,
but here the optimizer have some problems:


CREATE VIEW last_user_logs AS
SELECT DISTINCT ON (id_user) *
FROM user_logs
ORDER BY id_user, id_user_log DESC;

and now:

# explain analyze select * from user_logs where id_user = '5430';
NOTICE:  QUERY PLAN:

Index Scan using idx_user_user_logs on user_logs  (cost=0.00..3038.59
rows=939 width=68) (actual time=0.03..0.03 rows=0 loops=1)
Total runtime: 0.12 msec

EXPLAIN

empdb=# explain analyze select * from last_user_logs where id_user = 5430;
NOTICE:  QUERY PLAN:

Subquery Scan last_user_logs  (cost=20256.12..20725.43 rows=18772 width=68)
(actual time=3526.10..3526.10 rows=0 loops=1) ->  Unique  (cost=20256.12..20725.43 rows=18772 width=68) (actual
time=3067.14..3522.54 rows=2226 loops=1)       ->  Sort  (cost=20256.12..20256.12 rows=187723 width=68) (actual
time=3067.13..3241.94 rows=187723 loops=1)             ->  Seq Scan on user_logs  (cost=0.00..3813.23 rows=187723
width=68) (actual time=0.02..1070.59 rows=187723 loops=1)
Total runtime: 3578.07 msec

EXPLAIN


if I do instead without view and
#explain analyze SELECT DISTINCT ON (id_user) * FROM user_logsWHERE id_user = 5430ORDER BY id_user, id_user_log DESC;

NOTICE:  QUERY PLAN:

Unique  (cost=3084.93..3087.27 rows=94 width=68) (actual time=0.21..0.21
rows=0 loops=1) ->  Sort  (cost=3084.93..3084.93 rows=939 width=68) (actual
time=0.21..0.21 rows=0 loops=1)       ->  Index Scan using idx_user_user_logs on user_logs
(cost=0.00..3038.59 rows=939 width=68) (actual time=0.03..0.03 rows=0
loops=1)
Total runtime: 0.31 msec

EXPLAIN


Ciao
Gaetano

Re: Performance inside and outside view ( WAS Re: Select the max on a field )

From

Tom Lane

Date:

12 September 2002, 10:50:21

"Gaetano Mendola" <mendola@bigfoot.com> writes:
> Yes that's right it's help me,
> but here the optimizer have some problems:

What's the datatype of id_user, and why are you quoting the compared
constant in some cases but not others?  I don't think the difference
has anything to do with the view here ...
        regards, tom lane

Re: Performance inside and outside view ( WAS Re: Select the

From

Stephan Szabo

Date:

12 September 2002, 11:22:53

On Thu, 12 Sep 2002, Tom Lane wrote:

> "Gaetano Mendola" <mendola@bigfoot.com> writes:
> > Yes that's right it's help me,
> > but here the optimizer have some problems:
>
> What's the datatype of id_user, and why are you quoting the compared
> constant in some cases but not others?  I don't think the difference
> has anything to do with the view here ...

At least in current sources it would, since it can't push down the clause
(unless I'm misreading subquery_is_pushdown_safe and the comment) through
the distinct on.

Re: Performance inside and outside view ( WAS Re: Select the max on a field )

From

"Gaetano Mendola"

Date:

12 September 2002, 11:53:36

"Tom Lane" <tgl@sss.pgh.pa.us> wrote in message
news:7350.1031842223@sss.pgh.pa.us...
> "Gaetano Mendola" <mendola@bigfoot.com> writes:
> > Yes that's right it's help me,
> > but here the optimizer have some problems:
>
> What's the datatype of id_user, and why are you quoting the compared
> constant in some cases but not others?  I don't think the difference
> has anything to do with the view here ...

id_user is an INTEGER type.

Well what about this two selects:

a) SELECT DISTINCT ON (id_user) *   FROM user_logs  WHERE id_user = 5430  ORDER BY id_user, id_user_log DESC;

b) CREATE VIEW last_user_logs AS    SELECT DISTINCT ON (id_user) *    FROM user_logs    ORDER BY id_user, id_user_log
DESC;
    SELECT *    FROM last_user_logs    WHERE id_user = 5430


I think that in the second case the filter is not pushed inside the view.

Ciao
Gaetano

Re: Performance inside and outside view ( WAS Re: Select the

From

Stephan Szabo

Date:

12 September 2002, 12:05:28

On Thu, 12 Sep 2002, Gaetano Mendola wrote:

>
> "Tom Lane" <tgl@sss.pgh.pa.us> wrote in message
> news:7350.1031842223@sss.pgh.pa.us...
> > "Gaetano Mendola" <mendola@bigfoot.com> writes:
> > > Yes that's right it's help me,
> > > but here the optimizer have some problems:
> >
> > What's the datatype of id_user, and why are you quoting the compared
> > constant in some cases but not others?  I don't think the difference
> > has anything to do with the view here ...
>
> id_user is an INTEGER type.
>
> Well what about this two selects:
>
> a) SELECT DISTINCT ON (id_user) *
>     FROM user_logs
>    WHERE id_user = 5430
>    ORDER BY id_user, id_user_log DESC;
>
> b) CREATE VIEW last_user_logs AS
>      SELECT DISTINCT ON (id_user) *
>      FROM user_logs
>      ORDER BY id_user, id_user_log DESC;
>
>      SELECT *
>      FROM last_user_logs
>      WHERE id_user = 5430
>
>
> I think that in the second case the filter is not pushed inside the view.

Pushing into distinct on has limited cases where it's legal,
you're using one of them, but the system doesn't differentiate
and doesn't push in either.
select distinct on (col1) * from table where col2=2;
andselect * from (select distinct on (col1) * from table) v where col2=2; -- this is like a view case
are not the same query.

I believe Tom makes a comment to the effect that it should
be safe if the clause only references distinct on'd columns.

Re: Performance inside and outside view ( WAS Re: Select the

From

Tom Lane

Date:

12 September 2002, 14:57:20

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> On Thu, 12 Sep 2002, Tom Lane wrote:
>> What's the datatype of id_user, and why are you quoting the compared
>> constant in some cases but not others?  I don't think the difference
>> has anything to do with the view here ...

> At least in current sources it would, since it can't push down the clause
> (unless I'm misreading subquery_is_pushdown_safe and the comment) through
> the distinct on.

Oh, duh, you're right; I'd missed the DISTINCT ON ...
        regards, tom lane