Thread: Select the max on a field
Hi all, Suppose that I have a table like this: att_1 | att_2 | att_3 | att_4 -------------------------------- 1 | a | y | y1 2 | b | y | y2 3 | a | xx | y3 4 | c | zz | y4 5 | a | t | y5 6 | c | x | y6 I want obtain all row that have for each att_2 the max of att_1 some like SELECT * FROM MY_TABLE GROUP BY att_2 HAVING att_1 = max ( id_user_log) for obtain: att_1 | att_2 | att_3 | att_4 -------------------------------- 2 | b | y | y2 5 | a | t | y5 6 | c | x | y6 Ciao Gaetano
"Gaetano Mendola" <mendola@bigfoot.com> wrote in message news:alq3mr$2s7o$1@news.hub.org... > Hi all, > > Suppose that I have a table like this: > > > att_1 | att_2 | att_3 | att_4 > -------------------------------- > 1 | a | y | y1 > 2 | b | y | y2 > 3 | a | xx | y3 > 4 | c | zz | y4 > 5 | a | t | y5 > 6 | c | x | y6 > > > > I want obtain all row that have for each att_2 > the max of att_1 I forgot to say: "with a single total query without sub select " Ciao Gaetano
On Thu, 12 Sep 2002, Gaetano Mendola wrote: > Hi all, > > Suppose that I have a table like this: > > > att_1 | att_2 | att_3 | att_4 > -------------------------------- > 1 | a | y | y1 > 2 | b | y | y2 > 3 | a | xx | y3 > 4 | c | zz | y4 > 5 | a | t | y5 > 6 | c | x | y6 > > > > I want obtain all row that have for each att_2 > the max of att_1 > > some like > > SELECT * > FROM MY_TABLE > GROUP BY att_2 > HAVING att_1 = max ( id_user_log) If you don't mind using postgres extensions and don't need a particular att_1 ordering: select distinct on (att_2) * from test order by att_2, att_1 desc; Otherwise I'm not sure you can do it without a subselect somewhere.
--- Gaetano Mendola <mendola@bigfoot.com> wrote: > > "Gaetano Mendola" <mendola@bigfoot.com> wrote in > message > news:alq3mr$2s7o$1@news.hub.org... > > Hi all, > > > > Suppose that I have a table like this: > > > > > > att_1 | att_2 | att_3 | att_4 > > -------------------------------- > > 1 | a | y | y1 > > 2 | b | y | y2 > > 3 | a | xx | y3 > > 4 | c | zz | y4 > > 5 | a | t | y5 > > 6 | c | x | y6 > > > > > > > > I want obtain all row that have for each att_2 > > the max of att_1 > > I forgot to say: "with a single total query without > sub select " > If you don't mind a non-portable feature, DISTINCT ON should do what you want. Something like: SELECT DISTINCT ON (att_2) att_1, att_2, att_3, att_4 FROM table ORDER BY att_2, att_1 DESC; __________________________________________________ Do you Yahoo!? Yahoo! News - Today's headlines http://news.yahoo.com
Performance inside and outside view ( WAS Re: Select the max on a field )
From
"Gaetano Mendola"
Date:
"Stephan Szabo" <sszabo@megazone23.bigpanda.com> wrote in message news:20020912065516.L45340-100000@megazone23.bigpanda.com... > select distinct on (att_2) * from test > order by att_2, att_1 desc; Yes that's right it's help me, but here the optimizer have some problems: CREATE VIEW last_user_logs AS SELECT DISTINCT ON (id_user) * FROM user_logs ORDER BY id_user, id_user_log DESC; and now: # explain analyze select * from user_logs where id_user = '5430'; NOTICE: QUERY PLAN: Index Scan using idx_user_user_logs on user_logs (cost=0.00..3038.59 rows=939 width=68) (actual time=0.03..0.03 rows=0 loops=1) Total runtime: 0.12 msec EXPLAIN empdb=# explain analyze select * from last_user_logs where id_user = 5430; NOTICE: QUERY PLAN: Subquery Scan last_user_logs (cost=20256.12..20725.43 rows=18772 width=68) (actual time=3526.10..3526.10 rows=0 loops=1) -> Unique (cost=20256.12..20725.43 rows=18772 width=68) (actual time=3067.14..3522.54 rows=2226 loops=1) -> Sort (cost=20256.12..20256.12 rows=187723 width=68) (actual time=3067.13..3241.94 rows=187723 loops=1) -> Seq Scan on user_logs (cost=0.00..3813.23 rows=187723 width=68) (actual time=0.02..1070.59 rows=187723 loops=1) Total runtime: 3578.07 msec EXPLAIN if I do instead without view and #explain analyze SELECT DISTINCT ON (id_user) * FROM user_logsWHERE id_user = 5430ORDER BY id_user, id_user_log DESC; NOTICE: QUERY PLAN: Unique (cost=3084.93..3087.27 rows=94 width=68) (actual time=0.21..0.21 rows=0 loops=1) -> Sort (cost=3084.93..3084.93 rows=939 width=68) (actual time=0.21..0.21 rows=0 loops=1) -> Index Scan using idx_user_user_logs on user_logs (cost=0.00..3038.59 rows=939 width=68) (actual time=0.03..0.03 rows=0 loops=1) Total runtime: 0.31 msec EXPLAIN Ciao Gaetano
"Gaetano Mendola" <mendola@bigfoot.com> writes: > Yes that's right it's help me, > but here the optimizer have some problems: What's the datatype of id_user, and why are you quoting the compared constant in some cases but not others? I don't think the difference has anything to do with the view here ... regards, tom lane
On Thu, 12 Sep 2002, Tom Lane wrote: > "Gaetano Mendola" <mendola@bigfoot.com> writes: > > Yes that's right it's help me, > > but here the optimizer have some problems: > > What's the datatype of id_user, and why are you quoting the compared > constant in some cases but not others? I don't think the difference > has anything to do with the view here ... At least in current sources it would, since it can't push down the clause (unless I'm misreading subquery_is_pushdown_safe and the comment) through the distinct on.
Re: Performance inside and outside view ( WAS Re: Select the max on a field )
From
"Gaetano Mendola"
Date:
"Tom Lane" <tgl@sss.pgh.pa.us> wrote in message news:7350.1031842223@sss.pgh.pa.us... > "Gaetano Mendola" <mendola@bigfoot.com> writes: > > Yes that's right it's help me, > > but here the optimizer have some problems: > > What's the datatype of id_user, and why are you quoting the compared > constant in some cases but not others? I don't think the difference > has anything to do with the view here ... id_user is an INTEGER type. Well what about this two selects: a) SELECT DISTINCT ON (id_user) * FROM user_logs WHERE id_user = 5430 ORDER BY id_user, id_user_log DESC; b) CREATE VIEW last_user_logs AS SELECT DISTINCT ON (id_user) * FROM user_logs ORDER BY id_user, id_user_log DESC; SELECT * FROM last_user_logs WHERE id_user = 5430 I think that in the second case the filter is not pushed inside the view. Ciao Gaetano
On Thu, 12 Sep 2002, Gaetano Mendola wrote: > > "Tom Lane" <tgl@sss.pgh.pa.us> wrote in message > news:7350.1031842223@sss.pgh.pa.us... > > "Gaetano Mendola" <mendola@bigfoot.com> writes: > > > Yes that's right it's help me, > > > but here the optimizer have some problems: > > > > What's the datatype of id_user, and why are you quoting the compared > > constant in some cases but not others? I don't think the difference > > has anything to do with the view here ... > > id_user is an INTEGER type. > > Well what about this two selects: > > a) SELECT DISTINCT ON (id_user) * > FROM user_logs > WHERE id_user = 5430 > ORDER BY id_user, id_user_log DESC; > > b) CREATE VIEW last_user_logs AS > SELECT DISTINCT ON (id_user) * > FROM user_logs > ORDER BY id_user, id_user_log DESC; > > SELECT * > FROM last_user_logs > WHERE id_user = 5430 > > > I think that in the second case the filter is not pushed inside the view. Pushing into distinct on has limited cases where it's legal, you're using one of them, but the system doesn't differentiate and doesn't push in either. select distinct on (col1) * from table where col2=2; andselect * from (select distinct on (col1) * from table) v where col2=2; -- this is like a view case are not the same query. I believe Tom makes a comment to the effect that it should be safe if the clause only references distinct on'd columns.
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes: > On Thu, 12 Sep 2002, Tom Lane wrote: >> What's the datatype of id_user, and why are you quoting the compared >> constant in some cases but not others? I don't think the difference >> has anything to do with the view here ... > At least in current sources it would, since it can't push down the clause > (unless I'm misreading subquery_is_pushdown_safe and the comment) through > the distinct on. Oh, duh, you're right; I'd missed the DISTINCT ON ... regards, tom lane