Re: getting the ranks out of items with SHARED - Mailing list pgsql-general
From | Tom Lane |
---|---|
Subject | Re: getting the ranks out of items with SHARED |
Date | |
Msg-id | 20658.1121360033@sss.pgh.pa.us Whole thread Raw |
In response to | getting the ranks out of items with SHARED (Janning Vygen <vygen@gmx.de>) |
List | pgsql-general |
Janning Vygen <vygen@planwerk6.de> writes: > and this is the combined statement: > UPDATE temp_gc > SET gc_rank = ranking.rank > FROM ( > SELECT > *, > ranking(r1.gc_gesamtpunkte, r1.gc_gesamtsiege) AS rank > FROM ( > SELECT > mg_name, > gc_gesamtpunkte, > gc_gesamtsiege > FROM temp_gc > ORDER BY gc_gesamtpunkte DESC, gc_gesamtsiege DESC, mg_name ASC > ) AS r1 > ) AS ranking > WHERE temp_gc.mg_name = ranking.mg_name; > to me it looks like the call to ranking() is moved from the subquery plan > "upwards". but i really can't interpret this explain output regarding to > "where" the ranking funcion is called. Yeah, I think you are exactly right. The EXPLAIN output doesn't show targetlist contents (EXPLAIN VERBOSE would, but not very readably :-(), but the small difference in the estimated costs seems to indicate that the function evaluation is not happening at the level of the "subquery scan" plan node in the combined plan. This is consistent with the planner's behavior in general. It will honor the "order by" in the sense that the output of the "r1" subquery is delivered to the upper plan level in that order, but it feels no compunction about flattening the "ranking" subquery into the outer query, and after that you cannot be certain about the order in which the executions of the ranking() function happen. What you need is to prevent the flattening of the intermediate subquery. The current favorite technique is to insert a dummy OFFSET clause: UPDATE temp_gc SET gc_rank = ranking.rank FROM ( SELECT *, ranking(r1.gc_gesamtpunkte, r1.gc_gesamtsiege) AS rank FROM ( SELECT mg_name, gc_gesamtpunkte, gc_gesamtsiege FROM temp_gc ORDER BY gc_gesamtpunkte DESC, gc_gesamtsiege DESC, mg_name ASC ) AS r1 OFFSET 0 ) AS ranking WHERE temp_gc.mg_name = ranking.mg_name; although you could also do it by inserting an ORDER BY at that level. BTW, there's been some discussion of preventing flattening of subqueries whose SELECT lists contain volatile functions. If we did that then declaring ranking() as volatile would be enough to avoid the problem. I've been hesitant to make the change because I'm concerned about the probable loss of optimization in cases where the function is labeled volatile by default, merely because the author didn't think about how to mark it. And in any case "volatile" doesn't really describe the issue with your function... > my problem is that getting a rank out of items is very expensive with > aggregate functions, I'm unconvinced --- I don't see a reason that you can't implement it exactly the same way as an aggregate. The only difference is that the state value is an aggregate state value instead of a global variable. There might be some extra palloc overhead, but nothing worse. regards, tom lane
pgsql-general by date: