Re: SELECT LIMIT 1 VIEW Performance Issue - Mailing list pgsql-performance
From | K C Lau |
---|---|
Subject | Re: SELECT LIMIT 1 VIEW Performance Issue |
Date | |
Msg-id | 6.2.1.2.0.20050926152048.0542bcc8@localhost Whole thread Raw |
In response to | Re: SELECT LIMIT 1 VIEW Performance Issue (K C Lau <kclau60@netvigator.com>) |
List | pgsql-performance |
At 20:17 05/09/23, K C Lau wrote: >At 19:15 05/09/23, Simon Riggs wrote: >>select distinct on (PlayerID) PlayerID,AtDate from Player a >>where PlayerID='22220' order by PlayerId, AtDate Desc; >> >>Does that work for you? >> >>Best Regards, Simon Riggs > >esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from >Player a where PlayerID='22220' order by PlayerId Desc, AtDate Desc; > Unique (cost=0.00..1327.44 rows=2 width=23) (actual time=0.027..8.438 > rows=1 loops=1) > -> Index Scan Backward using pk_player on player > a (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..4.950 > rows=1743 loops=1) > Index Cond: ((playerid)::text = '22220'::text) > Total runtime: 8.499 ms > >That is the fastest of all queries looping the 1743 rows. >I do get the desired result by adding LIMIT 1: > >esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from >Player a where PlayerID='22220' order by PlayerId Desc, AtDate Desc LIMIT 1; > > Limit (cost=0.00..663.72 rows=1 width=23) (actual time=0.032..0.033 > rows=1 loops=1) > -> Unique (cost=0.00..1327.44 rows=2 width=23) (actual > time=0.028..0.028 rows=1 loops=1) > -> Index Scan Backward using pk_player on player > a (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..0.022 > rows=1 loops=1) > Index Cond: ((playerid)::text = '22220'::text) > Total runtime: 0.094 ms > >However, when I use that within a function in a view, it is slow again: > >esdt=> create or replace function player_max_atdate (varchar(32)) returns >varchar(32) as $$ >esdt$> select distinct on (PlayerID) AtDate from player where PlayerID= >$1 order by PlayerID desc, AtDate desc limit 1; >esdt$> $$ language sql immutable; >CREATE FUNCTION >esdt=> create or replace view VCurPlayer3 as select * from Player where >AtDate = player_max_atdate(PlayerID); >CREATE VIEW >esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where >PlayerID='22220'; > > Index Scan using pk_player on player (cost=0.00..1331.83 rows=9 > width=23) (actual time=76.660..76.664 rows=1 loops=1) > Index Cond: ((playerid)::text = '22220'::text) > Filter: ((atdate)::text = (player_max_atdate(playerid))::text) > Total runtime: 76.716 ms > >Why wouldn't the function get the row as quickly as the direct sql does? Results from the following query suggests that the explain analyze output above only tells half the story, and that the function is in fact called 1743 times: esdt=> create or replace view VCurPlayer3 as select distinct on (PlayerID) * from Player a where OID = (select distinct on (PlayerID) OID from Player b where b.PlayerID = a.PlayerID and b.AtDate = player_max_atdate(b.PlayerID) order by PlayerID desc, AtDate desc limit 1) order by PlayerId Desc, AtDate desc; CREATE VIEW esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220'; Subquery Scan vcurplayer3 (cost=0.00..1715846.91 rows=1 width=68) (actual time=0.640..119.124 rows=1 loops=1) -> Unique (cost=0.00..1715846.90 rows=1 width=776) (actual time=0.633..119.112 rows=1 loops=1) -> Index Scan Backward using pk_player on player a (cost=0.00..1715846.88 rows=9 width=776) (actual time=0.628..119.104 rows=1 loops=1) Index Cond: ((playerid)::text = '22220'::text) Filter: (oid = (subplan)) SubPlan -> Limit (cost=0.00..976.38 rows=1 width=27) (actual time=0.057..0.058 rows=1 loops=1743) -> Unique (cost=0.00..976.38 rows=1 width=27) (actual time=0.052..0.052 rows=1 loops=1743) -> Index Scan Backward using pk_player on player b (cost=0.00..976.36 rows=6 width=27) (actual time=0.047..0.047 rows=1 loops=1743) Index Cond: ((playerid)::text = ($0)::text) Filter: ((atdate)::text = (player_max_atdate(playerid))::text) Total runtime: 119.357 ms It would also explain the very long time taken by the pl/pgsql function I posted a bit earlier. So I guess it all comes back to the basic question: For the query select distinct on (PlayerID) * from Player a where PlayerID='22220' order by PlayerId Desc, AtDate Desc; can the optimizer recognise the fact the query is selecting by the primary key (PlayerID,AtDate), so it can skip the remaining rows for that PlayerID, as if LIMIT 1 is implied? Best regards, KC.
pgsql-performance by date: