Thread: SELECT LIMIT 1 VIEW Performance Issue
Hi All,

I previously posted the following as a sequel to my SELECT DISTINCT Performance Issue question. We would most appreciate any clue or suggestions on how to overcome this show-stopping issue. We are using 8.0.3 on Windows.

Is it a known limitation when using a view with SELECT ... LIMIT 1?

Would the forthcoming performance enhancement with MAX help when used within a view, as in:

create or replace view VCurPlayer as select * from Player a
where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID = b.PlayerID);

select PlayerID,AtDate from VCurPlayer where PlayerID='22220';

Thanks and regards,
KC.

---------

At 19:45 05/06/06, PFC wrote:
>>Previously, we have also tried to use LIMIT 1 instead of DISTINCT, but
>>the performance was no better:
>>select PlayerID,AtDate from Player where PlayerID='22220' order by
>>PlayerID desc, AtDate desc LIMIT 1
>
>        The DISTINCT query will pull out all the rows and keep only one, so the
>one with LIMIT should be faster. Can you post explain analyze of the LIMIT
>query ?

Actually the problem with the LIMIT 1 query arises when we use the LIMIT 1 construct inside a view.
The direct SQL is ok:

esdt=> explain analyze select PlayerID,AtDate from Player where PlayerID='22220'
       order by PlayerID desc, AtDate desc LIMIT 1;

 Limit  (cost=0.00..1.37 rows=1 width=23) (actual time=0.000..0.000 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player  (cost=0.00..16074.23 rows=11770 width=23) (actual time=0.000..0.000 rows=1 loops=1)
         Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 0.000 ms

esdt=> create or replace view VCurPlayer3 as select * from Player a
where AtDate = (select b.AtDate from Player b where a.PlayerID = b.PlayerID
order by b.PlayerID desc, b.AtDate desc LIMIT 1);

esdt=> explain analyze select PlayerID,AtDate,version from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player a  (cost=0.00..33072.78 rows=59 width=27) (actual time=235.000..235.000 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..1.44 rows=1 width=23) (actual time=0.117..0.117 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..14023.67 rows=9727 width=23) (actual time=0.108..0.108 rows=1 loops=1743)
                 Index Cond: (($0)::text = (playerid)::text)
 Total runtime: 235.000 ms

The problem appears to be in the loops=1743: it is scanning all 1743 data records for that player.

Regards, KC.
Hi All,

Investigating further on this problem I brought up in June, the following query with pg 8.0.3 on Windows scans all 1743 data records for a player:

esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9 width=23) (actual time=51.046..51.049 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.016..0.017 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.011..0.011 rows=1 loops=1743)
                 Index Cond: ((playerid)::text = ($0)::text)
 Total runtime: 51.133 ms

Using a static value in the subquery produces the desired result below, but since we use views for our queries (see the last part of this email), we cannot push the static value into the subquery:

esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.058 rows=1 loops=1)
   Index Cond: (((playerid)::text = '22220'::text) AND ((atdate)::text = ($0)::text))
   InitPlan
     ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.028..0.029 rows=1 loops=1)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 0.149 ms

The Player table has a primary key on (PlayerID, AtDate). Is there a way to stop the inner-most index scan looping through all 1743 data records for that player? Is that a bug or a known issue?
BTW, I tried using 8.1 beta2 on Windows and its performance is similar. I have also tried other variants such as MAX and DISTINCT, but with no success.

Any help is most appreciated.

Best regards,
KC.

At 10:46 05/06/15, K C Lau wrote:
>Hi All,
>
>I previously posted the following as a sequel to my SELECT DISTINCT
>Performance Issue question. We would most appreciate any clue or
>suggestions on how to overcome this show-stopping issue. We are using
>8.0.3 on Windows.
>
>Is it a known limitation when using a view with SELECT ... LIMIT 1?
>
>Would the forthcoming performance enhancement with MAX help when used
>within a view, as in:
>
>create or replace view VCurPlayer as select * from Player a
>where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID =
>b.PlayerID);
>
>select PlayerID,AtDate from VCurPlayer where PlayerID='22220';
>
>Thanks and regards,
>KC.
>
>---------
>
>Actually the problem with the LIMIT 1 query is when we use views with the
>LIMIT 1 construct. The direct SQL is ok:
>
>esdt=> explain analyze select PlayerID,AtDate from Player where
>PlayerID='22220'
>  order by PlayerID desc, AtDate desc LIMIT 1;
>
>  Limit  (cost=0.00..1.37 rows=1 width=23) (actual time=0.000..0.000 rows=1 loops=1)
>    ->  Index Scan Backward using pk_player on player  (cost=0.00..16074.23
>        rows=11770 width=23) (actual time=0.000..0.000 rows=1 loops=1)
>          Index Cond: ((playerid)::text = '22220'::text)
>  Total runtime: 0.000 ms
>
>esdt=> create or replace view VCurPlayer3 as select * from Player a
>where AtDate = (select b.AtDate from Player b where a.PlayerID = b.PlayerID
>order by b.PlayerID desc, b.AtDate desc LIMIT 1);
>
>esdt=> explain analyze select PlayerID,AtDate,version from VCurPlayer3
>where PlayerID='22220';
>
>  Index Scan using pk_player on player a  (cost=0.00..33072.78 rows=59
>  width=27) (actual time=235.000..235.000 rows=1 loops=1)
>    Index Cond: ((playerid)::text = '22220'::text)
>    Filter: ((atdate)::text = ((subplan))::text)
>    SubPlan
>      ->  Limit  (cost=0.00..1.44 rows=1 width=23) (actual
>          time=0.117..0.117 rows=1 loops=1743)
>            ->  Index Scan Backward using pk_player on player b
>                (cost=0.00..14023.67 rows=9727 width=23) (actual
>                time=0.108..0.108 rows=1 loops=1743)
>                  Index Cond: (($0)::text = (playerid)::text)
>  Total runtime: 235.000 ms
On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:
> Investigating further on this problem I brought up in June, the following
> query with pg 8.0.3 on Windows scans all 1743 data records for a player:
>
> esdt=> explain analyze select PlayerID,AtDate from Player a
> where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
> LIMIT 1);
>
> Total runtime: 51.133 ms
>
> Using a static value in the subquery produces the desired result below, but
> since we use views for our queries (see last part of this email), we cannot
> push the static value into the subquery:
>
> esdt=> explain analyze select PlayerID,AtDate from Player a
> where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);
>
> Total runtime: 0.149 ms
>
> The Player table has a primary key on PlayerID, AtDate. Is there a way to
> stop the inner-most index scan looping all 1743 data records for that
> player? Is that a bug or known issue?

Currently the planner can't tell whether a subquery is correlated or not until it has planned the query, so it is unable to push down the qualification automatically in the way you have achieved manually. The new min() optimisation doesn't yet work with GROUP BY, which is what you would use to reformulate the query that way, so no luck there either.

If you don't want to do this in a view, calculate the values for all players at once and store them in a summary table for when you need them.

Best Regards, Simon Riggs
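[For reference, the GROUP BY reformulation mentioned above would look something like the sketch below, written against the Player table as described in this thread; the view name and aliases are illustrative only, and per the note above the min()/max() index optimisation would not kick in here.]

```sql
-- Reformulating "latest row per player" via GROUP BY + max().
-- In 8.0/8.1 the min()/max() index rewrite is not applied when a
-- GROUP BY is present, so this still scans every row per player.
create or replace view VCurPlayerMax as
select a.*
from Player a
join (select PlayerID, max(AtDate) as AtDate
      from Player
      group by PlayerID) m
  on a.PlayerID = m.PlayerID and a.AtDate = m.AtDate;

select PlayerID, AtDate from VCurPlayerMax where PlayerID = '22220';
```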
We use similar views as base views throughout our OLTP system to get the latest time-based record(s), so it is quite impossible to use summary tables etc. Are there other ways to do it?

The subquery would pinpoint the record(s) via the composite primary key. Neither MS SQL nor Oracle has such a performance problem, so this issue is effectively stopping us from migrating to PostgreSQL.

Any suggestions would be most appreciated.

Best regards,
KC.

At 16:40 05/09/22, Simon Riggs wrote:
>On Thu, 2005-09-22 at 12:21 +0800, K C Lau wrote:
>
> > Investigating further on this problem I brought up in June, the following
> > query with pg 8.0.3 on Windows scans all 1743 data records for a player:
> >
> > esdt=> explain analyze select PlayerID,AtDate from Player a
> > where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> > where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
> > LIMIT 1);
> >
> > Total runtime: 51.133 ms
> >
> > Using a static value in the subquery produces the desired result below, but
> > since we use views for our queries (see last part of this email), we cannot
> > push the static value into the subquery:
> >
> > esdt=> explain analyze select PlayerID,AtDate from Player a
> > where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> > where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);
> >
> > Total runtime: 0.149 ms
> >
> > The Player table has a primary key on PlayerID, AtDate. Is there a way to
> > stop the inner-most index scan looping all 1743 data records for that
> > player? Is that a bug or known issue?
>
>Currently the planner can't tell whether a subquery is correlated or not
>until it has planned the query. So it is unable to push down the
>qualification automatically in the way you have achieved manually. The
>new min() optimisation doesn't yet work with GROUP BY which is what you
>would use to reformulate the query that way, so no luck that way either.
>
>If you don't want to do this in a view, calculate the values for all
>players at once and store the values in a summary table for when you
>need them.
>
>Best Regards, Simon Riggs
>
>---------------------------(end of broadcast)---------------------------
>TIP 6: explain analyze is your friend
On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
> We use similar views as base views throughout our OLTP system to get the
> latest time-based record(s). So it is quite impossible to use summary
> tables etc. Are there other ways to do it?
>
> The subquery would pinpoint the record(s) with the composite primary key.
> Both MS Sql and Oracle do not have such performance problem. So this
> problem is effectively stopping us from migrating to PostgreSQL.
>
> Any suggestions would be most appreciated.

Even if this were fixed for 8.1, which seems unlikely, would you be able to move to that release immediately?

ISTM you have two choices, in priority, complexity and time/cost order:

1) custom mods to your app
2) custom mods to PostgreSQL

Maybe it's possible to reconstruct your query with sub-sub-selects so that you have a correlated query with manually pushed-down clauses, which also references a more constant base view?

Is a 51 ms query really such a problem for you?

Best Regards, Simon Riggs
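[One concrete way to realise the manual push-down described above, sketched here as a set-returning SQL function rather than a plain view; the function name and the STABLE marking are assumptions, not from the thread. Because the player id arrives as a parameter, the subquery sees it as a constant, like the fast InitPlan version shown earlier.]

```sql
-- Hypothetical sketch: $1 plays the role of the static value that
-- made the hand-written InitPlan query fast.
create or replace function cur_player(varchar) returns setof Player as $$
  select * from Player a
  where a.PlayerID = $1
    and a.AtDate = (select b.AtDate from Player b
                    where b.PlayerID = $1
                    order by b.PlayerID desc, b.AtDate desc limit 1);
$$ language sql stable;

select PlayerID, AtDate from cur_player('22220');
```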
> >I previously posted the following as a sequel to my SELECT DISTINCT
> >Performance Issue question. We would most appreciate any clue or
> >suggestions on how to overcome this show-stopping issue. We are using
> >8.0.3 on Windows.
> >
> >Is it a known limitation when using a view with SELECT ... LIMIT 1?
> >
> >Would the forthcoming performance enhancement with MAX help when used
> >within a view, as in:
> >
> >create or replace view VCurPlayer as select * from Player a
> >where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID =
> >b.PlayerID);

Here is a trick I use sometimes with views, etc. This may or may not be effective in solving your problem, but it's worth a shot. Create one small SQL function taking the date, etc. and returning the values, and define it immutable. Then, within the query, it is treated like a constant.

Another useful application of this feature is when you have nested views (view 1 queries view 2) and you need to filter records based on fields from view 2 which are not returned in view 1. Impossible? In view 2, add the clause "where v2.f between f_min() and f_max()", with f_min() and f_max() being immutable functions which can grab the filter criteria based on inputs or values from a table.

Merlin
At 20:48 05/09/22, Simon Riggs wrote:
>On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
> > We use similar views as base views throughout our OLTP system to get the
> > latest time-based record(s). So it is quite impossible to use summary
> > tables etc. Are there other ways to do it?
> >
> > The subquery would pinpoint the record(s) with the composite primary key.
> > Both MS Sql and Oracle do not have such performance problem. So this
> > problem is effectively stopping us from migrating to PostgreSQL.
> >
> > Any suggestions would be most appreciated.
>
>Even if this were fixed for 8.1, which seems unlikely, would you be able
>to move to that release immediately?

Yes. In fact when we first developed our system a few years ago, we tested on MS SQL 7.0, Oracle 8 and PG 7.1.1, and we did not hit this problem. When we tried again with PG 8.0, the performance became unbearable, but other areas appear OK and other queries are often faster than with MS SQL 2000.

>Maybe its possible to reconstruct your query with sub-sub-selects so
>that you have a correlated query with manually pushed down clauses,
>which also references a more constant base view?

We would be most happy to try them if we had some example views or pointers.

>Is a 51ms query really such a problem for you?

Unfortunately yes, as our target performance is in the high hundreds of transactions per second. And 51 ms is already the best case for a single select, with everything cached in memory immediately after the same select, which took 390 ms on a quiet system.

>Best Regards, Simon Riggs

Best regards,
KC.
At 22:37 05/09/22, Merlin Moncure wrote:
> > >create or replace view VCurPlayer as select * from Player a
> > >where a.AtDate = (select Max(b.AtDate) from Player b where a.PlayerID =
> > >b.PlayerID);
>
>Here is a trick I use sometimes with views, etc. This may or may not be
>effective to solve your problem but it's worth a shot. Create one small
>SQL function taking date, etc. and returning the values and define it
>immutable. Now in-query it is treated like a constant.

We don't use functions as a rule, but I would be glad to give it a try. I would most appreciate it if you could define a sample function and rewrite the VCurPlayer view above. Both PlayerID and AtDate are varchar fields.

>Another useful application for this feature is when you have nested
>views (view 1 queries view 2) and you need to filter records based on
>fields from view 2 which are not returned in view 1. Impossible?
>
>in view 2 add clause where v2.f between f_min() and f_max(), them being
>immutable functions which can grab filter criteria based on inputs or
>values from a table.
>
>Merlin

Best regards,
KC.
K C Lau <kclau60@netvigator.com> writes:
> At 20:48 05/09/22, Simon Riggs wrote:
>> Even if this were fixed for 8.1, which seems unlikely, would you be able
>> to move to that release immediately?

> Yes. In fact when we first developed our system a few years ago, we tested
> on MS7.0, Oracle 8 and PG 7.1.1 and we did not hit that problem.

It's really not credible that PG 7.1 did any better with this than current sources do. The subplan mechanism hasn't changed materially since about 6.5. It could be that 7.1's performance was simply so bad across the board that you didn't notice ...

			regards, tom lane
On Thu, 2005-09-22 at 22:39 +0800, K C Lau wrote:
> >Is a 51ms query really such a problem for you?
>
> Unfortunately yes, as our target performance is in the high hundreds of
> transactions per sec. And 51 ms is already the best case for a single
> select, with everything cached in memory immediately after the same select
> which took 390 ms on a quiet system.

If the current value is used so often, use two tables: one holding only the current row for each player, maintained using UPDATE. Different performance issues maybe, but at least not correlated-subquery ones.

Best Regards, Simon Riggs
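[A minimal sketch of that two-table idea, assuming new AtDate values only ever arrive via INSERT on the base table; the table, trigger and function names are made up for illustration, and only the key columns are carried over. The field types follow what KC describes elsewhere in the thread: both are varchar.]

```sql
-- A separate table holding only the newest row per player,
-- maintained by a trigger on the base table.
create table CurPlayer (PlayerID varchar(32) primary key, AtDate varchar(23));

create or replace function curplayer_upd() returns trigger as $$
begin
    -- drop the stored row if the incoming one is newer
    delete from CurPlayer
     where PlayerID = NEW.PlayerID and AtDate < NEW.AtDate;
    -- insert the incoming row if no current row remains for this player
    insert into CurPlayer (PlayerID, AtDate)
    select NEW.PlayerID, NEW.AtDate
     where not exists (select 1 from CurPlayer c
                        where c.PlayerID = NEW.PlayerID);
    return NEW;
end;
$$ language plpgsql;

create trigger trg_curplayer after insert on Player
    for each row execute procedure curplayer_upd();
```

A lookup then becomes a plain primary-key fetch on CurPlayer, with no correlated subquery at all.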
> >Here is a trick I use sometimes with views, etc. This may or may not be
> >effective to solve your problem but it's worth a shot. Create one small
> >SQL function taking date, etc. and returning the values and define it
> >immutable. Now in-query it is treated like a constant.
>
> We don't use functions as a rule, but I would be glad to give it a try.
> I would most appreciate if you could define a sample function and rewrite
> the VCurPlayer view above. Both PlayerID and AtDate are varchar fields.
>
> esdt=> explain analyze select PlayerID,AtDate from Player a
> where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

try:

create function player_max_at_date (varchar) returns date as
$$
  select atdate from player where playerid = $1
  order by playerid desc, AtDate desc limit 1;
$$ language sql immutable;

create view v as select playerid, player_max_at_date(playerid) from player;

select * from v where playerid = 'x'; -- etc.

note: this function is not really immutable. Try it with both 'immutable' and 'stable'; if the performance is the same, use 'stable'.

You're welcome in advance, ;)

Merlin
Have you tried the "best choice" pattern, where you select the set of candidate rows and then exclude those for which a better choice exists within the set? I often get better results with this pattern than with the alternatives. Transmuting your query to use this pattern gives:

select PlayerID,AtDate from Player a
where PlayerID='22220'
  and not exists (select * from Player b
                  where b.PlayerID = a.PlayerID and b.AtDate > a.AtDate);

>>> K C Lau <kclau60@netvigator.com> 09/21/05 11:21 PM >>>
select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
LIMIT 1);
Thank you all for your suggestions. I tried them, with some variations too, but still with no success. The times given are the best of a few repeated tries on an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.

For reference, only the following gets the record quickly:

esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.057 rows=1 loops=1)
   Index Cond: (((playerid)::text = '22220'::text) AND ((atdate)::text = ($0)::text))
   InitPlan
     ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.027..0.028 rows=1 loops=1)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
                 Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 0.132 ms

At 02:19 05/09/23, Kevin Grittner wrote:
>Have you tried the "best choice" pattern -- where you select the set of
>candidate rows and then exclude those for which a better choice
>exists within the set? I often get better results with this pattern than
>with the alternatives.
esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and not exists (select * from Player b
where b.PlayerID = a.PlayerID and b.AtDate > a.AtDate);

 Index Scan using pk_player on player a  (cost=0.00..3032.46 rows=878 width=23) (actual time=35.820..35.823 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: (NOT (subplan))
   SubPlan
     ->  Index Scan using pk_player on player b  (cost=0.00..378.68 rows=389 width=776) (actual time=0.013..0.013 rows=1 loops=1743)
           Index Cond: (((playerid)::text = ($0)::text) AND ((atdate)::text > ($1)::text))
 Total runtime: 35.950 ms

Note that it is faster than the LIMIT 1 version:

esdt=> explain analyze select PlayerID,AtDate from Player a
where PlayerID='22220' and AtDate = (select b.AtDate from Player b
where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc LIMIT 1);

 Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9 width=23) (actual time=41.366..41.371 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = ((subplan))::text)
   SubPlan
     ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.013..0.014 rows=1 loops=1743)
           ->  Index Scan Backward using pk_player on player b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.008..0.008 rows=1 loops=1743)
                 Index Cond: ((playerid)::text = ($0)::text)
 Total runtime: 41.490 ms

At 02:07 05/09/23, Merlin Moncure wrote:
> > >Here is a trick I use sometimes with views, etc. This may or may not be
> > >effective to solve your problem but it's worth a shot. Create one small
> > >SQL function taking date, etc. and returning the values and define it
> > >immutable. Now in-query it is treated like a constant.
esdt=> create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
esdt$> select atdate from player where playerid = $1 order by playerid desc, AtDate desc limit 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION

esdt=> create or replace view VCurPlayer3 as select * from Player where AtDate = player_max_atdate(PlayerID);
CREATE VIEW

esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=65.434..65.439 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 65.508 ms

While it says loops=1, the time suggests that it is still going through all 1743 records for that PlayerID.

I tried to simulate the fast subquery inside the function, but it takes almost twice as much time:

esdt=> create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
esdt$> select atdate from player a where playerid = $1 and AtDate = (select b.AtDate from Player b
esdt$> where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1);
esdt$> $$ language sql immutable;
CREATE FUNCTION

esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=119.369..119.373 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 119.441 ms

Adding another LIMIT 1 inside the function makes it even slower:

esdt=> create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
esdt$> select atdate from player where playerid = $1 and AtDate = (select b.AtDate from Player b
esdt$> where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1)
esdt$> order by PlayerID desc, AtDate desc LIMIT 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION

esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=129.858..129.863 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 129.906 ms

At 00:16 05/09/23, Simon Riggs wrote:
>If the current value is used so often, use two tables - one with a
>current view only of the row maintained using UPDATE. Different
>performance issues maybe, but at least not correlated subquery ones.

Many of our tables have a similar construct, and it would be a huge task to duplicate and maintain all these tables throughout the system. We would prefer a solution with SQL or a function at the view or db level, or better still, a fix, if this problem is considered general enough.

>You're welcome in advance, ;)
>Merlin

Thank you all in advance for any further ideas.

KC.
On Thu, 2005-09-22 at 18:40 +0800, K C Lau wrote:
> > > esdt=> explain analyze select PlayerID,AtDate from Player a
> > > where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> > > where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate desc
> > > LIMIT 1);

I think you should try:

select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='22220' order by PlayerId, AtDate Desc;

Does that work for you?

Best Regards, Simon Riggs
At 19:15 05/09/23, Simon Riggs wrote:
>select distinct on (PlayerID) PlayerID,AtDate from Player a
>where PlayerID='22220' order by PlayerId, AtDate Desc;
>
>Does that work for you?
>
>Best Regards, Simon Riggs

esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='22220' order by PlayerId, AtDate Desc;

 Unique  (cost=1417.69..1426.47 rows=2 width=23) (actual time=31.231..36.609 rows=1 loops=1)
   ->  Sort  (cost=1417.69..1422.08 rows=1756 width=23) (actual time=31.129..32.473 rows=1743 loops=1)
         Sort Key: playerid, atdate
         ->  Index Scan using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.035..6.575 rows=1743 loops=1)
               Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 36.943 ms

The sort was eliminated with "order by PlayerId Desc, AtDate Desc":

esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='22220' order by PlayerId Desc, AtDate Desc;

 Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.027..8.438 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..4.950 rows=1743 loops=1)
         Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 8.499 ms

That is the fastest of all the queries that loop through the 1743 rows.
I do get the desired result by adding LIMIT 1:

esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from Player a
where PlayerID='22220' order by PlayerId Desc, AtDate Desc LIMIT 1;

 Limit  (cost=0.00..663.72 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
   ->  Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.028..0.028 rows=1 loops=1)
         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..0.022 rows=1 loops=1)
               Index Cond: ((playerid)::text = '22220'::text)
 Total runtime: 0.094 ms

However, when I use that within a function in a view, it is slow again:

esdt=> create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
esdt$> select distinct on (PlayerID) AtDate from player where PlayerID= $1 order by PlayerID desc, AtDate desc limit 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION

esdt=> create or replace view VCurPlayer3 as select * from Player where AtDate = player_max_atdate(PlayerID);
CREATE VIEW

esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=76.660..76.664 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 76.716 ms

Why wouldn't the function get the row as quickly as the direct SQL does?

Best regards,
KC.
> At 02:07 05/09/23, Merlin Moncure wrote:
> > > >Here is a trick I use sometimes with views, etc. This may or may not be
> > > >effective to solve your problem but it's worth a shot. Create one small
> > > >SQL function taking date, etc. and returning the values and define it
> > > >immutable. Now in-query it is treated like a constant.
>
> esdt=> create or replace function player_max_atdate (varchar(32)) returns
> varchar(32) as $$
> esdt$> select atdate from player where playerid = $1 order by playerid
> desc, AtDate desc limit 1;
> esdt$> $$ language sql immutable;

Can you time just the execution of this function and compare it vs. the pure SQL version? If the times are different, can you do an explain analyze of a prepared version of the above?

prepare test(character varying) as select atdate from player where
playerid = $1 order by playerid desc, AtDate desc limit 1;

explain analyze execute test('22220');

> CREATE FUNCTION
> esdt=> create or replace view VCurPlayer3 as select * from Player where
> AtDate = player_max_atdate(PlayerID);
> CREATE VIEW

This is wrong; it should have been:

create or replace view VCurPlayer3 as
select *, player_max_atdate(PlayerID) as max_date from Player;

I did a test on a table with 124k records and a two-part key, ID & date.

esp=# select count(*) from parts_order_file;
 count
--------
 124158
(1 row)

esp=# select count(*) from parts_order_file where pr_dealer_no = '000500';
 count
-------
 27971
(1 row)

I created the same function, view v, etc.

esp=# explain analyze select * from v where pr_dealer_no = '000500' limit 1;

 Limit  (cost=0.00..3.87 rows=1 width=10) (actual time=1.295..1.297 rows=1 loops=1)
   ->  Index Scan using parts_order_file_pr_dealer_no_key on parts_order_file  (cost=0.00..109369.15 rows=28226 width=10) (actual time=1.287..1.287 rows=1 loops=1)
         Index Cond: (pr_dealer_no = '000500'::bpchar)
 Total runtime: 1.413 ms
(4 rows)

Something is not jiving here. However, if the server plan still does not come out correct, try the following (p.s. why is the function returning varchar(32) and not date?):

create or replace function player_max_atdate (varchar(32)) returns date as
$$
DECLARE
    player_record record;
    return_date date;
BEGIN
    for player_record in execute
        'select atdate from player where playerid = \'' || $1 || '\' order by playerid desc, AtDate desc limit 1;' loop
        return_date = player_record.atdate;
    end loop;
    return return_date;
END;
$$ language plpgsql immutable;

Merlin
Dear Merlin,

At 20:34 05/09/23, Merlin Moncure wrote:
>Can you time just the execution of this function and compare vs. pure
>SQL version? If the times are different, can you do an explain analyze
>of a prepared version of above?

esdt=> prepare test(character varying) as select atdate from player where
esdt-> playerid = $1 order by playerid desc, AtDate desc limit 1;
PREPARE

esdt=> explain analyze execute test('22220');

 Limit  (cost=0.00..0.83 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
   ->  Index Scan Backward using pk_player on player  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.027..0.027 rows=1 loops=1)
         Index Cond: ((playerid)::text = ($1)::text)
 Total runtime: 0.088 ms

The prepared SQL timing is similar to that of the direct SQL.

> > esdt=> create or replace view VCurPlayer3 as select * from Player where
> > AtDate = player_max_atdate(PlayerID);
>
>This is wrong, it should have been
>create or replace view VCurPlayer3 as select *,
>player_max_atdate(PlayerID) as max_date from Player;

Your suggestion returns all the records plus a max AtDate column for each PlayerID. What I want to get with the view is the record that has the max value of AtDate for each PlayerID. The AtDate is a varchar(23) field containing a string date of format 'yyyymmddhh', not an SQL date field. Sorry if that confused you.

>Something is not jiving here. However, if the server plan still does
>not come out correct, try the following (p.s. why is function returning
>varchar(32) and not date?):

esdt=> create or replace function player_max_atdate (varchar(32)) returns varchar(32) as $$
esdt$> DECLARE
esdt$>     player_record record;
esdt$>     return_date varchar(32);
esdt$> BEGIN
esdt$>     for player_record in execute
esdt$>         'select atdate from player where playerid = \'' || $1 || '\' order by playerid desc, AtDate desc limit 1;' loop
esdt$>         return_date = player_record.atdate;
esdt$>     end loop;
esdt$>     return return_date;
esdt$> END;
esdt$> $$ language plpgsql immutable;
CREATE FUNCTION

esdt=> create or replace view VCurPlayer3 as select * from Player where AtDate = player_max_atdate(PlayerID);
CREATE VIEW

esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=849.021..849.025 rows=1 loops=1)
   Index Cond: ((playerid)::text = '22220'::text)
   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 849.078 ms

Your suggested plpgsql function seems to be even slower, with a best time of 849 ms after several tries. Is that expected?

Thanks again and best regards,
KC.
K C Lau wrote:
> Thank you all for your suggestions. I tried them, with some variations
> too, but still no success. The times given are the best of a few repeated
> tries on an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.

A small denormalization, where you mark the row with the latest atdate
for each playerid, may get you the performance you want.

e.g: (8.1beta1)

ALTER TABLE player ADD islastatdate boolean;

UPDATE player SET islastatdate = true WHERE (playerid,atdate) IN
(SELECT playerid, atdate FROM vcurplayer);

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE islastatdate;

CREATE INDEX player_id_lastatdate ON player(playerid, islastatdate)
WHERE islastatdate;

ANALYZE player;

Generating some test data produced:

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='22220';

                                                           QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------
 Index Scan using player_id_lastatdate on player a  (cost=0.00..4.33 rows=1 width=13) (actual time=0.142..0.149 rows=1 loops=1)
   Index Cond: ((playerid = '22220'::text) AND (islastatdate = true))
   Filter: islastatdate
 Total runtime: 0.272 ms
(4 rows)

Whereas with the original view definition:

CREATE OR REPLACE VIEW vcurplayer AS
SELECT * FROM player a
WHERE a.atdate =
( SELECT max(b.atdate) FROM player b
  WHERE a.playerid = b.playerid);

EXPLAIN ANALYZE
SELECT playerid,atdate
FROM vcurplayer
WHERE playerid='22220';

                                                                    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using player_id_date on player a  (cost=0.00..7399.23 rows=11 width=13) (actual time=121.738..121.745 rows=1 loops=1)
   Index Cond: (playerid = '22220'::text)
   Filter: (atdate = (subplan))
   SubPlan
     ->  Result  (cost=1.72..1.73 rows=1 width=0) (actual time=0.044..0.047 rows=1 loops=2000)
           InitPlan
             ->  Limit  (cost=0.00..1.72 rows=1 width=4) (actual time=0.028..0.031 rows=1 loops=2000)
                   ->  Index Scan Backward using player_id_date on player b  (cost=0.00..3787.94 rows=2198 width=4) (actual time=0.019..0.019 rows=1 loops=2000)
                         Index Cond: ($0 = playerid)
                         Filter: (atdate IS NOT NULL)
 Total runtime: 121.916 ms
(11 rows)

Note that my generated data has too many rows for each playerid, but
the difference in performance should illustrate the idea.

Cheers

Mark
Dear Mark,

Thank you. That seems like a more manageable alternative if nothing else
works out. It should cover many of the OLTP update transactions. But it
does mean quite a bit of programming changes, plus an additional index on
all such tables, and it would not cover cases where we need to get the
latest record before a certain time, for example.

I'm wondering if this performance issue is common enough for other users
to merit a fix in pg, especially as it seems that with MVCC, each of the
data records needs to be accessed in addition to scanning the index.

Best regards, KC.

At 09:40 05/09/24, Mark Kirkwood wrote:
>A small denormalization, where you mark the row with the latest atdate
>for each playerid, may get you the performance you want.
>
>e.g: (8.1beta1)
>
>[original DDL and EXPLAIN ANALYZE output snipped -- see previous message]
K C Lau wrote:
> I'm wondering if this performance issue is common enough for other users
> to merit a fix in pg, especially as it seems that with MVCC, each of the
> data records needs to be accessed in addition to scanning the index.

Yes - there are certainly cases where index-only access (or something
similar, like B+-tree tables) would be highly desirable. From what I have
understood from previous discussions, there are difficulties involved in
producing a design that does not cause new problems...

regards

Mark
At 20:17 05/09/23, K C Lau wrote:
>At 19:15 05/09/23, Simon Riggs wrote:
>>select distinct on (PlayerID) PlayerID,AtDate from Player a
>>where PlayerID='22220' order by PlayerId, AtDate Desc;
>>
>>Does that work for you?
>>
>>Best Regards, Simon Riggs
>
>esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from
>Player a where PlayerID='22220' order by PlayerId Desc, AtDate Desc;
>
> Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.027..8.438 rows=1 loops=1)
>   ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..4.950 rows=1743 loops=1)
>         Index Cond: ((playerid)::text = '22220'::text)
> Total runtime: 8.499 ms
>
>That is the fastest of all the queries that loop over the 1743 rows.
>I do get the desired result by adding LIMIT 1:
>
>esdt=> explain analyze select distinct on (PlayerID) PlayerID,AtDate from
>Player a where PlayerID='22220' order by PlayerId Desc, AtDate Desc LIMIT 1;
>
> Limit  (cost=0.00..663.72 rows=1 width=23) (actual time=0.032..0.033 rows=1 loops=1)
>   ->  Unique  (cost=0.00..1327.44 rows=2 width=23) (actual time=0.028..0.028 rows=1 loops=1)
>         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.022..0.022 rows=1 loops=1)
>               Index Cond: ((playerid)::text = '22220'::text)
> Total runtime: 0.094 ms
>
>However, when I use that within a function in a view, it is slow again:
>
>esdt=> create or replace function player_max_atdate (varchar(32)) returns
>varchar(32) as $$
>esdt$> select distinct on (PlayerID) AtDate from player where PlayerID= $1
>order by PlayerID desc, AtDate desc limit 1;
>esdt$> $$ language sql immutable;
>CREATE FUNCTION
>esdt=> create or replace view VCurPlayer3 as select * from Player where
>AtDate = player_max_atdate(PlayerID);
>CREATE VIEW
>esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
>PlayerID='22220';
>
> Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9 width=23) (actual time=76.660..76.664 rows=1 loops=1)
>   Index Cond: ((playerid)::text = '22220'::text)
>   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
> Total runtime: 76.716 ms
>
>Why wouldn't the function get the row as quickly as the direct SQL does?

Results from the following query suggest that the explain analyze output
above tells only half the story, and that the function is in fact called
1743 times:

esdt=> create or replace view VCurPlayer3 as select distinct on (PlayerID) *
esdt-> from Player a where OID = (select distinct on (PlayerID) OID
esdt-> from Player b where b.PlayerID = a.PlayerID
esdt-> and b.AtDate = player_max_atdate(b.PlayerID)
esdt-> order by PlayerID desc, AtDate desc limit 1)
esdt-> order by PlayerId Desc, AtDate desc;
CREATE VIEW
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where PlayerID='22220';

 Subquery Scan vcurplayer3  (cost=0.00..1715846.91 rows=1 width=68) (actual time=0.640..119.124 rows=1 loops=1)
   ->  Unique  (cost=0.00..1715846.90 rows=1 width=776) (actual time=0.633..119.112 rows=1 loops=1)
         ->  Index Scan Backward using pk_player on player a  (cost=0.00..1715846.88 rows=9 width=776) (actual time=0.628..119.104 rows=1 loops=1)
               Index Cond: ((playerid)::text = '22220'::text)
               Filter: (oid = (subplan))
               SubPlan
                 ->  Limit  (cost=0.00..976.38 rows=1 width=27) (actual time=0.057..0.058 rows=1 loops=1743)
                       ->  Unique  (cost=0.00..976.38 rows=1 width=27) (actual time=0.052..0.052 rows=1 loops=1743)
                             ->  Index Scan Backward using pk_player on player b  (cost=0.00..976.36 rows=6 width=27) (actual time=0.047..0.047 rows=1 loops=1743)
                                   Index Cond: ((playerid)::text = ($0)::text)
                                   Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
 Total runtime: 119.357 ms

That would also explain the very long time taken by the plpgsql function I
posted a bit earlier.
So I guess it all comes back to the basic question:

For the query:

  select distinct on (PlayerID) * from Player a where PlayerID='22220'
  order by PlayerId Desc, AtDate Desc;

can the optimizer recognise that the query is selecting by the primary key
(PlayerID,AtDate), so it can skip the remaining rows for that PlayerID, as
if LIMIT 1 were implied?

Best regards, KC.
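For comparison, the view can also be defined directly over DISTINCT ON, without wrapping the lookup in a function, so the planner sees the ORDER BY and the outer predicate together. This is a sketch only, using the table and column names from the thread; whether the 8.0 planner actually pushes the WHERE clause into the DISTINCT ON subquery is exactly what EXPLAIN ANALYZE would need to confirm:

```sql
-- Sketch: "current row per player" view expressed with DISTINCT ON.
-- DISTINCT ON (playerid) keeps the first row of each playerid group,
-- and the ORDER BY makes that first row the one with the max atdate.
CREATE OR REPLACE VIEW vcurplayer AS
SELECT DISTINCT ON (playerid) *
FROM player
ORDER BY playerid DESC, atdate DESC;

-- The hope is that this predicate is pushed into the index scan
-- on the (playerid, atdate) primary key; verify with EXPLAIN ANALYZE.
SELECT playerid, atdate FROM vcurplayer WHERE playerid = '22220';
```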
On Fri, Sep 23, 2005 at 08:17:03PM +0800, K C Lau wrote:
> esdt=> create or replace function player_max_atdate (varchar(32)) returns
> varchar(32) as $$
> esdt$> select distinct on (PlayerID) AtDate from player where PlayerID= $1
> order by PlayerID desc, AtDate desc limit 1;
> esdt$> $$ language sql immutable;
> CREATE FUNCTION

That function is not immutable; it should be defined as stable.

> esdt=> create or replace view VCurPlayer3 as select * from Player where
> AtDate = player_max_atdate(PlayerID);
> CREATE VIEW
> esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
> PlayerID='22220';
>
>  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9
>  width=23) (actual time=76.660..76.664 rows=1 loops=1)
>    Index Cond: ((playerid)::text = '22220'::text)
>    Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
>  Total runtime: 76.716 ms
>
> Why wouldn't the function get the row as quickly as the direct sql does?

PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and I'm
not sure how much those are pre-compiled, though they are syntax-checked
at creation). Do you get the same result time when you run it a second
time? What time do you get from running just the function versus the SQL
in the function?

Also, remember that every layer you add to the cake means more work for
the database. If speed is that highly critical you'll probably want to
not wrap things in functions, and possibly not use views either.

Also, keep in mind that getting below 1ms doesn't automatically mean
you'll be able to scale to 1000TPS. Things will definitely change when
you load the system down, so if performance is that critical you should
start testing with the system under load if you're not already.

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
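Following Jim's point, the quoted function declared with the correct volatility would look like this (a sketch, same names as in the thread). STABLE is right because the result depends on table contents, which can change between statements; IMMUTABLE promises the result never changes for the same argument, which is false here:

```sql
-- Sketch: the same SQL function, declared STABLE instead of IMMUTABLE.
CREATE OR REPLACE FUNCTION player_max_atdate(varchar(32))
RETURNS varchar(32) AS $$
    SELECT atdate FROM player
     WHERE playerid = $1
     ORDER BY playerid DESC, atdate DESC
     LIMIT 1;
$$ LANGUAGE sql STABLE;
```

(Merlin notes later in the thread that he sometimes declares such functions IMMUTABLE deliberately, as a planner hack, so the two declarations are worth benchmarking against each other.)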
On Fri, Sep 23, 2005 at 04:53:55PM +0800, K C Lau wrote:
> Thank you all for your suggestions. I tried them, with some variations
> too, but still no success. The times given are the best of a few repeated
> tries on an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.
>
> For reference, only the following gets the record quickly:
>
> esdt=> explain analyze select PlayerID,AtDate from Player a
> where PlayerID='22220' and AtDate = (select b.AtDate from Player b
> where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);
>
>  Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23) (actual time=0.054..0.057 rows=1 loops=1)
>    Index Cond: (((playerid)::text = '22220'::text) AND ((atdate)::text = ($0)::text))
>    InitPlan
>      ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual time=0.027..0.028 rows=1 loops=1)
>            ->  Index Scan Backward using pk_player on player b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1 loops=1)
>                  Index Cond: ((playerid)::text = '22220'::text)
>  Total runtime: 0.132 ms

If you're doing that, you should try something like the following:

decibel=# explain analyze select * from t where ctid=(select ctid from rrs order by rrs_id desc limit 1);

                                                                  QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Tid Scan on t  (cost=0.44..4.45 rows=1 width=42) (actual time=0.750..0.754 rows=1 loops=1)
   Filter: (ctid = $0)
   InitPlan
     ->  Limit  (cost=0.00..0.44 rows=1 width=10) (actual time=0.548..0.549 rows=1 loops=1)
           ->  Index Scan Backward using rrs_rrs__rrs_id on rrs  (cost=0.00..3.08 rows=7 width=10) (actual time=0.541..0.541 rows=1 loops=1)
 Total runtime: 1.061 ms
(6 rows)

decibel=# select count(*) from t;
 count
--------
 458752

Note that that's on my nice slow laptop to boot (the count took like 10
seconds).

Just remember that ctid *is not safe outside of a transaction*!! So you
can't do something like:

SELECT ctid FROM ...
-- store that in some variable, then later:
SELECT * FROM table WHERE ctid = variable

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
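Applied to the Player table from this thread, Jim's ctid trick would look roughly like the sketch below. Because the ctid lookup and the fetch happen in a single statement, the ctid cannot go stale in between, which sidesteps the transaction-safety caveat:

```sql
-- Sketch: Jim's Tid Scan pattern on the thread's Player table.
-- The InitPlan finds the latest row's ctid via the backward index scan;
-- the outer query then fetches that one row directly by ctid.
SELECT *
  FROM Player
 WHERE ctid = (SELECT ctid
                 FROM Player
                WHERE PlayerID = '22220'
                ORDER BY PlayerID DESC, AtDate DESC
                LIMIT 1);
```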
On Tue, Oct 04, 2005 at 04:15:41PM -0500, Jim C. Nasby wrote:
> >    Index Cond: ((playerid)::text = '22220'::text)

Also, why is playerid a text field? Comparing ints will certainly be
faster...

-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
[to K C:] Sorry, I was out on vacation all last week. I was visualizing
the problem incorrectly anyway...

Jim wrote:
> That function is not immutable, it should be defined as stable.

That is 100% correct: however, now and then I declare stable functions as
immutable in some cases, because the planner treats them differently with
no side effects...this is a hack of course...see my earlier suggestion to
try both immutable and stable versions. I can give a pretty good example
of when this can make a big difference.

> PostgreSQL doesn't pre-compile functions, at least not until 8.1 (and
> I'm not sure how much those are pre-compiled, though they are
> syntax-checked at creation). Do you get the same result time when you
> run it a second time? What time do you get from running just the
> function versus the SQL in the function?

plpgsql functions are at least partially compiled (sql functions afaik
are not), in that an internal state is generated following the first
execution. This is the cause of all those infernal 'invalid table oid'
errors.

> Also, remember that every layer you add to the cake means more work for
> the database. If speed is that highly critical you'll probably want to
> not wrap things in functions, and possibly not use views either.

The overhead of the function/view is totally inconsequential next to the
planner choosing a suboptimal plan. The purpose of the function is to
coerce the planner into choosing the correct plan.

> Also, keep in mind that getting below 1ms doesn't automatically mean
> you'll be able to scale to 1000TPS. Things will definitely change when
> you load the system down, so if performance is that critical you should
> start testing with the system under load if you're not already.
KC wrote:
> So I guess it all comes back to the basic question:
>
> For the query select distinct on (PlayerID) * from Player a where
> PlayerID='22220' order by PlayerId Desc, AtDate Desc;
> can the optimizer recognise the fact the query is selecting by the
> primary key (PlayerID,AtDate), so it can skip the remaining rows for
> that PlayerID, as if LIMIT 1 is implied?
>
> Best regards, KC.

Hi KC, have you tried:

select * from player where playerid = '22220' and atdate < '9999999999'
order by playerid desc, atdate desc limit 1;

??

Merlin
Dear Merlin and all,

That direct SQL returns in 0 ms. The problem only appears when a view is
used.

What we've done to work around this problem is to add a DataStatus field
to the table, set to 1 for the latest record for each player and reset to
0 when it is superseded.

A partial index is then created as:

CREATE INDEX IDX_CurPlayer ON Player (PlayerID) WHERE DataStatus = 1;

The VCurPlayer view is changed to:

CREATE OR REPLACE VIEW VCurPlayer AS
SELECT * FROM Player WHERE DataStatus = 1;

and it now returns in 0 ms.

This is not the best solution, but until (if ever) the original problem
is fixed, we have not found an alternative workaround.

The good news is that even with the additional overhead of maintaining an
extra index and the problem of vacuuming, pg 8.0.3 still performs
significantly faster on Windows than MS SQL 2000 in our OLTP application
testing so far.

Thanks to all for your help.

Best regards, KC.

At 20:14 05/10/12, you wrote:
>KC wrote:
> > So I guess it all comes back to the basic question:
> >
> > For the query select distinct on (PlayerID) * from Player a where
> > PlayerID='22220' order by PlayerId Desc, AtDate Desc;
> > can the optimizer recognise the fact the query is selecting by the
> > primary key (PlayerID,AtDate), so it can skip the remaining rows for
> > that PlayerID, as if LIMIT 1 is implied?
> >
> > Best regards, KC.
>
>Hi KC, have you tried:
>select * from player where playerid = '22220' and atdate < 9999999999
>order by playerid desc, atdate desc limit 1;
>
>??
>Merlin
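The DataStatus maintenance described above could be pushed into a trigger, so application code does not have to remember to demote the old current row. This is a sketch only, assuming newly inserted rows always carry the latest AtDate for their player; the function and trigger names are made up for illustration:

```sql
-- Sketch: keep DataStatus current on insert. Demotes the previous
-- current row for the player, then marks the incoming row as current.
CREATE OR REPLACE FUNCTION set_datastatus() RETURNS trigger AS $$
BEGIN
    UPDATE Player SET DataStatus = 0
     WHERE PlayerID = NEW.PlayerID AND DataStatus = 1;
    NEW.DataStatus := 1;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_player_datastatus BEFORE INSERT ON Player
FOR EACH ROW EXECUTE PROCEDURE set_datastatus();
```

Note the extra UPDATE per insert adds a dead row for vacuum to reclaim, which compounds the vacuuming concern mentioned above.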