Re: SELECT LIMIT 1 VIEW Performance Issue - Mailing list pgsql-performance

From K C Lau
Subject Re: SELECT LIMIT 1 VIEW Performance Issue
Date
Msg-id 6.2.1.2.0.20050923155952.05889a50@localhost
Whole thread Raw
In response to Re: SELECT LIMIT 1 VIEW Performance Issue  ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
Responses Re: SELECT LIMIT 1 VIEW Performance Issue
Re: SELECT LIMIT 1 VIEW Performance Issue
List pgsql-performance
Thank you all for your suggestions. I' tried, with some variations too, but
still no success. The times given are the best of a few repeated tries on
an 8.1 beta 2 db freshly migrated from 8.0.3 on Windows.

For reference, only the following gets the record quickly:

esdt=> explain analyze select PlayerID,AtDate from Player a
  where PlayerID='22220' and AtDate = (select b.AtDate from Player b
  where b.PlayerID = '22220' order by b.PlayerID desc, b.AtDate desc LIMIT 1);

  Index Scan using pk_player on player a  (cost=0.75..4.26 rows=1 width=23)
(actual time=0.054..0.057 rows=1 loops=1)
    Index Cond: (((playerid)::text = '22220'::text) AND ((atdate)::text =
($0)::text))
    InitPlan
      ->  Limit  (cost=0.00..0.75 rows=1 width=23) (actual
time=0.027..0.028 rows=1 loops=1)
            ->  Index Scan Backward using pk_player on player
b  (cost=0.00..1323.05 rows=1756 width=23) (actual time=0.023..0.023 rows=1
loops=1)
                  Index Cond: ((playerid)::text = '22220'::text)
  Total runtime: 0.132 ms

At 02:19 05/09/23, Kevin Grittner wrote:
>Have you tried the "best choice" pattern -- where you select the set of
>candidate rows and then exclude those for which a better choice
>exists within the set?  I often get better results with this pattern than
>with the alternatives.

esdt=> explain analyze select PlayerID,AtDate from Player a where
PlayerID='22220'
and not exists (select * from Player b where b.PlayerID = a.PlayerID and
b.AtDate > a.AtDate);

  Index Scan using pk_player on player a  (cost=0.00..3032.46 rows=878
width=23)
(actual time=35.820..35.823 rows=1 loops=1)
    Index Cond: ((playerid)::text = '22220'::text)
    Filter: (NOT (subplan))
    SubPlan
      ->  Index Scan using pk_player on player b  (cost=0.00..378.68
rows=389 width=776) (actual time=0.013..0.013 rows=1 loops=1743)
            Index Cond: (((playerid)::text = ($0)::text) AND
((atdate)::text > ($1)::text))
  Total runtime: 35.950 ms

Note that it is faster than the LIMIT 1:

esdt=> explain analyze select PlayerID,AtDate from Player a where
PlayerID='22220' and AtDate = (select b.AtDate from Pl
ayer b where b.PlayerID = a.PlayerID order by b.PlayerID desc, b.AtDate
desc LIMIT 1);

  Index Scan using pk_player on player a  (cost=0.00..2789.07 rows=9
width=23) (actual time=41.366..41.371 rows=1 loops=1)
    Index Cond: ((playerid)::text = '22220'::text)
    Filter: ((atdate)::text = ((subplan))::text)
    SubPlan
      ->  Limit  (cost=0.00..0.83 rows=1 width=23) (actual
time=0.013..0.014 rows=1 loops=1743)
            ->  Index Scan Backward using pk_player on player
b  (cost=0.00..970.53 rows=1166 width=23) (actual time=0.008..0.008 rows=1
loops=1743)
                  Index Cond: ((playerid)::text = ($0)::text)
  Total runtime: 41.490 ms

At 02:07 05/09/23, Merlin Moncure wrote:
> > >Here is a trick I use sometimes with views, etc.  This may or may not be
> > >effective to solve your problem but it's worth a shot.  Create one small
> > >SQL function taking date, etc. and returning the values and define it
> > >immutable.  Now in-query it is treated like a constant.

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$>  select atdate from player where playerid = $1 order by playerid
desc, AtDate desc limit 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> create or replace view VCurPlayer3 as select * from Player where
AtDate = player_max_atdate(PlayerID);
CREATE VIEW
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9
width=23) (actual time=65.434..65.439 rows=1 loops=1)
    Index Cond: ((playerid)::text = '22220'::text)
    Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
  Total runtime: 65.508 ms

While it says loops=1, the time suggests that it is going through all 1743
records for that PlayerID.

I tried to simulate the fast subquery inside the function, but it is taking
almost twice as much time:

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$>  select atdate from player a where playerid = $1 and AtDate =
(select b.AtDate from Player b
esdt$>  where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1);
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9
width=23) (actual time=119.369..119.373 rows=1 loops=1)
    Index Cond: ((playerid)::text = '22220'::text)
    Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
  Total runtime: 119.441 ms

Adding another LIMIT 1 inside the function makes it even slower:

esdt=> create or replace function player_max_atdate (varchar(32)) returns
varchar(32) as $$
esdt$>  select atdate from player where playerid = $1 and AtDate = (select
b.AtDate from Player b
esdt$>  where b.PlayerID = $1 order by b.PlayerID desc, b.AtDate desc LIMIT 1)
esdt$>  order by PlayerID desc, AtDate desc LIMIT 1;
esdt$> $$ language sql immutable;
CREATE FUNCTION
esdt=> explain analyze select PlayerID,AtDate from VCurPlayer3 where
PlayerID='22220';

  Index Scan using pk_player on player  (cost=0.00..1331.83 rows=9
width=23) (actual time=129.858..129.863 rows=1 loops=1)
    Index Cond: ((playerid)::text = '22220'::text)
    Filter: ((atdate)::text = (player_max_atdate(playerid))::text)
  Total runtime: 129.906 ms

At 00:16 05/09/23, Simon Riggs wrote:
>If the current value is used so often, use two tables - one with a
>current view only of the row maintained using UPDATE. Different
>performance issues maybe, but at least not correlated subquery ones.

Many of our tables have similar construct and it would be a huge task to
duplicate and maintain all these tables throughout the system. We would
prefer a solution with SQL or function at the view or db level, or better
still, a fix, if this problem is considered general enough.

>You're welcome in advance, ;)
>Merlin

Thank you all in advance for any further ideas.
KC.


pgsql-performance by date:

Previous
From: Oleg Bartunov
Date:
Subject: Re: tsearch2 seem very slow
Next
From: Ulrich Wisser
Date:
Subject: Re: How to determine cause of performance problem?