Re: select max(field) from table much faster with a group by clause? - Mailing list pgsql-performance

From Palle Girgensohn
Subject Re: select max(field) from table much faster with a group by clause?
Date
Msg-id 00DC41259D75AF3300AAF9D2@rambutan.pingpong.net
Whole thread Raw
In response to Re: select max(field) from table much faster with a group by clause?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance

--On torsdag, november 01, 2007 09.43.39 -0400 Tom Lane <tgl@sss.pgh.pa.us>
wrote:

> Palle Girgensohn <girgen@pingpong.net> writes:
>> When running
>>  select max("when") from login where userid='userid'
>> it takes a second or two, but when adding "group by userid" the planner
>> decides on using another plan, and it gets *much* faster. See example
>> below.
>
> It's only faster for cases where there are few or no rows for the
> particular userid ...

Well, no, not really. See below. OTH, it sometimes a bit slower. Seems to
depend on how far away from the estimated number of rows you get? Weird?

>> Number of tuples per user varies from zero to a couple of thousands.
>
> The planner is using an intermediate estimate of 406 rows.  You might be
> well advised to increase the statistics target for login.userid --- with
> luck that would help it to choose the right plan type for both common
> and uncommon userids.

I'll try that, thanks!

--

pp=#  SELECT max("when") FROM login WHERE userid='kudo' group by userid;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 6,194 ms
pp=#  SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 992,391 ms
pp=#  SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 779,582 ms
pp=#  SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 818,667 ms
pp=#  SELECT max("when") FROM login WHERE userid='kudo' ;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 640,242 ms
pp=#  SELECT max("when") FROM login WHERE userid='kudo' group by userid;
              max
-------------------------------
 2007-01-04 15:31:46.863325+01
(1 row)

Time: 18,384 ms
pp=#  SELECT count(*) FROM login WHERE userid='kudo' group by userid;
 count
-------
  1998
(1 row)

Time: 12,762 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo'
group by userid;
                                                               QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual
time=24.700..24.703 rows=1 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40
rows=406 width=25) (actual time=0.140..16.931 rows=1998 loops=1)
         Index Cond: (userid = 'kudo'::text)
 Total runtime: 24.779 ms
(4 rows)

Time: 25,633 ms
pp=# explain analyze SELECT max("when") FROM login WHERE userid='kudo' ;

QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.93..323.94 rows=1 width=0) (actual
time=1400.994..1400.997 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.93 rows=1 width=8) (actual
time=1400.975..1400.979 rows=1 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"
(cost=0.00..131515.87 rows=406 width=8) (actual time=1400.968..1400.968
rows=1 loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid = 'kudo'::text))
 Total runtime: 1401.057 ms
(6 rows)

Time: 1401,881 ms


pp=#  SELECT userid, count("when") FROM login WHERE userid in ('girgen' ,
'kudo') group by userid;
 userid | count
--------+-------
 kudo   |  1998
 girgen |  1120
(2 rows)


pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen'
group by userid;
                                                               QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=0.00..648.44 rows=1 width=25) (actual
time=25.137..25.141 rows=1 loops=1)
   ->  Index Scan using login_userid_idx on "login"  (cost=0.00..646.40
rows=406 width=25) (actual time=0.121..20.712 rows=1120 loops=1)
         Index Cond: (userid = 'girgen'::text)
 Total runtime: 25.209 ms
(4 rows)

Time: 25,986 ms

pp=# explain analyze SELECT max("when") FROM login WHERE userid='girgen' ;
                                                                      QUERY
PLAN

------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=323.93..323.94 rows=1 width=0) (actual time=6.695..6.698
rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..323.93 rows=1 width=8) (actual
time=6.669..6.675 rows=1 loops=1)
           ->  Index Scan Backward using login_when_idx on "login"
(cost=0.00..131515.87 rows=406 width=8) (actual time=6.660..6.660 rows=1
loops=1)
                 Filter: (("when" IS NOT NULL) AND (userid =
'girgen'::text))
 Total runtime: 6.785 ms
(6 rows)

Time: 7,776 ms


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: select max(field) from table much faster with a group by clause?
Next
From: Sami Dalouche
Date:
Subject: Re: [Fwd: Re: Outer joins and Seq scans]