Re: *very* slow query to summarize data for a month ... - Mailing list pgsql-performance

From Marc G. Fournier
Subject Re: *very* slow query to summarize data for a month ...
Date
Msg-id 20031110203259.Y727@ganymede.hub.org
In response to Re: *very* slow query to summarize data for a month ...  (Josh Berkus <josh@agliodbs.com>)
List pgsql-performance

On Mon, 10 Nov 2003, Josh Berkus wrote:

> Marc,
>
> I'd say your machine is very low on available RAM, particularly sort_mem.
> The steps which are taking a long time are:

Here's the server:

last pid: 42651;  load averages:  1.52,  0.96,  0.88    up 28+07:43:33  20:35:44
307 processes: 2 running, 304 sleeping, 1 zombie
CPU states: 18.0% user,  0.0% nice, 29.1% system,  0.6% interrupt, 52.3% idle
Mem: 1203M Active, 1839M Inact, 709M Wired, 206M Cache, 199M Buf, 5608K Free
Swap: 8192M Total, 1804K Used, 8190M Free

>
> >  Aggregate  (cost=32000.94..32083.07 rows=821 width=41) (actual time=32983.36..47586.17 rows=144 loops=1)
> >    ->  Group  (cost=32000.94..32062.54 rows=8213 width=41) (actual time=32957.40..42817.88 rows=462198 loops=1)
>
> and:
>
> >                ->  Merge Join  (cost=31321.45..31466.92 rows=8213 width=41) (actual time=13983.07..22642.14 rows=462198 loops=1)
> >                      Merge Cond: ("outer".company_id = "inner".company_id)
> >                      ->  Sort  (cost=24.41..25.29 rows=352 width=25) (actual time=5.52..7.40 rows=348 loops=1)
>
> There are also *large* delays between steps.    Either your I/O is saturated,
> or you haven't run a VACUUM FULL ANALYZE in a while (which would also explain
> the estimates being off).

I thought about that before I started the thread, and ran it again just in case ...
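For reference, the maintenance pass Josh is suggesting would look something like this (table names taken from the plans below; note that VACUUM FULL takes an exclusive lock on each table while it runs, so it's best done off-hours):

```sql
-- Rewrite the tables to reclaim dead space and refresh the planner's
-- statistics in one pass.  Locks each table exclusively while running.
VACUUM FULL ANALYZE traffic_logs;
VACUUM FULL ANALYZE company;
```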

I just restarted the server with sort_mem set to 10M, and it didn't help much on the Aggregate or the Merge Join ... :

                                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=39674.38..39756.70 rows=823 width=41) (actual time=33066.25..54021.50 rows=144 loops=1)
   ->  Group  (cost=39674.38..39736.12 rows=8232 width=41) (actual time=33040.25..47005.57 rows=462198 loops=1)
         ->  Sort  (cost=39674.38..39694.96 rows=8232 width=41) (actual time=33040.22..37875.97 rows=462198 loops=1)
               Sort Key: c.company_name, ts.company_id
               ->  Merge Join  (cost=38993.22..39139.02 rows=8232 width=41) (actual time=14428.17..23568.80 rows=462198 loops=1)
                     Merge Cond: ("outer".company_id = "inner".company_id)
                     ->  Sort  (cost=24.41..25.29 rows=352 width=25) (actual time=5.80..7.66 rows=348 loops=1)
                           Sort Key: c.company_id
                           ->  Seq Scan on company c  (cost=0.00..9.52 rows=352 width=25) (actual time=0.08..3.06 rows=352 loops=1)
                     ->  Sort  (cost=38968.82..38989.40 rows=8232 width=16) (actual time=14422.27..17429.34 rows=462198 loops=1)
                           Sort Key: ts.company_id
                           ->  Seq Scan on traffic_logs ts  (cost=0.00..38433.46 rows=8232 width=16) (actual time=0.15..8119.72 rows=462198 loops=1)
                                 Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 54034.44 msec
(14 rows)
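(A side note for anyone following along in the archive: a full restart isn't actually needed to try different values, since sort_mem can be set per-session. In the 7.x series it is measured in kilobytes, so something like this would do — a sketch, assuming a 7.x server:)

```sql
-- sort_mem is in KB in 7.x: 10240 = 10M, 102400 = 100M.
-- Affects only the current session, so no restart is required.
SET sort_mem = 10240;
SHOW sort_mem;   -- confirm the session value before re-running EXPLAIN ANALYZE
```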

The problem is that the result we're comparing against right now is the one that had the negative time in it :(  I just restarted the server with the default sort_mem, and here is the query with that:

                                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=39691.27..39773.61 rows=823 width=41) (actual time=35077.18..50424.74 rows=144 loops=1)
   ->  Group  (cost=39691.27..39753.03 rows=8234 width=41) (actual time=35051.29..-650049.84 rows=462198 loops=1)
         ->  Sort  (cost=39691.27..39711.86 rows=8234 width=41) (actual time=35051.26..38847.40 rows=462198 loops=1)
               Sort Key: c.company_name, ts.company_id
               ->  Merge Join  (cost=39009.92..39155.76 rows=8234 width=41) (actual time=16155.37..25439.42 rows=462198 loops=1)
                     Merge Cond: ("outer".company_id = "inner".company_id)
                     ->  Sort  (cost=24.41..25.29 rows=352 width=25) (actual time=5.85..7.71 rows=348 loops=1)
                           Sort Key: c.company_id
                           ->  Seq Scan on company c  (cost=0.00..9.52 rows=352 width=25) (actual time=0.10..3.07 rows=352 loops=1)
                     ->  Sort  (cost=38985.51..39006.10 rows=8234 width=16) (actual time=16149.46..19437.47 rows=462198 loops=1)
                           Sort Key: ts.company_id
                           ->  Seq Scan on traffic_logs ts  (cost=0.00..38450.00 rows=8234 width=16) (actual time=0.16..8869.37 rows=462198 loops=1)
                                 Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 50426.80 msec
(14 rows)


And, just on a whim, here it is set to 100M:

                                                                     QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=39691.27..39773.61 rows=823 width=41) (actual time=25888.20..38909.88 rows=144 loops=1)
   ->  Group  (cost=39691.27..39753.03 rows=8234 width=41) (actual time=25862.81..34591.76 rows=462198 loops=1)
         ->  Sort  (cost=39691.27..39711.86 rows=8234 width=41) (actual time=25862.77..723885.95 rows=462198 loops=1)
               Sort Key: c.company_name, ts.company_id
               ->  Merge Join  (cost=39009.92..39155.76 rows=8234 width=41) (actual time=12471.23..21855.08 rows=462198 loops=1)
                     Merge Cond: ("outer".company_id = "inner".company_id)
                     ->  Sort  (cost=24.41..25.29 rows=352 width=25) (actual time=5.87..7.74 rows=348 loops=1)
                           Sort Key: c.company_id
                           ->  Seq Scan on company c  (cost=0.00..9.52 rows=352 width=25) (actual time=0.11..3.14 rows=352 loops=1)
                     ->  Sort  (cost=38985.51..39006.10 rows=8234 width=16) (actual time=12465.29..14941.24 rows=462198 loops=1)
                           Sort Key: ts.company_id
                           ->  Seq Scan on traffic_logs ts  (cost=0.00..38450.00 rows=8234 width=16) (actual time=0.18..9106.16 rows=462198 loops=1)
                                 Filter: (date_trunc('month'::text, runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 39077.75 msec
(14 rows)

So, the higher the sort_mem, the more noticeable the improvement ...
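(Worth noting in passing: in every plan above the planner estimates rows=8234 where the actual is 462198, because it can't see through date_trunc() to the column statistics. One common rewrite — a sketch, not tested against this schema — is to express the month as a plain range on runtime, which the planner *can* estimate and which an ordinary index on runtime could serve:)

```sql
-- instead of:  WHERE date_trunc('month', runtime) = '2003-10-01'
-- an equivalent range predicate for October 2003:
WHERE ts.runtime >= '2003-10-01 00:00:00'
  AND ts.runtime <  '2003-11-01 00:00:00'
```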

And, @ 100M for sort_mem and using the month_trunc index:

                                                                              QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=32089.29..32171.63 rows=823 width=41) (actual time=30822.51..57202.44 rows=144 loops=1)
   ->  Group  (cost=32089.29..32151.04 rows=8234 width=41) (actual time=30784.24..743396.18 rows=462198 loops=1)
         ->  Sort  (cost=32089.29..32109.87 rows=8234 width=41) (actual time=30784.21..36212.96 rows=462198 loops=1)
               Sort Key: c.company_name, ts.company_id
               ->  Merge Join  (cost=31407.94..31553.77 rows=8234 width=41) (actual time=11384.79..24918.56 rows=462198 loops=1)
                     Merge Cond: ("outer".company_id = "inner".company_id)
                     ->  Sort  (cost=24.41..25.29 rows=352 width=25) (actual time=5.92..9.55 rows=348 loops=1)
                           Sort Key: c.company_id
                           ->  Seq Scan on company c  (cost=0.00..9.52 rows=352 width=25) (actual time=0.08..3.21 rows=352 loops=1)
                     ->  Sort  (cost=31383.53..31404.12 rows=8234 width=16) (actual time=11378.81..15211.07 rows=462198 loops=1)
                           Sort Key: ts.company_id
                           ->  Index Scan using tl_month on traffic_logs ts  (cost=0.00..30848.02 rows=8234 width=16) (actual time=0.46..7055.75 rows=462198 loops=1)
                                 Index Cond: (month_trunc(runtime) = '2003-10-01 00:00:00'::timestamp without time zone)
 Total runtime: 57401.72 msec
(14 rows)
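(For readers of the archive: tl_month is a functional index whose definition isn't in this message. A hypothetical reconstruction, based on what the plan shows — the IMMUTABLE wrapper function is what makes the expression indexable in 7.x — would be roughly:)

```sql
-- Hypothetical reconstruction; the actual definitions live elsewhere
-- in the thread.  The wrapper must be IMMUTABLE to be usable in an index.
CREATE FUNCTION month_trunc(timestamp without time zone)
RETURNS timestamp without time zone AS
  'SELECT date_trunc(''month'', $1)'
LANGUAGE sql IMMUTABLE;

CREATE INDEX tl_month ON traffic_logs (month_trunc(runtime));
```

Note that the index scan doesn't actually win here: it still has to return all 462198 matching rows, so the sort above it dominates either way.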

