Re: database performance and query performance question - Mailing list pgsql-performance

From Shea,Dan [CIS]
Subject Re: database performance and query performance question
Date
Msg-id F2D63B916C88C14D9B59F93C2A5DD33F0B9118@cisxa.cis.ec.gc.ca
Whole thread Raw
In response to database performance and query performance question  ("Shea,Dan [CIS]" <Dan.Shea@ec.gc.ca>)
List pgsql-performance
-----Original Message-----
From: Josh Berkus [mailto:josh@agliodbs.com]
Sent: Thursday, January 22, 2004 3:01 PM
To: Shea,Dan [CIS]; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] database performance and query performance
question


Dan,

> Should there be less columns in the index?
> How can we improve database performance?
> How should I improve my query?

>>Your query plan isn't the problem.  It's a good plan, and a reasonably
>>efficient query.   Under other circumstances, the SELECT DISTINCT with the

>>to_char could be a performance-killer, but it's not in that result set.

>>Overall, you're taking 9 seconds to scan 93 million records.  Is this the
time
>>the first time you run the query, or the 2nd and successive times?

This is actually the second time.  The first query took more time.
Concerning the number of columns for an index, I switched the index to have
only one column and tried the same query.  It is below.


>>When did you last run VACUUM ANALYZE on the table?   Have you tried
increasing
>>the ANALYZE statistics on the index columns to, say, 500?
 It is run nightly. But last night's did not complete.  It was taking quite
some time and I cancelled it, over 4 hours. I will try increasing the
ANALYZE statistics to 500.

>>Your disks are RAID 5.  How many drives?  In RAID5, more drives improves
the
>>speed of large scans.
 There are 4 drives in this raid 5.  We are using lvm with ext3 filesystem.
Will be moving the database to a SAN within the next month.

And what's your sort_mem setting?   You didn't mention it.
>>The sort_mem is the default
PWFPM_DEV=# show sort_mem;
 sort_mem
----------
 1024

Why is your effective cache size only 300mb when you have 3 GB of RAM?  It's

not affecting this query, but it could affect others.
>> Oh, I thought I had it set for 2.5 GB of RAM. 312500 * 8k = 2.5 GB


QUERY WITH 1 column in index.

 Unique  (cost=717633.28..717633.29 rows=1 width=83) (actual
time=62922.399..62923.334 rows=115 loops=1)
   ->  Sort  (cost=717633.28..717633.29 rows=1 width=83) (actual
time=62922.395..62922.615 rows=194 loops=1)
         Sort Key: to_char(valid_time, 'YYYYMMDDHH24MISS'::text), issue_time
         ->  Subquery Scan foo  (cost=717633.26..717633.27 rows=1 width=83)
(actual time=62918.232..62919.989 rows=194 loops=1)
               ->  Sort  (cost=717633.26..717633.26 rows=1 width=30) (actual
time=62902.378..62902.601 rows=194 loops=1)
                     Sort Key: issue_time, reception_time, valid_time
                     ->  Index Scan using forecastelement_v_idx on
forecastelement  (cost=0.00..717633.25 rows=1 width=30) (actual
time=1454.974..62900.752 rows=194 loops=1)
                           Index Cond: ((valid_time >= '2002-09-02
04:00:00'::timestamp without time zone) AND (valid_time <= '2002-09-07
03:59:59'::timestamp without time zone))
                           Filter: (((region_id)::text =
'PU-REG-WTO-00200'::text) AND ((wx_element)::text = 'TEMP_VALEUR1'::text)
AND (issue_time >= '2002-09-02 05:00:00'::timestamp without time zone) AND
(issue_time <= '2002-09-06 05:00:00'::timestamp without time zone) AND
((origin)::text = 'REGIONAL'::text) AND ("time"(issue_time) =
'05:00:00'::time without time zone) AND ((date(valid_time) =
((date(issue_time) + 1) - 1)) OR (date(valid_time) = (date(issue_time) + 1))
OR ((valid_time >= '2002-09-07 00:00:00'::timestamp without time zone) AND
(valid_time <= '2002-09-07 03:59:59'::timestamp without time zone) AND
(issue_time = '2002-09-06 05:00:00'::timestamp without time zone))))
 Total runtime: 62923.723 ms
(10 rows)

PWFPM_DEV=# expalin analyze 312500
PWFPM_DEV=# explain analyze select  DISTINCT ON (valid_time)
to_char(valid_time,'YYYYMMDDHH24MISS') as valid_time,value from
PWFPM_DEV-#                    (select valid_time,value,"time"(valid_time)
as hour,reception_time,
PWFPM_DEV(#                    issue_time from forecastelement where
PWFPM_DEV(#                    valid_time between '2002-09-02 04:00:00' and
PWFPM_DEV(#                    '2002-09-07 03:59:59' and region_id =
'PU-REG-WTO-00200'
PWFPM_DEV(#                    and wx_element = 'TEMP_VALEUR1' and
issue_time between
PWFPM_DEV(#                    '2002-09-02 05:00:00' and '2002-09-06
05:00:00'
PWFPM_DEV(#                    and origin = 'REGIONAL'    and
"time"(issue_time) = '05:00:00'
PWFPM_DEV(#                    order by issue_time,reception_time
DESC,valid_time) as foo where
PWFPM_DEV-#                    (date(valid_time) = date(issue_time)+1 -1  or
date(valid_time) = date(issue_time)+1   or
PWFPM_DEV(#                    (valid_time between '2002-09-07 00:00:00' and
'2002-09-07 03:59:59'
PWFPM_DEV(#                    and issue_time = '2002-09-06 05:00:00'))
order by valid_time ,issue_time DESC;

QUERY PLAN

----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
----------------------------------------------------------------------------
--------
 Unique  (cost=717633.28..717633.29 rows=1 width=83) (actual
time=21468.227..21469.164 rows=115 loops=1)
   ->  Sort  (cost=717633.28..717633.29 rows=1 width=83) (actual
time=21468.223..21468.452 rows=194 loops=1)
         Sort Key: to_char(valid_time, 'YYYYMMDDHH24MISS'::text), issue_time
         ->  Subquery Scan foo  (cost=717633.26..717633.27 rows=1 width=83)
(actual time=21465.274..21467.006 rows=194 loops=1)
               ->  Sort  (cost=717633.26..717633.26 rows=1 width=30) (actual
time=21465.228..21465.452 rows=194 loops=1)
                     Sort Key: issue_time, reception_time, valid_time
                     ->  Index Scan using forecastelement_v_idx on
forecastelement  (cost=0.00..717633.25 rows=1 width=30) (actual
time=1479.649..21463.779 rows=194 loops=1)
                           Index Cond: ((valid_time >= '2002-09-02
04:00:00'::timestamp without time zone) AND (valid_time <= '2002-09-07
03:59:59'::timestamp without time zone))
                           Filter: (((region_id)::text =
'PU-REG-WTO-00200'::text) AND ((wx_element)::text = 'TEMP_VALEUR1'::text)
AND (issue_time >= '2002-09-02 05:00:00'::timestamp without time zone) AND
(issue_time <= '2002-09-06 05:00:00'::timestamp without time zone) AND
((origin)::text = 'REGIONAL'::text) AND ("time"(issue_time) =
'05:00:00'::time without time zone) AND ((date(valid_time) =
((date(issue_time) + 1) - 1)) OR (date(valid_time) = (date(issue_time) + 1))
OR ((valid_time >= '2002-09-07 00:00:00'::timestamp without time zone) AND
(valid_time <= '2002-09-07 03:59:59'::timestamp without time zone) AND
(issue_time = '2002-09-06 05:00:00'::timestamp without time zone))))
 Total runtime: 21469.485 ms
(10 rows)

PWFPM_DEV=#
--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: database performance and query performance question
Next
From: "Shea,Dan [CIS]"
Date:
Subject: Re: database performance and query performance question