Thread: Very slow query in PostgreSQL 9.3.3
<span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>PostgreSQL 9.3.3 RHEL 6.4<br /><br />Total db Servermemory 64GB<br /><br /><br /># -----------------------------<br /># PostgreSQL configuration file<br /># -----------------------------<br/>max_connections = 100<br />shared_buffers = 16GB<br />work_mem = 32MB <br />maintenance_work_mem = 1GB<br />seq_page_cost = 1.0 <br />random_page_cost= 2.0 <br />cpu_tuple_cost = 0.03 <br />#cpu_index_tuple_cost = 0.005 <br />#cpu_operator_cost = 0.0025 <br />effective_cache_size = 48MB<br />default_statistics_target= 100 <br />constraint_exclusion = partition <br /><br />Partition table Setup<br />---------------------<br/><br />CREATE TABLE measurement (<br /> id bigint not null,<br /> city_id bigint not null,<br /> logdate date not null,<br /> peaktemp bigint,<br /> unitsales bigint,<br /> type bigint,<br /> uuid uuid,<br /> geom geometry<br/>);<br /><br /><br />CREATE TABLE measurement_y2006m02 (<br /> CHECK ( logdate >= DATE '2006-02-01' ANDlogdate < DATE '2006-03-01' )<br />) INHERITS (measurement);<br />CREATE TABLE measurement_y2006m03 (<br /> CHECK( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )<br />) INHERITS (measurement);<br />...<br />CREATETABLE measurement_y2007m11 (<br /> CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01')<br />) INHERITS (measurement);<br />CREATE TABLE measurement_y2007m12 (<br /> CHECK ( logdate >= DATE'2007-12-01' AND logdate < DATE '2008-01-01' )<br />) INHERITS (measurement);<br />CREATE TABLE measurement_y2008m01(<br /> CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )<br />) INHERITS(measurement);<br /><br />Partition measurement_y2007m12 contains 38,261,732 rows<br /><br />Indexes on partitionmeasurement_y2007m12:<br /> "pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace "measurement_y2007"<br/> "idx_measurement_uuid_y2003m12" btree (uuid), tablespace "measurement_y2007"<br /> "idx_measurement_type_y2003m12"btree (type), tablespace "measurement_y2007"<br /> "idx_measurement_city_y2003m12" btree(city_id), tablespace "measurement_y2007"<br /> "idx_measurement_logdate_y2003m12" btree (logdate), tablespace "measurement_y2007"<br/> "sidx_measurement_geom_y2003m12" gist (geom), tablespace "measurement_y2007"<br /><br /><b>***Problem Query *** </b><br /><br />explain (analyze on, buffers on) Select * from measurement this_ <br /> where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp<br/> and this_.city_id=25183 order by this_.logdate asc, this_.peaktempasc, this_.unitsales asc limit 10000;<br /><br /> QUERY PLAN <br />-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br /> Limit (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.803..51714.266 rows=10000 loops=1)<br /> Buffers:shared hit=25614 read=39417<br /> -> Sort (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.799..51712.924rows=10000 loops=1)<br /> Sort Key: this_.logdate, this_.unitsales<br /> Sort Method:top-N heapsort Memory: 15938kB<br /> Buffers: shared hit=25614 read=39417<br /> -> Append (cost=0.00..33736.09rows=2068 width=618) (actual time=50.210..50793.589 rows=312046 loops=1)<br /> Buffers:shared hit=25608 read=39417<br /> -> Seq Scan on measurement this_ (cost=0.00..0.00 rows=1 width=840)(actual time=0.002..0.002 rows=0 loops=1)<br /> Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestampwithout time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone) AND (city_id= 25183))<br /> -> Index Scan using idx_measurement_city_y2007m12 on measurement_y2007m12 this__1 (cost=0.56..33736.09 rows=2067 width=618) (actual time=50.206..50731.637 rows=312046 loops=1)<br /> Index Cond: (city_id = 25183)<br /> Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestampwithout time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone))<br /> Buffers: shared hit=25608 read=39417<br /><br /> Total runtime: <b>51717.639 ms</b> <--- *** unacceptable***<br /><br />(15 rows) <br /><br />Total Rows meeting query criteria<br />---------------------------------<br/><br />Select count(*) from measurement this_ where this_.logdate between '2007-12-1923:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp and this_.city_id=25183;<br /><br />count<br />------<br/>312046</div><div><br />Total Rows in the partition table referenced<br />------------------------------------------<br/><br />Select count(*) from measurement_y2007m12;<br /><br /> count<br />---------<br/>38261732</div><div><br /></div><div><b>Does anyone know how to speed up this query? I removed the order byclause and that significantly reduced the run time to approx. 2000-3000 ms. This query is being recorded repeatedly <br/>in our logs and executes very slowly for our UI users from 12000 ms thru 68000 ms<br /><br />Any suggestions would beappreciated.</b><br /><br />thanks<br mce_bogus="1" /></div></span>
2014-03-13 20:26 GMT+01:00 <fburgess@radiantblue.com>:
PostgreSQL 9.3.3 RHEL 6.4
Total db Server memory 64GB
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
max_connections = 100
shared_buffers = 16GB
work_mem = 32MB
maintenance_work_mem = 1GB
seq_page_cost = 1.0
random_page_cost = 2.0
cpu_tuple_cost = 0.03
#cpu_index_tuple_cost = 0.005
#cpu_operator_cost = 0.0025
effective_cache_size = 48MB
default_statistics_target = 100
constraint_exclusion = partition
Partition table Setup
---------------------
CREATE TABLE measurement (
id bigint not null,
city_id bigint not null,
logdate date not null,
peaktemp bigint,
unitsales bigint,
type bigint,
uuid uuid,
geom geometry
);
CREATE TABLE measurement_y2006m02 (
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2006m03 (
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
...
CREATE TABLE measurement_y2007m11 (
CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2007m12 (
CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2008m01 (
CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);
Partition measurement_y2007m12 contains 38,261,732 rows
Indexes on partition measurement_y2007m12:
"pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace "measurement_y2007"
"idx_measurement_uuid_y2003m12" btree (uuid), tablespace "measurement_y2007"
"idx_measurement_type_y2003m12" btree (type), tablespace "measurement_y2007"
"idx_measurement_city_y2003m12" btree (city_id), tablespace "measurement_y2007"
"idx_measurement_logdate_y2003m12" btree (logdate), tablespace "measurement_y2007"
"sidx_measurement_geom_y2003m12" gist (geom), tablespace "measurement_y2007"
*** Problem Query ***
explain (analyze on, buffers on) Select * from measurement this_
where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
and this_.city_id=25183 order by this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.803..51714.266 rows=10000 loops=1)
Buffers: shared hit=25614 read=39417
-> Sort (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.799..51712.924 rows=10000 loops=1)
Sort Key: this_.logdate, this_.unitsales
Sort Method: top-N heapsort Memory: 15938kB
Buffers: shared hit=25614 read=39417
-> Append (cost=0.00..33736.09 rows=2068 width=618) (actual time=50.210..50793.589 rows=312046 loops=1)
Buffers: shared hit=25608 read=39417
-> Seq Scan on measurement this_ (cost=0.00..0.00 rows=1 width=840) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone) AND (city_id = 25183))
-> Index Scan using idx_measurement_city_y2007m12 on measurement_y2007m12 this__1 (cost=0.56..33736.09 rows=2067 width=618) (actual time=50.206..50731.637 rows=312046 loops=1)
Index Cond: (city_id = 25183)
Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone))
Buffers: shared hit=25608 read=39417
Total runtime: 51717.639 ms <--- *** unacceptable ***
(15 rows)
Total Rows meeting query criteria
---------------------------------
Select count(*) from measurement this_ where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp and this_.city_id=25183;
count
------
312046
Total Rows in the partition table referenced
------------------------------------------
Select count(*) from measurement_y2007m12;
count
---------
38261732Does anyone know how to speed up this query? I removed the order by clause and that significantly reduced the run time to approx. 2000-3000 ms. This query is being recorded repeatedly
in our logs and executes very slowly for our UI users from 12000 ms thru 68000 ms
Any suggestions would be appreciated.
sort (ORDER BY clause) enforce a reading of complete partitions. And it is slow - it is strange so reading 300K rows needs a 5K sec. Probably your IO is overloaded.
Regards
Pavel Stehule
Pavel Stehule
thanks
On Thu, Mar 13, 2014 at 12:26 PM, <fburgess@radiantblue.com> wrote: > *** Problem Query *** > > explain (analyze on, buffers on) Select * from measurement this_ > where this_.logdate between '2007-12-19 > 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp > and this_.city_id=25183 order by > this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000; > [...] > Total runtime: 51717.639 ms <--- *** unacceptable *** Try to create a multi-column index on the partition by (city_id, logdate). Then run the original query and the query without peaktemp and nitsales on the order by. Compare the results, and if the first one will not be satisfying try to add these two columns to the end of the column list of your multi-column index on the order as they appear in your query. It should do the trick. If it wont, please, show the plans. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
2014-03-14 4:26 GMT+09:00 <fburgess@radiantblue.com>: > PostgreSQL 9.3.3 RHEL 6.4 > > Total db Server memory 64GB (...) > effective_cache_size = 48MB I'm not sure if this will help directly, but is the value for 'effective_cache_size' intentional? 48 *GB* would be a more likely setting. Regards Ian Barwick