Re: Large # of rows in query extremely slow, not using - Mailing list pgsql-performance

From Manfred Koizar
Subject Re: Large # of rows in query extremely slow, not using
Msg-id v70tk05ad5ngfrv1hr774d0qap5k0lr0j4@email.aon.at
In response to Re: Large # of rows in query extremely slow, not using  (Stephen Crowley <stephen.crowley@gmail.com>)
List pgsql-performance
On Fri, 17 Sep 2004 19:23:44 -0500, Stephen Crowley
<stephen.crowley@gmail.com> wrote:
>Seq Scan [...] rows=265632
>  Filter: ((date = '2004-09-07'::date) AND ((stock)::text = 'MSFT'::text))
>Total runtime: 412703.000 ms
>
>random_page_cost and effective_cache_size are both default, 8 and 1000

Usually random_page_cost is 4.0 by default, not 8.  And your
effective_cache_size setting of 1000 (only 8 MB worth of 8 kB pages)
is far too low for a modern machine.
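
If you want to experiment, both settings can be overridden per
session.  A minimal sketch, assuming an 8 kB page size and a
hypothetical table name "daily_prices" (the real table name isn't
shown in the thread); the values are illustrative, not tuned
recommendations:

    -- effective_cache_size is measured in 8 kB pages here,
    -- so 50000 pages is roughly 400 MB of assumed cache
    SET effective_cache_size = 50000;
    -- a lower random_page_cost makes index scans look cheaper
    SET random_page_cost = 2;
    EXPLAIN ANALYZE
        SELECT * FROM daily_prices
        WHERE date = '2004-09-07' AND stock = 'MSFT';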

>"Index Scan [...] rows=159618
>"  Index Cond: ((date = '2004-09-07'::date) AND ((stock)::text = 'ORCL'::text))"
>"Total runtime: 201009.000 ms"

Extrapolating this to 265000 rows, you should be able to get the MSFT
result in ca. 330 seconds, if you can persuade the planner to choose an
index scan.  Fiddling with random_page_cost and effective_cache_size
might do the trick.
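
The arithmetic behind that estimate, assuming runtime scales roughly
linearly with the number of matching rows:

    201 s * (265632 / 159618) ≈ 334 s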

>So now this is all in proportion and works as expected.. the question
>is, why would the fact that it needs to be vacuumed cause such a huge
>hit in performance? When I vacuumed it did free up nearly 25% of the
>space.

So before the VACUUM a seq scan would have taken ca. 550 seconds.  Your
MSFT query with LIMIT 10 took ca. 350 seconds.  It's not implausible to
assume that more than half of the table had to be scanned to find the
first ten rows matching the filter condition.
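
(Back-of-the-envelope for the 550 seconds, assuming seq scan time
scales with table size: VACUUM reclaimed 25% of the space, so the
pre-VACUUM table was about 1/0.75 times as large as the one scanned
in 413 s, giving 413 s / 0.75 ≈ 550 s.)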

Servus
 Manfred
