Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr - Mailing list pgsql-general

From Shaun Thomas
Subject Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr
Date
Msg-id 53E13DEB.5070009@optionshouse.com
Whole thread Raw
In response to Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr  (john gale <john@smadness.com>)
Responses Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr  (john gale <john@smadness.com>)
List pgsql-general
On 08/05/2014 03:06 PM, john gale wrote:

> Even on a 114G table with a 16G index, you would consider this slow?
>   (physical disk space is closer to 800G, that was our high-water before
> removing lots of rows and vacuuming, although it is running on SSD)

Yes, actually, Especially now that you've told me you're on an SSD.
While that's way over my comfort level since it complicates maintenance,
table size shouldn't significantly impact index seek performance. What
really matters is the number of rows matched, since that directly
corresponds to how much work the database needs to do.

>   Hash Join  (cost=292303.24..348662.93 rows=28273 width=1466) (actual
> time=23875.727..824373.654 rows=335032 loops=1)

Well, the estimated versus actual are off by an order of magnitude.
Instead of 28k rows, it actually got 335k. However, in this case you'd
want a hash or merge join, so that isn't your primary issue. I'd be more
worried if we saw these numbers in a nested loop, but that isn't the
case here.

Still, what is your default_statistics_target set to?

>                 ->  Bitmap Index Scan on
> index_testruns_on_custom_spawnid  (cost=0.00..41437.84 rows=500170
> width=0) (actual time=4872.404..4872.404 rows=2438520 loops=1)

Ouch, ouch, and more ouch. Your index_testruns_on_custom_spawnid index
on matched 2.5 million rows...

>                 ->  Bitmap Index Scan on
> index_testruns_on_started_at_2  (cost=0.00..250112.08 rows=3188736
> width=0) (actual time=18679.875..18679.875 rows=5822899 loops=1)

And the index_testruns_on_started_at_2 index matched 5.8 million. Even
the estimate guessed 3.1 million here, so at least it's the right scale.
Still, this is a *ton* of data to verify with random seeks, even on an SSD.

> Time: 824473.429 ms

That's a time I would expect from wading through so much data. I'm
suspicious of your queries and index selection. Are you regularly
pulling and processing 300k records from a 100GB+ table? Since the
results are about an order of magnitude less than the index matches, you
need to really look at reducing your amount of matches before your run
time will decrease by any significant amount.

I'm not entirely familiar with the most efficient manner to index hstore
data, but someone here can pipe in. The stats and run times all look
legitimate to me. The question is, can you reduce the number of matches
through better index correlation, or more specific where clauses? The
amount of data you're matching will necessarily require a lot of
processing time.

--
Shaun Thomas
OptionsHouse, LLC | 141 W. Jackson Blvd. | Suite 800 | Chicago IL, 60604
312-676-8870
sthomas@optionshouse.com

______________________________________________

See http://www.peak6.com/email_disclaimer/ for terms and conditions related to this email


pgsql-general by date:

Previous
From: john gale
Date:
Subject: Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr
Next
From: john gale
Date:
Subject: Re: understanding why two nearly identical queries take two different planner routes, one 5s and one 2hr