Re: tweaking costs to favor nestloop - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: tweaking costs to favor nestloop
Date
Msg-id 20030616001740.R66185@flake.decibel.org
Whole thread Raw
In response to Re: tweaking costs to favor nestloop  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Tom-
FWIW, these are the same kind of numbers I'm seeing for the project I'm
working on.. ie: nested loop estimates at 0.00-3.01 but reality is much
closer to 0.2. I agrees that it probably makes sense to take the
correlation of both tables into account for nested-loop joins.

On Wed, Jun 11, 2003 at 04:17:53PM -0400, Tom Lane wrote:
> Vincent van Leeuwen <pgsql.spam@vinz.nl> writes:
> > I'm unable to tweak the various _cost settings in such a way that attached
> > query will use the right plan.
>
> You aren't going to be able to.  You've already overshot a reasonable
> random_page_cost setting --- to judge by the relative actual costs of
> the merge and hash join, a value somewhere around 3 is appropriate for
> your setup.  (Assuming I did the math right --- if you set it to 3,
> do you get a ratio of merge and hash estimated costs that agrees with
> the ratio of actual runtimes?)
>
> The problem here is that the costing of the repeated inner index scans
> isn't realistic: 35417 probes into "auth" are clearly taking much less
> than 35417 times what a single probe could be expected to take.  We
> talked about how repeated scans would win from caching of the upper
> btree levels, but I think there's more to it than that.  It occurs to me
> that the probes you are making are probably not random and uncorrelated.
> They are driven by the values of reportuser.idreporter ... is it fair
> to guess that most of the reportuser rows link to just a small fraction
> of the total auth population?  If so, the caching could be eliminating
> most of the reads, not just the upper btree levels, because we're
> mostly hitting only small parts of the index and auth tables.
>
> I'm beginning to think that the only reasonable way to model this is to
> cost the entire nestloop join as a unit, so that we have access to
> statistics about the outer table as well as the indexed table.  That
> would give us a shot at estimating how much of the index is likely to
> get touched.
>
> As of 7.3 I think all you can do is force nestloop by disabling the
> other two join types.
>
>             regards, tom lane
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"

pgsql-performance by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: left join performance problem
Next
From: Tom Lane
Date:
Subject: Re: left join performance problem