Re: Nested loop question - Mailing list pgsql-performance

From Richard Huxton
Subject Re: Nested loop question
Date
Msg-id 200312170952.15759.dev@archonet.com
Whole thread Raw
In response to Nested loop question  ("Nick Fankhauser - Doxpop" <nickf@doxpop.com>)
Responses Re: Nested loop question  ("Nick Fankhauser" <nickf@ontko.com>)
List pgsql-performance
On Tuesday 16 December 2003 17:06, Nick Fankhauser - Doxpop wrote:
> Hi-
>
> I'm trying to optimize a query that I *think* should run very fast.
> Essentially, I'm joining two tables that have very selective indexes and
> constraining the query on an indexed field. (There's a third small lookup
> table in the mix, but it doesn't really affect the bottom line.)

> I'm unsure what is happening next. I notice that an index scan is occurring
> on actor_summary_pk, with an "actual time" of 9.15, but then it looks like
> a nested loop occurs at the next level to join these tables. Does this mean
> that each probe of the actor_summary index will take 9.15 msec, but the
> nested loop is going to do this once for each actor_id?

That's right - you need to multiply the actual time by the number of loops. In
your case this would seem to be about 33 seconds.

>                                  ->  Index Scan using actor_summary_pk on
> actor_summary  (cost=0.00..8.11 rows=1 width=72) (actual time=9.14..9.15
> rows=1 loops=3639)
>                                        Index Cond: ("outer".actor_id =
> actor_summary.actor_id)

> The nested loop appears to be where most of my time is going, so I'm
> focusing on this area, but don't know if there is a better approach to this
> join.
>
> Is there a more efficient means than a nested loop to handle such a join?
> Would a different method be chosen if there was exactly one row in
> actor_summary for every row in actor?

Hmm - tricky to say in your case. PG has decided to filter on actor then look
up the corresponding values in actor_summary. Given that you have 3 million
rows in both tables that seems a reasonable approach. You could always try
forcing different plans by switching the various ENABLE_HASHJOIN etc options
(see the runtime configuration section of the manuals). I'm not sure that
will help you here though.

The fact that it's taking you 9ms to do each index lookup suggests to me that
it's going to disk each time. Does that sound plausible, or do you think you
have enough RAM to cache your large indexes?

--
  Richard Huxton
  Archonet Ltd

pgsql-performance by date:

Previous
From: Dennis Bjorklund
Date:
Subject: Re: Why is restored database faster?
Next
From: "Nick Fankhauser"
Date:
Subject: Re: Nested loop performance