Thread: Problem with planner choosing nested loop

Problem with planner choosing nested loop

From

Alex Solovey

Date:

02 April 2008, 17:37:26

Hello,

I was trying to optimize a slow query in database running 8.3.1. It
turned out that planner is choosing nested loop join resulting in
multiple sequential scans over the long table. Here is a simplified
database schema, consisting of two tables:

     CREATE TABLE bar (
         bar_id integer PRIMARY KEY,
         bar_a integer,
         bar_b integer,
         bar_c integer,
         bar_d integer,
         bar_e integer,
         bar_f integer,
         bar_g integer,
        bar_h integer
     );

     CREATE TABLE foo (
         foo_a  integer,
         foo_b  integer,
         foo_c  integer,
         bar_id integer
     );

Table "bar" has 16805 records and table "foo" is fairly big, having over
6 million records. default_statistics_target is set to 1000 (in fact, I
tried many values from 100 to 1000 but it did not help), VACUUM ANALYZE
was executed before running test queries.

Running this query:
     EXPLAIN ANALYZE SELECT foo_b, SUM(foo_c)
     FROM foo JOIN bar USING (bar_id) WHERE foo_a = 1001
     AND bar_h = 1821 AND bar_c = 519 GROUP BY foo_b;

produces this plan:
                                                       QUERY PLAN

-----------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=110916.41..110916.42 rows=1 width=8) (actual
time=20547.433..20547.433 rows=1 loops=1)
    ->  Nested Loop  (cost=0.00..110916.40 rows=1 width=8) (actual
time=17952.622..20547.175 rows=59 loops=1)
          Join Filter: (foo.bar_id = bar.bar_id)
          ->  Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4)
(actual time=0.098..3.561 rows=24 loops=1)
                Filter: ((bar_h = 1821) AND (bar_c = 519))
          ->  Seq Scan on foo  (cost=0.00..110510.89 rows=995 width=12)
(actual time=0.957..855.366 rows=1369 loops=24)
                Filter: (foo.foo_a = 1001)
  Total runtime: 20547.518 ms

The problem is that 6+ million rows table "foo" is scanned 24 times:
      Seq Scan on foo  (... loops=24)

If I try to disable nested loops using set enable_nestloop=off, the plan
is just fine:
                                                       QUERY PLAN

----------------------------------------------------------------------------------------------------------------------
  HashAggregate  (cost=110907.72..110907.73 rows=1 width=8) (actual
time=889.239..889.240 rows=1 loops=1)
    ->  Hash Join  (cost=393.09..110907.72 rows=1 width=8) (actual
time=17.825..889.065 rows=59 loops=1)
          Hash Cond: (foo.bar_id = bar.bar_id)
          ->  Seq Scan on foo  (cost=0.00..110510.89 rows=995 width=12)
(actual time=2.309..883.841 rows=1369 loops=1)
                Filter: (foo_a = 1001)
          ->  Hash  (cost=393.07..393.07 rows=1 width=4) (actual
time=4.168..4.168 rows=24 loops=1)
                ->  Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4)
(actual time=0.118..4.141 rows=24 loops=1)
                      Filter: ((bar_h = 1821) AND (bar_c = 519))
  Total runtime: 889.329 ms

Unfortunately, I cannot disable nested loops because if I do, some other
queries degrade miserably, and disabling nested loops just for this
query is not an option.
I think the problem is caused by wrong estimate for the table "bar":

     Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4) (actual
time=0.098..3.561 rows=24 loops=1)

but so far, I have no idea how it could be fixed. As I've said, I tried
increasing statistics_target to the max value (1000) but it did not help.
The test database dump (6.3 Mb download) is available at
http://216.159.242.194/test_dump.sql.bz2

Alex

Re: Problem with planner choosing nested loop

From

"Rodrigo E. De León Plicet"

Date:

02 April 2008, 18:08:21

On Wed, Apr 2, 2008 at 12:36 PM, Alex Solovey <a.solovey@gmail.com> wrote:
>  ... I have no idea how it could be fixed.

- CREATE INDEX xifoo ON foo(bar_id);
- ANALYZE;
- Retry.

Re: Problem with planner choosing nested loop

From

"Rodrigo E. De León Plicet"

Date:

02 April 2008, 18:16:57

Also important, consider creating additional indexes based on your
access patterns.

Re: Problem with planner choosing nested loop

From

Alex Solovey

Date:

02 April 2008, 18:20:57

 > - CREATE INDEX xifoo ON foo(bar_id);

In this simple (which means "reduced") test database, yes. But the
actual table "foo" in production database:

1. partitioned, with 50+ partitions
2. heavily updated (and indexes make it slow)
3. has more fields like bar_id

We had indexes on several fields based on typical access patterns, but
after "foo" grew to a certain size (many gigabytes), sequential scans on
selected partitions became the only feasible solution.
We can fix this particular query with an index, but the more general
problem with planner choosing multiple scans over big table due to wrong
estimate of results from the small table, remains.

Alex

Rodrigo E. De León Plicet wrote:
> On Wed, Apr 2, 2008 at 12:36 PM, Alex Solovey <a.solovey@gmail.com> wrote:
>>  ... I have no idea how it could be fixed.
>
> - CREATE INDEX xifoo ON foo(bar_id);
> - ANALYZE;
> - Retry.

Re: Problem with planner choosing nested loop

From

"Scott Marlowe"

Date:

02 April 2008, 18:27:39

On Wed, Apr 2, 2008 at 12:12 PM, Rodrigo E. De León Plicet
<rdeleonp@gmail.com> wrote:
> Also important, consider creating additional indexes based on your
>  access patterns.

Good point.  Note that you can create indexes and then track their
usefulness with the pg_stat_user_indexes view, which I find incredibly
useful.  In fact, all the pg_stat_ views are useful.

Re: Problem with planner choosing nested loop

From

"Rodrigo E. De León Plicet"

Date:

02 April 2008, 18:57:14

On Wed, Apr 2, 2008 at 1:20 PM, Alex Solovey <a.solovey@gmail.com> wrote:
>  In this simple (which means "reduced") test database, yes. But the actual
> table "foo" in production database:
>
>  1. partitioned, with 50+ partitions
>  2. heavily updated (and indexes make it slow)
>  3. has more fields like bar_id
>
>  We had indexes on several fields based on typical access patterns, but
> after "foo" grew to a certain size (many gigabytes), sequential scans on
> selected partitions became the only feasible solution.
>  We can fix this particular query with an index, but the more general
> problem with planner choosing multiple scans over big table due to wrong
> estimate of results from the small table, remains.

Then provide actual DDL plus the production EXPLAIN ANALYZE output and
post it, maybe one of the Postgres gurus can help.

Good luck.

Re: Problem with planner choosing nested loop

From

Alex Solovey

Date:

02 April 2008, 19:02:26

The reduced database example has the same problem in EXPLAIN ANALYZE as
production one, here:

     Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4) (actual
time=0.098..3.561 rows=24 loops=1)

That's why I posted the smallest dataset I could reproduce the problem with.

Rodrigo E. De León Plicet wrote:
 >
> Then provide actual DDL plus the production EXPLAIN ANALYZE output and
> post it, maybe one of the Postgres gurus can help.
>
> Good luck.

Re: Problem with planner choosing nested loop

From

Harald Fuchs

Date:

02 April 2008, 19:07:00

In article <a55915760804021107h4410dd93m33a6f94e28a83a2b@mail.gmail.com>,
"Rodrigo E. De León Plicet" <rdeleonp@gmail.com> writes:

> On Wed, Apr 2, 2008 at 12:36 PM, Alex Solovey <a.solovey@gmail.com> wrote:
>> ... I have no idea how it could be fixed.

> - CREATE INDEX xifoo ON foo(bar_id);
> - ANALYZE;
> - Retry.

A compound index
  CREATE INDEX xifoo2 ON foo (foo_a, bar_id)
might be more worthwhile.

Re: Problem with planner choosing nested loop

From

"Scott Marlowe"

Date:

02 April 2008, 19:21:12

On Wed, Apr 2, 2008 at 1:06 PM, Harald Fuchs <hari.fuchs@googlemail.com> wrote:
> In article <a55915760804021107h4410dd93m33a6f94e28a83a2b@mail.gmail.com>,
>
> "Rodrigo E. De León Plicet" <rdeleonp@gmail.com> writes:
>
>  > On Wed, Apr 2, 2008 at 12:36 PM, Alex Solovey <a.solovey@gmail.com> wrote:
>  >> ... I have no idea how it could be fixed.
>
>  > - CREATE INDEX xifoo ON foo(bar_id);
>  > - ANALYZE;
>  > - Retry.
>
>  A compound index
>   CREATE INDEX xifoo2 ON foo (foo_a, bar_id)
>  might be more worthwhile.

And, if you're only interested in certain values of foo_a or bar_id
then partial indexes might be helpful but still cheap enough to
maintain in a large table.

Re: Problem with planner choosing nested loop

From

Alban Hertroys

Date:

03 April 2008, 06:34:51

On Apr 2, 2008, at 9:02 PM, Alex Solovey wrote:
> The reduced database example has the same problem in EXPLAIN
> ANALYZE as production one, here:
>
>     Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4) (actual
> time=0.098..3.561 rows=24 loops=1)

Hang on... You prefer sequential scans because indexes make your
database too slow, but you don't want a sequential scan now? What
kind of solution do you expect then? An oracle maybe?

You will need an index if this query is too slow for you, or you will
have to live with the slowness of this query. Pick one ;)

Regards,
Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.

!DSPAM:737,47f47a7a927661963919006!

Re: Problem with planner choosing nested loop

From

Craig Ringer

Date:

03 April 2008, 07:03:52

Alban Hertroys wrote:
>
> On Apr 2, 2008, at 9:02 PM, Alex Solovey wrote:
>> The reduced database example has the same problem in EXPLAIN ANALYZE
>> as production one, here:
>>
>>     Seq Scan on bar  (cost=0.00..393.07 rows=1 width=4) (actual
>> time=0.098..3.561 rows=24 loops=1)
>
> Hang on... You prefer sequential scans because indexes make your
> database too slow, but you don't want a sequential scan now? What kind
> of solution do you expect then? An oracle maybe?

It sounds to me like the issue is with *multiple* sequential scans
inside a nested loop instead of the single sequential scan expected.

The quoted explain line reflects a claimed cost misestimation, rather
than being a claim that sequential scans in general are not desired.

> You will need an index if this query is too slow for you, or you will
> have to live with the slowness of this query. Pick one ;)

He's already noted that it's preferable to avoid adding indexes due to
insert/update performance issues.

--
Craig Ringer