Thread: Planner choosing NestedLoop, although it is slower...

From:
Mario Splivalo
Date:

Hi, all.

I have a query, looking like this:
SELECT
    pub_date
FROM
    tubesite_object
    INNER JOIN tubesite_image
        ON tubesite_image.object_ptr_id = tubesite_object.id
WHERE
    tubesite_object.site_id = 8
    AND tubesite_object.pub_date < E'2011-07-12 13:25:00'
ORDER BY
    tubesite_object.pub_date ASC
LIMIT 21;


That query takes cca 10-15 seconds to run. Here is query plan:

  Limit  (cost=0.00..415.91 rows=21 width=8) (actual
time=11263.089..11263.089 rows=0 loops=1)
    ->  Nested Loop  (cost=0.00..186249.55 rows=9404 width=8) (actual
time=11263.087..11263.087 rows=0 loops=1)
          ->  Index Scan using tubesite_object_pub_date_idx on
tubesite_object  (cost=0.00..183007.09 rows=9404 width=12) (actual
time=0.024..11059.487 rows=9374 loops=1)
                Index Cond: (pub_date < '2011-07-12
13:25:00-05'::timestamp with time zone)
                Filter: (site_id = 8)
          ->  Index Scan using tubesite_image_pkey on tubesite_image
(cost=0.00..0.33 rows=1 width=4) (actual time=0.021..0.021 rows=0
loops=9374)
                Index Cond: (tubesite_image.object_ptr_id =
tubesite_object.id)
  Total runtime: 11263.141 ms


This query runs quickly (around second or two) when there is only few
connections to the database. Once I have 50-80 connections (200 is the
limit, although I never have more than 120-150 connections), that query
takes around 10-15 seconds.

But, if I disable nestedloops, here is the query plan:

  Limit  (cost=22683.45..22683.51 rows=21 width=8) (actual
time=136.009..136.009 rows=0 loops=1)
    ->  Sort  (cost=22683.45..22706.96 rows=9404 width=8) (actual
time=136.007..136.007 rows=0 loops=1)
          Sort Key: tubesite_object.pub_date
          Sort Method:  quicksort  Memory: 25kB
          ->  Hash Join  (cost=946.51..22429.91 rows=9404 width=8)
(actual time=135.934..135.934 rows=0 loops=1)
                Hash Cond: (tubesite_object.id =
tubesite_image.object_ptr_id)
                ->  Bitmap Heap Scan on tubesite_object
(cost=545.40..21828.97 rows=9404 width=12) (actual time=20.874..104.075
rows=9374 loops=1)
                      Recheck Cond: (site_id = 8)
                      Filter: (pub_date < '2011-07-12
13:25:00-05'::timestamp with time zone)
                      ->  Bitmap Index Scan on tubesite_object_site_id
(cost=0.00..543.05 rows=9404 width=0) (actual time=18.789..18.789
rows=9374 loops=1)
                            Index Cond: (site_id = 8)
                ->  Hash  (cost=215.49..215.49 rows=14849 width=4)
(actual time=21.068..21.068 rows=14849 loops=1)
                      ->  Seq Scan on tubesite_image  (cost=0.00..215.49
rows=14849 width=4) (actual time=0.029..9.073 rows=14849 loops=1)
  Total runtime: 136.287 ms


Now, if I disable nested loops in postgres.conf, then my load average on
the server goes skyhigh (i presume because a lot of other queries are
now being planned incorrectly).

I have set up default_statistics_target to 2000, and have vacumed and
analyzed the database.

Here are the other options I have set up in postgresql.conf (that differ
from the default settings):

  version                     | PostgreSQL 8.4.8 on x86_64-pc-linux-gnu,
compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2, 64-bit
  checkpoint_segments         | 64
  default_statistics_target   | 2000
  effective_cache_size        | 20GB
  external_pid_file           | /var/run/postgresql/8.4-main.pid
  lc_collate                  | en_US.UTF-8
  lc_ctype                    | en_US.UTF-8
  listen_addresses            | *
  log_autovacuum_min_duration | 0
  log_checkpoints             | on
  log_line_prefix             | %t [%p]: [%l-1]
  log_min_duration_statement  | 1s
  maintenance_work_mem        | 256MB
  max_connections             | 200
  max_stack_depth             | 3MB
  port                        | 5432
  server_encoding             | UTF8
  shared_buffers              | 2GB
  statement_timeout           | 30min
  temp_buffers                | 4096
  TimeZone                    | localtime
  track_activity_query_size   | 2048
  unix_socket_directory       | /var/run/postgresql
  wal_buffers                 | 128MB
  work_mem                    | 64MB



Why is planner using NestedLoops, that is, what can I do to make him NOT
to use NestedLoops (other than issuing SET enable_nestloop TO false;
before each query) ?

    Mario

From:
Tom Lane
Date:

Mario Splivalo <> writes:
>   Limit  (cost=0.00..415.91 rows=21 width=8) (actual
> time=11263.089..11263.089 rows=0 loops=1)
>     ->  Nested Loop  (cost=0.00..186249.55 rows=9404 width=8) (actual
> time=11263.087..11263.087 rows=0 loops=1)

> Why is planner using NestedLoops,

Because it thinks the LIMIT will kick in and end the query when the join
is only 21/9404ths (ie, a fraction of a percent) complete.  A NestLoop
results in saving a lot of work in that situation, whereas hash-and-sort
has to do the whole join despite the LIMIT.

What you need to look into is why the estimated join size is 9400 rows
when the actual join size is zero.  Are both tables ANALYZEd?  Are you
intentionally selecting rows that have no join partners?

            regards, tom lane

From:
Mario Splivalo
Date:

On 07/12/2011 10:04 PM, Tom Lane wrote:
> Mario Splivalo<>  writes:
>>    Limit  (cost=0.00..415.91 rows=21 width=8) (actual
>> time=11263.089..11263.089 rows=0 loops=1)
>>      ->   Nested Loop  (cost=0.00..186249.55 rows=9404 width=8) (actual
>> time=11263.087..11263.087 rows=0 loops=1)
>
>> Why is planner using NestedLoops,
>
> Because it thinks the LIMIT will kick in and end the query when the join
> is only 21/9404ths (ie, a fraction of a percent) complete.  A NestLoop
> results in saving a lot of work in that situation, whereas hash-and-sort
> has to do the whole join despite the LIMIT.
>
> What you need to look into is why the estimated join size is 9400 rows
> when the actual join size is zero.  Are both tables ANALYZEd?  Are you
> intentionally selecting rows that have no join partners?

Hi, Tom.

Yes, both tables have been ANALYZEd. What do you mean, intentilnaly
selecting rows taht have no join partners?

    Mario

From:
Tom Lane
Date:

Mario Splivalo <> writes:
> On 07/12/2011 10:04 PM, Tom Lane wrote:
>> What you need to look into is why the estimated join size is 9400 rows
>> when the actual join size is zero.  Are both tables ANALYZEd?  Are you
>> intentionally selecting rows that have no join partners?

> Yes, both tables have been ANALYZEd. What do you mean, intentilnaly
> selecting rows taht have no join partners?

I'm wondering why the actual join size is zero.  That seems like a
rather unexpected case for a query like this.

            regards, tom lane

From:
Mario Splivalo
Date:

On 07/13/2011 12:39 AM, Tom Lane wrote:
> Mario Splivalo<>  writes:
>> On 07/12/2011 10:04 PM, Tom Lane wrote:
>>> What you need to look into is why the estimated join size is 9400 rows
>>> when the actual join size is zero.  Are both tables ANALYZEd?  Are you
>>> intentionally selecting rows that have no join partners?
>
>> Yes, both tables have been ANALYZEd. What do you mean, intentilnaly
>> selecting rows taht have no join partners?
>
> I'm wondering why the actual join size is zero.  That seems like a
> rather unexpected case for a query like this.

It is true that this particular query returns 0 rows. But it's created
by django, and I can't do much to alter it.

    Mario

From:
Mario Splivalo
Date:

On 07/13/2011 12:39 AM, Tom Lane wrote:
> Mario Splivalo<>  writes:
>> On 07/12/2011 10:04 PM, Tom Lane wrote:
>>> What you need to look into is why the estimated join size is 9400 rows
>>> when the actual join size is zero.  Are both tables ANALYZEd?  Are you
>>> intentionally selecting rows that have no join partners?
>
>> Yes, both tables have been ANALYZEd. What do you mean, intentilnaly
>> selecting rows taht have no join partners?
>
> I'm wondering why the actual join size is zero.  That seems like a
> rather unexpected case for a query like this.

Yes, seems that planer gets confused by LIMIT. This query:

select * from tubesite_object join tubesite_image on id=object_ptr_id
where site_id = 8 and pub_date < '2011-07-12 13:25:00' order by pub_date
desc ;

Does not choose Nested Loop, and is done instantly (20 ms), and returns
no rows. However, if I add LIMIT at the end, it chooses NestedLoop and
it takes 500ms if I'm alone on the server, and 10+ seconds if there 50+
connections on the server.

    Mario

From:
Mario Splivalo
Date:

On 07/13/2011 02:53 AM, Mario Splivalo wrote:
> On 07/13/2011 12:39 AM, Tom Lane wrote:
>> Mario Splivalo<> writes:
>>> On 07/12/2011 10:04 PM, Tom Lane wrote:
>>>> What you need to look into is why the estimated join size is 9400 rows
>>>> when the actual join size is zero. Are both tables ANALYZEd? Are you
>>>> intentionally selecting rows that have no join partners?
>>
>>> Yes, both tables have been ANALYZEd. What do you mean, intentilnaly
>>> selecting rows taht have no join partners?
>>
>> I'm wondering why the actual join size is zero. That seems like a
>> rather unexpected case for a query like this.
>
> Yes, seems that planer gets confused by LIMIT. This query:
>
> select * from tubesite_object join tubesite_image on id=object_ptr_id
> where site_id = 8 and pub_date < '2011-07-12 13:25:00' order by pub_date
> desc ;
>
> Does not choose Nested Loop, and is done instantly (20 ms), and returns
> no rows. However, if I add LIMIT at the end, it chooses NestedLoop and
> it takes 500ms if I'm alone on the server, and 10+ seconds if there 50+
> connections on the server.

As explained/suggested by RhodiumToad on IRC, adding composite index on
(site_id, pub_date) made nestedloop query finish in around 100 seconds!

Thank you!

    Mario

From:
Clem Dickey
Date:

On 07/12/2011 11:11 AM, Mario Splivalo wrote:
> Hi, all.
>
> I have a query, looking like this:
> SELECT
> pub_date
> FROM
> tubesite_object
> INNER JOIN tubesite_image
> ON tubesite_image.object_ptr_id = tubesite_object.id
> WHERE
> tubesite_object.site_id = 8
> AND tubesite_object.pub_date < E'2011-07-12 13:25:00'
> ORDER BY
> tubesite_object.pub_date ASC
> LIMIT 21;
>

> Why is planner using NestedLoops, that is, what can I do to make him NOT
> to use NestedLoops (other than issuing SET enable_nestloop TO false;
> before each query) ?

The planner is using a nested loops because the startup overhead is
less, and it think that it will only have run a small 0.2% (21/9404) of
the loops before reaching your limit of 21 results. In fact it has to
run all the loops, because there are 0 results. (Is that what you expected?)

Try a using CTE to make the planner think you are going to use all the
rows of the joined table. That may cause the planner to use a merge
join, which has higher startup cost (sort) but less overall cost if it
the join will not finish early.

WITH t AS (
   SELECT tubesite_object.site_id AS site_id,
     tubesite_object.pub_date as pub_date
   FROM tubesite_object
   INNER JOIN tubesite_image
   ON tubesite_image.object_ptr_id = tubesite_object.id
)
SELECT pub_date
FROM t
WHERE t.site_id = 8 AND t.pub_date < E'2011-07-12 13:25:00'
ORDER BY t.pub_date ASC LIMIT 21;