Re: Planner choosing NestedLoop, although it is slower...

From: Clem Dickey
Subject: Re: Planner choosing NestedLoop, although it is slower...
Date: ,
Msg-id: ivi8r5$16se$1@news.hub.org
(view: Whole thread, Raw)
In response to: Planner choosing NestedLoop, although it is slower...  (Mario Splivalo)
List: pgsql-performance


On 07/12/2011 11:11 AM, Mario Splivalo wrote:
> Hi, all.
>
> I have a query, looking like this:
> SELECT
> pub_date
> FROM
> tubesite_object
> INNER JOIN tubesite_image
> ON tubesite_image.object_ptr_id = tubesite_object.id
> WHERE
> tubesite_object.site_id = 8
> AND tubesite_object.pub_date < E'2011-07-12 13:25:00'
> ORDER BY
> tubesite_object.pub_date ASC
> LIMIT 21;
>

> Why is planner using NestedLoops, that is, what can I do to make him NOT
> to use NestedLoops (other than issuing SET enable_nestloop TO false;
> before each query) ?

The planner is using a nested loops because the startup overhead is
less, and it think that it will only have run a small 0.2% (21/9404) of
the loops before reaching your limit of 21 results. In fact it has to
run all the loops, because there are 0 results. (Is that what you expected?)

Try a using CTE to make the planner think you are going to use all the
rows of the joined table. That may cause the planner to use a merge
join, which has higher startup cost (sort) but less overall cost if it
the join will not finish early.

WITH t AS (
   SELECT tubesite_object.site_id AS site_id,
     tubesite_object.pub_date as pub_date
   FROM tubesite_object
   INNER JOIN tubesite_image
   ON tubesite_image.object_ptr_id = tubesite_object.id
)
SELECT pub_date
FROM t
WHERE t.site_id = 8 AND t.pub_date < E'2011-07-12 13:25:00'
ORDER BY t.pub_date ASC LIMIT 21;



pgsql-performance by date:

From: Robert Klemme
Date:
Subject: Re: Trigger or Function
From: "Kevin Grittner"
Date:
Subject: Re: UPDATEDs slowing SELECTs in a fully cached database