Re: bad plan and LIMIT - Mailing list pgsql-performance

From Tom Lane
Subject Re: bad plan and LIMIT
Date
Msg-id 11772.1241189860@sss.pgh.pa.us
Whole thread Raw
In response to bad plan and LIMIT  (James Nelson <james@photoshelter.com>)
Responses Re: bad plan and LIMIT
List pgsql-performance
James Nelson <james@photoshelter.com> writes:
> Hi, I'm hoping you guys can help with improving this query I'm having
> a problem with. The main problem is that the query plan changes
> depending on the value of the LIMIT clause, with small values using a
> poor plan and running very slowly. The two times are roughly 5 minutes
> for the bad plan and 1.5 secs for the good plan.

> photoshelter=# explain  analyze SELECT ID FROM ps_image WHERE id IN
> (SELECT image_id FROM ps_gallery_image WHERE
> gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC limit 1;

The problem here is an overoptimistic assessment of how long it will
take to find a match to gallery_id='G00007ejKGoWS_cY' while searching
in file_name order.  You might be able to fix that by increasing the
statistics target for gallery_id.  However, if the issue is not so
much how many occurrences of 'G00007ejKGoWS_cY' there are as that
they're all associated with high values of file_name, that won't
help.  In that case I think it would work to restructure the query
along the lines of

select * from (
  SELECT ID FROM ps_image WHERE id IN
    (SELECT image_id FROM ps_gallery_image WHERE
     gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC
  offset 0
  ) ss
limit 1;

The OFFSET should act as an optimization fence to prevent the LIMIT
from being used in the planning of the subquery.

            regards, tom lane

pgsql-performance by date:

Previous
From: henk de wit
Date:
Subject: Transparent table partitioning in future version of PG?
Next
From: James Nelson
Date:
Subject: Re: bad plan and LIMIT