James Nelson <james@photoshelter.com> writes:
> Hi, I'm hoping you guys can help with improving this query I'm having
> a problem with. The main problem is that the query plan changes
> depending on the value of the LIMIT clause, with small values using a
> poor plan and running very slowly. The two times are roughly 5 minutes
> for the bad plan and 1.5 secs for the good plan.
> photoshelter=# explain analyze SELECT ID FROM ps_image WHERE id IN
> (SELECT image_id FROM ps_gallery_image WHERE
> gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC limit 1;
The problem here is an overoptimistic assessment of how long it will
take to find a match to gallery_id='G00007ejKGoWS_cY' while searching
in file_name order. You might be able to fix that by increasing the
statistics target for gallery_id. However, if the issue is not so
much how many occurrences of 'G00007ejKGoWS_cY' there are as that
they're all associated with high values of file_name, that won't
help. In that case I think it would work to restructure the query
along the lines of
select * from (
SELECT ID FROM ps_image WHERE id IN
(SELECT image_id FROM ps_gallery_image WHERE
gallery_id='G00007ejKGoWS_cY') ORDER BY LOWER(FILE_NAME) ASC
offset 0
) ss
limit 1;
The OFFSET should act as an optimization fence to prevent the LIMIT
from being used in the planning of the subquery.
regards, tom lane