Re: - Mailing list pgsql-performance

From Tom Lane
Subject Re:
Date
Msg-id 4582.1402105134@sss.pgh.pa.us
Whole thread Raw
In response to  (Scott Marlowe <scott.marlowe@gmail.com>)
Responses Re:
List pgsql-performance
Scott Marlowe <scott.marlowe@gmail.com> writes:
> Well it's me again, with another performance regression. We have this query:
> SELECT *
> FROM users u
> WHERE (u.user_group_id IN
>     (SELECT ug.id
>       FROM user_groups ug, pro_partners p
>       WHERE ug.pro_partner_id = p.id
>       AND p.tree_sortkey BETWEEN
> E'0000000000010101000001000101000110000000000000000000000101101010'
> AND
> tree_right(E'0000000000010101000001000101000110000000000000000000000101101010')
> OFFSET 0)
> AND u.deleted_time IS NULL)
> ORDER BY u.id LIMIT 1000;

> OK so on 8.4.2 it runs fast. If I take out the offset 0 it runs slow.
> If I run this on 8.4.15. 8.4.19 or 8.4.21 it also runs slow.

This seems to be about misestimation of the number of rows out of a
semijoin, so I'm thinking that the reason for the behavior change is
commit 899d7b00e9 or 46f775144e.  It's unfortunate that your example
ends up on the wrong side of that change, but the original 8.4.x behavior
was definitely pretty bogus; I think it's only accidental that 8.4.2
manages to choose a better plan.  (The fact that you need the crutch
of the "OFFSET 0" to get it to do so is evidence that it doesn't
really know what its doing ;-).)

One thing you might try is back-patching commit 4c2777d0b733, as I
suspect that you're partially getting burnt by that in this scenario.
I was afraid to back-patch that because of the API change possibly
breaking third-party code, but in a private build that's unlikely
to be an issue.

            regards, tom lane


pgsql-performance by date:

Previous
From: Scott Marlowe
Date:
Subject:
Next
From: Scott Marlowe
Date:
Subject: Re: