Re: PG choosing nested loop for set membership? - Mailing list pgsql-general

From Brian Crowell
Subject Re: PG choosing nested loop for set membership?
Date
Msg-id CAAQkdDot3AoyFC1Jyrwkat2Jm=sbdEaaKPkVxhDdqMj7+-gC8w@mail.gmail.com
Whole thread Raw
In response to Re: PG choosing nested loop for set membership?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: PG choosing nested loop for set membership?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
On Tue, Mar 25, 2014 at 5:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yeah.  The weird thing about that is that the nestloop rowcount estimate
> isn't the product of the two input rowcounts --- you'd sort of expect an
> estimate of 158 given the input-relation sizes.  While that's not ipso
> facto evidence of a bug (because the estimates are arrived at in different
> ways), I'm having a hard time replicating it here.  Are you using an
> up-to-date PG release?

All right, I think I'm onto something. But first I'll answer your questions.

Version is 9.3.3 from the Postgres Debian archives.


> One thing that might help is to increase the statistics target for
> pl2._visible_accounts_by_rule_set.  The other two tables are small enough
> that you don't need to do that for them.  (Although come to think of it,
> they are also small enough that maybe auto-analyze isn't triggering for
> them ... does a manual ANALYZE improve matters?)

You were right that auto-analyze didn't go after them. Weird. But a
few manual analyzes later, and no change.

Here's what I did, though. I collapsed the pl2.current_user view into
pl2.visible_accounts:

===
select
    acc.account,
    acc.manager,
    acc.is_fund
from pl2._visible_accounts_by_rule_set acc
    inner join (pl2._users u
    left join pl2._users iu on u.impersonating = iu.user_id)
        on acc.rule_set_id = coalesce(iu.permission_rule_set_id,
u.permission_rule_set_id)
where u.user_principal_name = session_user
===

I noticed that join-on-coalesce pattern that gave us trouble in SQL
Server. The query planner can't do a thing with that. So I rewrote the
query so the last join would be solid:

===
select
    acc.account,
    acc.manager,
    acc.is_fund
from pl2._users lu
    inner join pl2._users u on u.user_id = coalesce(lu.impersonating,
lu.user_id)
    inner join pl2._visible_accounts_by_rule_set acc
        on acc.rule_set_id = u.permission_rule_set_id
where lu.user_principal_name = session_user
===

The join order is the same, and the indexes used are the same, but the
estimate is much better:

'Nested Loop  (cost=0.68..13.70 rows=133 width=10) (actual
time=0.073..0.211 rows=241 loops=1)'
'  Output: acc.account, acc.manager, acc.is_fund'
'  Buffers: shared hit=10'
'  ->  Nested Loop  (cost=0.54..8.58 rows=1 width=4) (actual
time=0.056..0.059 rows=1 loops=1)'
'        Output: u.permission_rule_set_id'
'        Buffers: shared hit=7'
'        ->  Index Scan using _pl2_users_user_principal_name_idx on
pl2._users lu  (cost=0.27..4.29 rows=1 width=8) (actual
time=0.045..0.047 rows=1 loops=1)'
'              Output: lu.user_id, lu.user_principal_name, lu.name,
lu.permission_rule_set_id, lu.impersonating, lu.is_admin'
'              Index Cond: (lu.user_principal_name = ("session_user"())::text)'
'              Buffers: shared hit=4'
'        ->  Index Scan using _users_pkey on pl2._users u
(cost=0.27..4.29 rows=1 width=8) (actual time=0.006..0.006 rows=1
loops=1)'
'              Output: u.user_id, u.user_principal_name, u.name,
u.permission_rule_set_id, u.impersonating, u.is_admin'
'              Index Cond: (u.user_id = COALESCE(lu.impersonating, lu.user_id))'
'              Buffers: shared hit=3'
'  ->  Index Scan using _visible_accounts_by_rule_set_idx on
pl2._visible_accounts_by_rule_set acc  (cost=0.15..3.54 rows=158
width=14) (actual time=0.015..0.089 rows=241 loops=1)'
'        Output: acc.rule_set_id, acc.account, acc.manager, acc.is_fund'
'        Index Cond: (acc.rule_set_id = u.permission_rule_set_id)'
'        Buffers: shared hit=3'
'Total runtime: 0.297 ms'

I'll see if I can write an isolated test case for the coalesce
misestimate. Or do you think the query planner will ever be able to do
anything with that form?

--Brian


pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Failure upgrading PG 9.2 to 9.3
Next
From: Tom Lane
Date:
Subject: Re: PG choosing nested loop for set membership?