Re: SV: bad plan using nested loops - Mailing list pgsql-performance

From Johan Fredriksson
Subject Re: SV: bad plan using nested loops
Date
Msg-id 1517562127.315.1.camel@kth.se
Whole thread Raw
In response to SV: bad plan using nested loops  (Johan Fredriksson <eskil@kth.se>)
Responses Re: SV: bad plan using nested loops
List pgsql-performance
tor 2018-02-01 klockan 20:34 +0000 skrev Johan Fredriksson:
> > Johan Fredriksson <eskil@kth.se> writes:
> > > Bad plan: https://explain.depesz.com/s/avtZ
> > > Good plan: https://explain.depesz.com/s/SJSt
> > > Any suggestions on how to make the planner make better decisions
> > > for
> > > this query?
> > 
> > Core of the problem looks to be the misestimation here:
> > 
> >        Index Only Scan using shredder_cgm1 on
> > public.cachedgroupmembers cachedgroupmembers_4
> > (cost=0.43..2.33 rows=79 width=8) (actual time=0.020..0.903
> > rows=1492 loops=804)
> >          Output: cachedgroupmembers_4.memberid,
> > cachedgroupmembers_4.groupid,
> > cachedgroupmembers_4.disabled
> >          Index Cond: ((cachedgroupmembers_4.memberid =
> > principals_1.id) AND
> > (cachedgroupmembers_4.disabled = 0))
> >          Heap Fetches: 5018
> > 
> > Probably, memberid and disabled are correlated but the planner
> > doesn't
> > know that, so it thinks the index condition is way more selective
> > than it
> > actually is.  In PG 10, you could very possibly fix that by
> > installing
> > extended statistics on that pair of columns.  See
> > 
> > https://www.postgresql.org/docs/current/static/planner-stats.html#P
> > LANNER-STATS-EXTENDED
> 
> I'm not sure what you mean by correlated, but there are only a
> handful (164 when I check it) disabled groupmembers out of total 7.5
> million.
> I'll give CREATE STATISTICS on those columns a shot and see if it
> gets any better.

It looks like you are right, Tom. There actually exists full
correlation between memberid, groupid and disabled.

rt4=# SELECT stxname, stxkeys, stxdependencies FROM pg_statistic_ext;
 stxname  | stxkeys |   stxdependencies    
-----------+---------+----------------------
 cgm_stat2 | 2 6     | {"2
=> 6": 1.000000}
 cgm_stat1 | 3 6     | {"3 => 6": 1.000000}
(2 rows)

However, this does not help the planner. It still picks the bad plan.


       / Eskil



pgsql-performance by date:

Previous
From: Johan Fredriksson
Date:
Subject: SV: bad plan using nested loops
Next
From: Laurenz Albe
Date:
Subject: Re: Query optimiser is not using 'not null' constraint when 'orderby nulls last' clause is used