Re: estimates for nested loop very wrong? - Mailing list pgsql-sql

From joostje@komputilo.org
Subject Re: estimates for nested loop very wrong?
Date
Msg-id 20030410222056.GA4359@co.uea.org
Whole thread Raw
In response to Re: estimates for nested loop very wrong?  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Je 2003/04/10(4)/16:04, Tom Lane skribis:
> joostje@komputilo.org writes:
> > For example, for the stat.targ=500 run:
> > => explain analyse SELECT  id  from db, tmp0 WHERE valida AND poseda='uea'  AND tab='pers' AND tmp0.v0=id ;
> > NOTICE:  QUERY PLAN:
> 
> > Nested Loop  (cost=0.00..121000.31 rows=28184 width=39) (actual time=1.05..23.24 rows=415 loops=1)
> >   ->  Seq Scan on tmp0  (cost=0.00..20.00 rows=1000 width=32) (actual time=0.22..0.40 rows=29 loops=1)
>                                             ^^^^^^^^^
> >   ->  Index Scan using db_id_idx on db  (cost=0.00..120.63 rows=28 width=7) (actual time=0.27..0.75 rows=14
loops=29)
> > Total runtime: 23.92 msec
> 
> Actually, I see part of the problem: you haven't vacuumed
> tmp0, so it's sitting at the default size estimate of 1000 rows.
> That accounts for more than a factor of 30 in the estimation error
> in the nestloop plan, while it wouldn't have nearly as much impact
> on hash or mergejoin estimates.
> 
> There's still a good big error left to account for though :-(

OK, here it goes:

db.id.statistics target set to 10  (default);
effective_cache_size=1000          (default);
random_page_cost=4                 (default); => estimated cost = 3690777.90

analyse tmp0; => estimated cost = 107033.27

set effective_cache_size=100000; => estimated cost = 107033.27    (no change there)

set random_page_cost=1;            (default was 4) => estimated cost = 27592.06    

db.id.statistics target = 500;     (improves n_distinct estimate from 2041 to 39909) => estimated cost = 943.44

set enable_nestloop = off; => planner elects Hash Loop, estimated/real cost: 52834.33 12762.86

set enable_hashjoin = off; => planner elects Merge Join, estimated/real cost: 179961.75/54314.09

So, it seems everything is working as it should now!
Thanks!

> I don't think you mentioned the other WHERE conditions before.  Which
> table are those restricting, and how selective are they?

Ah, sorry, they are worth not mentioning, as db.poseda='uea' and db.tab='pers' and valida
for about 97.6% of the rows. I guess I should have done the tests without
the two conditions, as they have no influence.


Thanks!
joostje



pgsql-sql by date:

Previous
From: Antti Haapala
Date:
Subject: Re: estimates for nested loop very wrong?
Next
From: "CIC mail"
Date:
Subject: unsubscrube