Re: Very poor estimates from planner - Mailing list pgsql-hackers

From Rod Taylor
Subject Re: Very poor estimates from planner
Date
Msg-id 1068079082.53233.29.camel@jester
Whole thread Raw
In response to Re: Very poor estimates from planner  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Very poor estimates from planner  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Wed, 2003-11-05 at 19:18, Tom Lane wrote:
> Rod Taylor <rbt@rbt.ca> writes:
> > Effectively, the planner has amazingly inaccurate row estimates.
>
> It seems the key estimation failure is in this join step:
>
>           ->  Hash Join  (cost=1230.79..60581.82 rows=158 width=54) (actual time=1262.35..151200.29 rows=1121988
loops=1)
>                 Hash Cond: ("outer".account_id = "inner".account_id)
>                 ->  Hash Join  (cost=1226.78..52863.43 rows=1542558 width=50) (actual time=1261.63..100418.30
rows=1573190loops=1) 
>                       (join of bsod, tsb, tss)
>                 ->  Hash  (cost=4.01..4.01 rows=1 width=4) (actual time=0.33..0.33 rows=0 loops=1)
>                       ->  Index Scan using single_null_parent_account_hack on account ap  (cost=0.00..4.01 rows=1
width=4)(actual time=0.26..0.28 rows=1 loops=1) 
>                             Filter: (parent_account_id IS NULL)
>
> The estimated number of rows out of the join of bsod, tsb, tss isn't far
> off, but the estimate for the result of joining that to ap is WAY off.
> Apparently the planner thinks that only a few rows in the join will have
> matches in ap, but really they almost all do.  Any idea why?  The
> account_id stats for each seem to be the thing to look at.

The account structure is tree, as is the product catalogue. Essentially
what the query does is convert high level recorded invoices into the
lower (not quite base) items for billing the different parties involved
in the transaction. bsod and tsb are both foreign keys to tss on the
columns being joined. Since these are full table scans, all values will
join.

This join in particular is  bsod (lineitems) and tsb (cached graph of
the product catalog tree) which are both foreign key'd off of tss
(product catalog tree).

So yes, since this is a full table scan all values will be joined since
the foreign key enforces them all to exist.

> > Any hints? I'm basically stuck. Oh, and I would like to ask for a
> > pgadmin feature -- visual explain :)
>
> You do know that Red Hat has been offering a Visual Explain tool for
> some time?
> http://sources.redhat.com/rhdb/
> I've not had much occasion to use it myself, but it works ...

Yeah.. but pgadmin is in the ports tree.

I'll take a peak at it.

pgsql-hackers by date:

Previous
From: "scott.marlowe"
Date:
Subject: Re: Very poor estimates from planner
Next
From: Josh Berkus
Date:
Subject: Re: Changes to Contributor List