Re: hashjoin chosen over 1000x faster plan - Mailing list pgsql-performance

From Simon Riggs
Subject Re: hashjoin chosen over 1000x faster plan
Date
Msg-id 1192045945.4233.351.camel@ebony.site
In response to Re: hashjoin chosen over 1000x faster plan  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: hashjoin chosen over 1000x faster plan
List pgsql-performance
On Wed, 2007-10-10 at 14:35 -0500, Kevin Grittner wrote:
> >>> On Wed, Oct 10, 2007 at  1:54 PM, in message
> <1192042492.4233.334.camel@ebony.site>, Simon Riggs <simon@2ndquadrant.com>
> wrote:
> >
> > But the planner doesn't work on probability. It works on a best-guess
> > selectivity, as known at planning time.
>
> The point I'm trying to make is that at planning time the
> pg_statistic row for this "Charge"."reopHistSeqNo" column showed
> stanullfrac as 0.989; it doesn't seem to have taken this into account
> when making its guess about how many rows would be joined when it was
> compared to the primary key column of the "CaseHist" table.  I'm
> suggesting that it might be a good thing if it did.

Understood; it would be a good thing if it did.
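
(For reference, the statistic Kevin mentions can be read straight from the
pg_stats view; a minimal sketch, using the table and column names from the
thread and assuming the table has been analyzed:)

    SELECT null_frac, n_distinct
    FROM pg_stats
    WHERE tablename = 'Charge'
      AND attname = 'reopHistSeqNo';

    -- A null_frac of ~0.989 means only about 1.1% of "Charge" rows carry a
    -- non-null "reopHistSeqNo", so at most that fraction of rows can ever
    -- match the "CaseHist" primary key.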

It's more complex than you think:

The fast plan is an all-or-nothing plan. It is *only* faster when the
number of matched rows is zero. You know it is zero, but currently the
planner doesn't, nor can it make use of that information once it has it,
halfway through execution. Even if we could work out that the match count
is very probably zero, we would still be left with the decision of
whether to optimise for the zero-match case or for the non-zero case.
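
To make the trade-off concrete, here is roughly the shape of the join in
question (a sketch only; the "CaseHist" key column name is assumed, not
taken from the thread, and the real query isn't quoted here):

    -- Hypothetical query shape; "caseHistSeqNo" is an assumed name for the
    -- CaseHist primary key.
    SELECT h.*
    FROM "Charge" c
    JOIN "CaseHist" h ON h."caseHistSeqNo" = c."reopHistSeqNo";

    -- If nearly every c."reopHistSeqNo" is NULL, a plan that probes
    -- "CaseHist" only for the few non-null values does almost no work and
    -- can win by orders of magnitude. If that guess is wrong and many rows
    -- match, the same plan degrades badly, while the hash join costs about
    -- the same either way: that is the all-or-nothing choice described
    -- above.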

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

