
Re: hashjoin chosen over 1000x faster plan - Mailing list pgsql-performance

From	Tom Lane
Subject	Re: hashjoin chosen over 1000x faster plan
Date	October 10, 2007 17:33:39
Msg-id	23650.1192048377@sss.pgh.pa.us
In response to	Re: hashjoin chosen over 1000x faster plan ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses	Re: hashjoin chosen over 1000x faster plan
List	pgsql-performance


"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:
> The point I'm trying to make is that at planning time the
> pg_statistic row for this "Charge"."reopHistSeqNo" column showed
> stanullfrac as 0.989; it doesn't seem to have taken this into account
> when making its guess about how many rows would be joined when it was
> compared to the primary key column of the "CaseHist" table.

It certainly does take nulls into account, but the estimate of resulting
rows was still nonzero; and even if it were zero, I'd be very hesitant
to make it choose a plan that is fast only if there were exactly zero
such rows and is slow otherwise.  Most of the complaints we've had about
issues of this sort involve the opposite problem, i.e., the planner is
choosing a plan that works well for few rows but falls down because
reality involves many rows.  "Fast-for-few-rows" plans are usually a lot
more brittle than the alternatives in terms of the penalty you pay for
too many rows, and so putting a thumb on the scales to push it towards a
"fast" corner case sounds pretty unsafe to me.
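
To make the asymmetry above concrete, here is a minimal Python sketch of a
toy cost model. Everything in it is an assumption made for illustration: the
function names, the row counts, and the cost constants are hypothetical, and
none of it is PostgreSQL's actual estimation or costing code. It assumes only
that NULLs never satisfy an equality join (so a stanullfrac of 0.989 still
leaves a nonzero estimate), that a nested loop pays one index probe per outer
row, and that a hash join pays a fixed price to hash the inner table.

```python
# Toy illustration only: the constants below are made-up assumptions,
# not PostgreSQL's real cost parameters or estimation logic.

def estimated_join_rows(outer_rows: float, null_frac: float) -> float:
    """Rows expected to join against a unique (primary key) column.

    NULLs cannot match in an equality join, so only the non-null
    fraction can join; against a unique key each such row matches
    at most one row on the other side.
    """
    return outer_rows * (1.0 - null_frac)


def nested_loop_cost(outer_rows: float, per_probe_cost: float = 4.0) -> float:
    """One index probe per outer row: nearly free for few rows, steep for many."""
    return outer_rows * per_probe_cost


def hash_join_cost(outer_rows: float, inner_rows: float,
                   per_row_cost: float = 0.01) -> float:
    """Hash the whole inner table once, then do a cheap probe per outer row."""
    return inner_rows * per_row_cost + outer_rows * per_row_cost


if __name__ == "__main__":
    inner_rows = 1_000_000  # hypothetical size of the table being joined to
    for actual in (0, 10, 1_000, 100_000, 1_000_000):
        nl = nested_loop_cost(actual)
        hj = hash_join_cost(actual, inner_rows)
        print(f"{actual:>9} rows  nested loop={nl:>12.1f}  hash join={hj:>10.1f}")
```

Under these made-up constants the nested loop wins when almost nothing
joins, but its cost grows far faster than the hash join's once the actual
row count exceeds the estimate, which is the kind of penalty the paragraph
above is concerned with.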

As Simon notes, the only technically sound way to handle this would
involve run-time plan changeover, which is something we're not nearly
ready to tackle.

            regards, tom lane
