Re: Simple join optimized badly? - Mailing list pgsql-performance

From Mark Kirkwood
Subject Re: Simple join optimized badly?
Date
Msg-id 4529B6D0.2010206@paradise.net.nz
Whole thread Raw
In response to Re: Simple join optimized badly?  ("Craig A. James" <cjames@modgraph-usa.com>)
Responses Re: Simple join optimized badly?  ("Craig A. James" <cjames@modgraph-usa.com>)
List pgsql-performance
Craig A. James wrote:
>
>
> My example, discussed previously in this forum, is a classic.  I have a
> VERY expensive function (it's in the class of NP-complete problems, so
> there is no faster way to do it).  There is no circumstance when my
> function should be used as a filter, and no circumstance when it should
> be done before a join.  But PG has no way of knowing the cost of a
> function, and so the optimizer assigns the same cost to every function.
> Big disaster.
>
> The result?  I can't use my function in any WHERE clause that involves
> any other conditions or joins.  Only by itself.  PG will occasionally
> decide to use my function as a filter instead of doing the join or the
> other WHERE conditions first, and I'm dead.
>

this is an argument for cost-for-functions rather than hints AFAICS.

It seems to me that if (in addition to the function cost) we come up
with some efficient way of recording cross column statistics we would be
well on the way to silencing *most* of the demands for hints.

We would still be left with some of the really difficult problems - a
metric for "locally correlated" column distributions and a reliable
statistical algorithm for most common value sampling (or a different way
of approaching this). These sound like interesting computer science or
mathematics thesis topics, maybe we could try (again?) to get some
interest at that level?

Cheers

Mark



pgsql-performance by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: Simple join optimized badly?
Next
From: "Craig A. James"
Date:
Subject: Re: Simple join optimized badly?