Re: Correlation in cost_index() - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Correlation in cost_index()
Date
Msg-id 29475.1033598929@sss.pgh.pa.us
In response to Correlation in cost_index()  (Manfred Koizar <mkoi-pg@aon.at>)
Responses Re: Correlation in cost_index()  (Manfred Koizar <mkoi-pg@aon.at>)
Re: Correlation in cost_index()  (Sean Chittenden <sean@chittenden.org>)
List pgsql-hackers

Manfred Koizar <mkoi-pg@aon.at> writes:
> AFAICS (part of) the real problem is in costsize.c:cost_index() where
> IO_cost is calculated from min_IO_cost, pages_fetched,
> random_page_cost, and indexCorrelation.  The current implementation
> uses indexCorrelation^2 to interpolate between min_IO_cost and
> max_IO_cost, which IMHO gives results that are too close to
> max_IO_cost.

The indexCorrelation^2 algorithm was only a quick hack with no theory
behind it :-(.  I've wanted to find some better method to put in there,
but have not had any time to research the problem.
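
For readers following along, the interpolation under discussion can be sketched roughly as below. This is only an illustration built from the description quoted above, not the actual costsize.c code: the function name and the Python rendering are hypothetical, and min_IO_cost and max_IO_cost are taken as given inputs rather than derived from pages_fetched and random_page_cost.

```python
def interpolate_io_cost(min_io_cost: float,
                        max_io_cost: float,
                        index_correlation: float) -> float:
    """Interpolate between max_io_cost and min_io_cost using correlation^2.

    Hypothetical helper: a correlation of 0 yields max_io_cost and a
    correlation of +1 or -1 yields min_io_cost.
    """
    if not -1.0 <= index_correlation <= 1.0:
        raise ValueError("index_correlation must lie in [-1, 1]")
    # Squaring discards the sign, so ascending and descending order
    # are treated alike.
    csquared = index_correlation * index_correlation
    # Linear interpolation in csquared, starting from the worst case.
    return max_io_cost + csquared * (min_io_cost - max_io_cost)


if __name__ == "__main__":
    # Made-up costs, chosen only to show the shape of the curve.
    for corr in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(corr, interpolate_io_cost(10.0, 100.0, corr))
```

With these made-up inputs (min 10, max 100), a correlation of 0.5 gives 77.5, which is still three quarters of the way toward max_IO_cost. That appears to be the behaviour Manfred is objecting to.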

> As nobody knows how each of these proposals performs in real life
> under different conditions, I suggest leaving the current
> implementation in, adding all three algorithms, and supplying a GUC
> variable to select a cost function.

I don't think it's really a good idea to expect users to pick among
multiple cost functions that *all* have no guiding theory behind them.
I'd prefer to see us find a better cost function and use it.  Has anyone
trawled the database literature on the subject?
        regards, tom lane

