Re: new correlation metric - Mailing list pgsql-hackers

From Tom Lane
Subject Re: new correlation metric
Date
Msg-id 11255.1225036262@sss.pgh.pa.us
In response to Re: new correlation metric  (Martijn van Oosterhout <kleptog@svana.org>)
Responses Re: new correlation metric
List pgsql-hackers
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
>> I worked with Nathan Boley to come up with what we think is a better
>> metric for measuring this cost.

> I think the code is in the right direction, but I think what you want
> is some kind of estimate of "given I've looked for tuple X, how many
> tuples in the next k pages are near this one".

ISTM that some experimental studies would be required to justify any
proposal in this area.

Having said that ... one of the things I never liked about the existing
code is that it pays no attention to block-boundary effects.  It doesn't
matter to an indexscan how badly tuples within a single block are
misordered --- what matters is how many block reads you have to do.
So I think that any changes here ought to include measuring the
correlation on the basis of block numbers not tuple numbers.  But what's
not too clear to me is whether our sampling methods would mess that up.
        regards, tom lane
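
To make the block-boundary point concrete, here is a minimal sketch (not from the original message, and not PostgreSQL's actual ANALYZE code) comparing a tuple-level correlation with a block-level one. The helper names, the fixed tuples_per_block, the assumption of distinct values, and the choice to compare each tuple's physical block with the block it would occupy in a perfectly sorted heap are all illustrative assumptions. It also reads the whole heap rather than a sample, so it says nothing about the sampling question raised above.

```python
import math


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def sort_ranks(values):
    """Rank of each value in sorted order (assumes distinct values)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, idx in enumerate(order):
        ranks[idx] = rank
    return ranks


def tuple_correlation(values):
    """Correlate sort rank with physical tuple position."""
    return pearson(sort_ranks(values), list(range(len(values))))


def block_correlation(values, tuples_per_block):
    """Correlate the block a tuple would occupy in a sorted heap
    with the block it actually occupies (hypothetical metric)."""
    ideal_blocks = [r // tuples_per_block for r in sort_ranks(values)]
    actual_blocks = [i // tuples_per_block for i in range(len(values))]
    return pearson(ideal_blocks, actual_blocks)


# Four blocks of four tuples; blocks are in order, but the tuples
# inside each block are reversed.
heap = [3, 2, 1, 0, 7, 6, 5, 4, 11, 10, 9, 8, 15, 14, 13, 12]

tuple_corr = tuple_correlation(heap)
block_corr = block_correlation(heap, tuples_per_block=4)
print(f"tuple-level: {tuple_corr:.3f}  block-level: {block_corr:.3f}")
```

For this heap the tuple-level figure comes out at roughly 0.88 while the block-level figure is 1.0, even though an index scan in key order would read each block exactly once. That gap is the within-block misordering that should not count against the scan.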

