Re: new correlation metric - Mailing list pgsql-hackers

From	Martijn van Oosterhout
Subject	Re: new correlation metric
Date	October 26, 2008 10:49:57
Msg-id	20081026134943.GA8427@svana.org
In response to	new correlation metric (Jeff Davis <pgsql@j-davis.com>)
Responses	Re: new correlation metric Re: new correlation metric Re: new correlation metric
List	pgsql-hackers

On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
> I worked with Nathan Boley to come up with what we think is a better
> metric for measuring this cost. It is based on the number of times in
> the ordered sample that you have to physically backtrack (i.e. the data
> value increases, but the physical position is earlier).
>
> For example, if the table's physical order is
>
> 6 7 8 9 10 1 2 3 4 5

How does it deal with a case like the following:

1 6 2 7 3 8 4 9 5 10  (interleaving)

ISTM that your code will overestimate the cost whereas the old code
wouldn't have done too badly.
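
For anyone who wants to compare the two layouts by hand, here is a minimal sketch. It is not the code from Jeff's patch: `count_backtracks` is only a literal reading of the description quoted above (count how often the physical position goes backwards while walking the sample in value order), and `rank_position_correlation` is a plain Pearson correlation between value rank and physical position, standing in for the old metric. Both function names are made up for this example, and the sketch assumes distinct values and at least two tuples.

```python
from statistics import mean


def positions_in_value_order(physical):
    """Physical positions of the tuples, listed in increasing value order."""
    return sorted(range(len(physical)), key=lambda i: physical[i])


def count_backtracks(physical):
    """Count steps where the value increases but the position moves earlier.

    Assumption: a literal reading of the quoted description, not the patch.
    """
    pos = positions_in_value_order(physical)
    return sum(1 for prev, cur in zip(pos, pos[1:]) if cur < prev)


def rank_position_correlation(physical):
    """Pearson correlation between value rank and physical position.

    Assumption: a stand-in for the old correlation statistic.
    """
    pos = positions_in_value_order(physical)
    ranks = range(len(pos))
    mean_rank, mean_pos = mean(ranks), mean(pos)
    cov = sum((r - mean_rank) * (p - mean_pos) for r, p in zip(ranks, pos))
    var_rank = sum((r - mean_rank) ** 2 for r in ranks)
    var_pos = sum((p - mean_pos) ** 2 for p in pos)
    return cov / (var_rank * var_pos) ** 0.5


rotated = [6, 7, 8, 9, 10, 1, 2, 3, 4, 5]       # layout from the quoted mail
interleaved = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]   # interleaved layout

for name, layout in (("rotated", rotated), ("interleaved", interleaved)):
    print(name, count_backtracks(layout), rank_position_correlation(layout))
```

Neither number says anything about page boundaries or caching, so treat both as rough indicators only.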

I think the code is headed in the right direction, but I think what you want
is some kind of estimate of "given I've looked for tuple X, how many
tuples in the next k pages are near this one". Unfortunately I don't see
a way of calculating it other than a full simulation.
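
To make "full simulation" concrete, one possible shape is sketched below. It is purely illustrative: the function name `simulate_page_fetches`, the fixed `tuples_per_page`, and the small LRU cache of `cache_pages` pages (playing the role of "the next k pages") are all assumptions made for this example, and nothing like it exists in the planner.

```python
from collections import OrderedDict


def simulate_page_fetches(physical, tuples_per_page, cache_pages):
    """Count page reads for a scan in value order with an LRU page cache.

    physical[i] is the value stored at heap position i (values distinct).
    """
    # Heap positions visited in increasing value order, as an index scan would.
    positions = sorted(range(len(physical)), key=lambda i: physical[i])
    cache = OrderedDict()  # page number -> None, least recently used first
    fetches = 0
    for pos in positions:
        page = pos // tuples_per_page
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            fetches += 1                   # miss: the page has to be read
            cache[page] = None
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return fetches


rotated = [6, 7, 8, 9, 10, 1, 2, 3, 4, 5]
interleaved = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]

for k in (1, 5):
    print(k,
          simulate_page_fetches(rotated, tuples_per_page=2, cache_pages=k),
          simulate_page_fetches(interleaved, tuples_per_page=2, cache_pages=k))
```

With 2 tuples per page, the interleaved layout needs 10 page reads with a one-page cache but only 5 once all five pages fit, while the rotated layout goes from 6 to 5. That sensitivity to k is what a single summary number has trouble capturing.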

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
