Home > mailing lists

Re: Large index scan perfomance and indexCorrelation (PG - Mailing list pgsql-performance

From	Simon Riggs
Subject	Re: Large index scan perfomance and indexCorrelation (PG
Date	June 27, 2006 19:04:36
Msg-id	1151445858.2479.170.camel@localhost.localdomain Whole thread
In response to	Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32) (Andrew Sagulin <andrews42@yandex.ru>)
Responses	Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)
List	pgsql-performance

Tree view

On Tue, 2006-06-27 at 16:14 +0400, Andrew Sagulin wrote:

> Result showed up that there were no page seq scan at all - true random access
> only.
> The simple model which can explain the situation: the sequence of numbers 2, 1,
> 4, 3, 6, 5, ..., 100, 99 has correlation about 0,9994. Let's imagine it's the page
> order of an index scan. H'm, bad choice, isn't it?

Your example is only possible if whole blocks of data were out of order,
which I guess is possible within a multi-million row table. Slightly out
of order values would be ignored, since I/O works at the block rather
than the tuple level.

ANALYZE doesn't cope well with tables as large as you have. It doesn't
sample enough rows, nor does it look within single blocks/groups to
discover anomalies such as yours. As a result, the data that is sampled
looks almost perfectly ordered, though the main bulk is not.

I think what you are also pointing out is that the assumption of the
effects of correlation doesn't match the current readahead logic of
filesystems. If we were to actively force a readahead stride of 2 for
this scan (somehow), then the lack of correlation would disappear
completely. IIRC the current filesystem readahead logic would find that
such a sequence would be seen as random, and so no readahead would be
performed at all - even though the data is highly correlated. That isn't
PostgreSQL's fault directly, since the readahead ought to work better
than it does, but we fail indirectly by relying upon it in this case.

> I think indexCorrelation can help to estimate page count but not page
> fetch cost. Why not to use formula
>
> min_IO_cost = ceil(indexSelectivity * T) * random_page_cost
>
> instead of
>
> min_IO_cost = ceil(indexSelectivity * T) ?

That part is sensible. The min_IO_cost is when the access is sequential,
which by definition has a cost of 1.0.

The bit you might have issue with is how we extrapolate from the
min_IO_cost and correlation to arrive at a cost.

--
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com

pgsql-performance by date:

From: "Andrus"
Date: 27 June 2006, 15:02:10
Subject: Re: why group expressions cause query to run forever

From: Andrew Sagulin
Date: 28 June 2006, 06:34:14
Subject: Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)

Re: Large index scan perfomance and indexCorrelation (PG - Mailing list pgsql-performance

Previous

Next