Re: Large index scan perfomance and indexCorrelation (PG - Mailing list pgsql-performance

From Simon Riggs
Subject Re: Large index scan perfomance and indexCorrelation (PG
Date
Msg-id 1151445858.2479.170.camel@localhost.localdomain
Whole thread Raw
In response to Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)  (Andrew Sagulin <andrews42@yandex.ru>)
Responses Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)
List pgsql-performance
On Tue, 2006-06-27 at 16:14 +0400, Andrew Sagulin wrote:

> Result showed up that there were no page seq scan at all - true random access
> only.
> The simple model which can explain the situation: the sequence of numbers 2, 1,
> 4, 3, 6, 5, ..., 100, 99 has correlation about 0,9994. Let's imagine it's the page
> order of an index scan. H'm, bad choice, isn't it?

Your example is only possible if whole blocks of data were out of order,
which I guess is possible within a multi-million row table. Slightly out
of order values would be ignored, since I/O works at the block rather
than the tuple level.

ANALYZE doesn't cope well with tables as large as you have. It doesn't
sample enough rows, nor does it look within single blocks/groups to
discover anomalies such as yours. As a result, the data that is sampled
looks almost perfectly ordered, though the main bulk is not.

I think what you are also pointing out is that the assumption of the
effects of correlation doesn't match the current readahead logic of
filesystems. If we were to actively force a readahead stride of 2 for
this scan (somehow), then the lack of correlation would disappear
completely. IIRC the current filesystem readahead logic would find that
such a sequence would be seen as random, and so no readahead would be
performed at all - even though the data is highly correlated. That isn't
PostgreSQL's fault directly, since the readahead ought to work better
than it does, but we fail indirectly by relying upon it in this case.

> I think indexCorrelation can help to estimate page count but not page
> fetch cost. Why not to use formula
>
> min_IO_cost = ceil(indexSelectivity * T) * random_page_cost
>
> instead of
>
> min_IO_cost = ceil(indexSelectivity * T) ?

That part is sensible. The min_IO_cost is when the access is sequential,
which by definition has a cost of 1.0.

The bit you might have issue with is how we extrapolate from the
min_IO_cost and correlation to arrive at a cost.

--
  Simon Riggs
  EnterpriseDB          http://www.enterprisedb.com


pgsql-performance by date:

Previous
From: "Andrus"
Date:
Subject: Re: why group expressions cause query to run forever
Next
From: Andrew Sagulin
Date:
Subject: Large index scan perfomance and indexCorrelation (PG 8.1.4 Win32)