Re: CLUSTER and indisclustered - Mailing list pgsql-hackers

From	Hannu Krosing
Subject	Re: CLUSTER and indisclustered
Date	August 7, 2002 10:20:32
Msg-id	1028733736.13419.124.camel@taru.tm.ee
In response to	Re: CLUSTER and indisclustered (Curt Sampson <cjs@cynic.net>)
List	pgsql-hackers

On Wed, 2002-08-07 at 04:31, Curt Sampson wrote:
> On Sun, 4 Aug 2002, Mark Kirkwood wrote:
> 
> > Ok, this change would save you the initial access of the index
> > structure itself - but isn't the usual killer for indexes the
> > "thrashing" that happens when the "pointed to" table data is spread
> > over many pages?
> 
> Yeah, no kidding on this one. I've reduced queries from 75 seconds
> to 0.6 seconds by clustering on the appropriate field.
> 
> But after doing some benchmarking of various sorts of random reads
> and writes, it occurred to me that there might be optimizations
> that could help a lot with this sort of thing. What if, when we've
> got an index block with a bunch of entries, instead of doing the
> reads in the order of the entries, we do them in the order of the
> blocks the entries point to? That would introduce a certain amount
> of "sequentialness" to the reads that the OS is not capable of
> introducing (since it can't reschedule the reads you're doing, the
> way it could reschedule, say, random writes).
>
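A rough way to picture the reordering proposed above, as a minimal sketch rather than anything from the PostgreSQL source: assume each index entry carries a TID of the form (heap block, offset), and count how often consecutive reads land on a different heap block. The names Tid, block_switches, in_block_order and the sample data are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical TID shape: (heap block number, offset within that block).
Tid = Tuple[int, int]


def block_switches(tids: List[Tid]) -> int:
    """Count how many times consecutive reads move to a different heap block."""
    switches = 0
    previous_block = None
    for block, _offset in tids:
        if block != previous_block:
            switches += 1
            previous_block = block
    return switches


def in_block_order(tids: List[Tid]) -> List[Tid]:
    """Return the same TIDs ordered by heap block, then by offset."""
    return sorted(tids)


# Made-up entries from one index block, in index (key) order.
index_page: List[Tid] = [(7, 1), (2, 3), (7, 2), (2, 1), (9, 4), (2, 2)]

print(block_switches(index_page))                  # reads in index order
print(block_switches(in_block_order(index_page)))  # reads in heap-block order
```

Reading in heap-block order touches each block once per batch, which is the "sequentialness" the quoted text is after; the price is that tuples no longer come back in index key order.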

I guess this could be solved elegantly using threading - one thread
scans the index and pushes tids into a btree or some other sorted
structure, while another thread loops continuously (or "elevatorly",
back and forth) over that structure in tuple order and does the actual
data reads.

This would have the added benefit of better utilising multiprocessor
computers.
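
The two-thread arrangement described above could look roughly like the following. This is only a sketch under stated assumptions, not PostgreSQL code: a plain sorted list stands in for the "btree or some other sorted structure", TidScheduler and run_scan are hypothetical names, and read_tuple is a caller-supplied stand-in for the actual heap read. The reader takes whatever has accumulated as one sweep and alternates direction between sweeps.

```python
import bisect
import threading
from typing import Callable, Iterable, List, Tuple

# Hypothetical TID shape: (heap block number, offset within that block).
Tid = Tuple[int, int]


class TidScheduler:
    """Shared sorted buffer of TIDs waiting to be read from the heap."""

    def __init__(self) -> None:
        self._pending: List[Tid] = []
        self._cond = threading.Condition()
        self._scan_done = False

    def push(self, tid: Tid) -> None:
        """Called by the index scanner: insert while keeping block order."""
        with self._cond:
            bisect.insort(self._pending, tid)
            self._cond.notify()

    def finish(self) -> None:
        """Called by the index scanner once the index scan is complete."""
        with self._cond:
            self._scan_done = True
            self._cond.notify()

    def take_batch(self) -> List[Tid]:
        """Wait for pending TIDs (or end of scan) and hand them over sorted.

        An empty list means the scan is finished and nothing is left.
        """
        with self._cond:
            while not self._pending and not self._scan_done:
                self._cond.wait()
            batch, self._pending = self._pending, []
            return batch


def run_scan(index_entries: Iterable[Tid],
             read_tuple: Callable[[Tid], None]) -> None:
    scheduler = TidScheduler()

    def index_scanner() -> None:
        for tid in index_entries:
            scheduler.push(tid)
        scheduler.finish()

    def heap_reader() -> None:
        ascending = True
        while True:
            batch = scheduler.take_batch()
            if not batch:
                return
            if not ascending:
                batch.reverse()  # sweep back down, elevator style
            for tid in batch:
                read_tuple(tid)
            ascending = not ascending

    threads = [threading.Thread(target=index_scanner),
               threading.Thread(target=heap_reader)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Called as run_scan(entries, reads.append), this visits every TID exactly once. The order in which they arrive depends on how the two threads interleave, so a caller that needs index key order would have to re-sort the results.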

---------------
Hannu
