Re: Scaling SELECT:s with the number of disks on a stripe

From: Peter Schuller
Subject: Re: Scaling SELECT:s with the number of disks on a stripe
Msg-id: 20070404174448.GA6918@hyperion.scode.org
In response to: Re: Scaling SELECT:s with the number of disks on a stripe  (Andrew - Supernews)
List: pgsql-performance

Hello,

> I'd always do benchmarks with a realistic value of shared_buffers (i.e.
> much higher than that).
>
> Another thought that comes to mind is that the bitmap index scan does
> depend on the size of work_mem.
>
> Try increasing your shared_buffers to a reasonable working value (say
> 10%-15% of RAM - I was testing on a machine with 4GB of RAM, using a
> shared_buffers setting of 50000), and increase work_mem to 16364, and
> see if there are any noticeable changes in behaviour.
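
For a sense of scale, the suggested values can be converted into memory
sizes. This is only a sketch: it assumes the default 8 kB block size, that
shared_buffers is given as a number of buffers, and that work_mem is given
in kB. None of that is stated in the thread, and the helper names are made
up for illustration.

```python
# Sketch: convert the settings quoted above into memory sizes.
# Assumptions (not stated in the thread): the default 8 kB block size,
# shared_buffers given as a count of buffers, work_mem given in kB.

BLOCK_SIZE_BYTES = 8192  # assumed default block size
MIB = 1024 * 1024


def shared_buffers_mib(n_buffers: int, block_size: int = BLOCK_SIZE_BYTES) -> float:
    """Size of the buffer pool in MiB for a given buffer count."""
    return n_buffers * block_size / MIB


def work_mem_mib(work_mem_kb: int) -> float:
    """Per-operation working memory in MiB for a value given in kB."""
    return work_mem_kb / 1024


def fraction_of_ram(size_mib: float, ram_gib: float) -> float:
    """Share of physical RAM taken by a region of size_mib."""
    return size_mib / (ram_gib * 1024)


if __name__ == "__main__":
    pool = shared_buffers_mib(50000)
    print(f"shared_buffers=50000 -> {pool:.1f} MiB "
          f"({fraction_of_ram(pool, 4):.1%} of 4 GiB)")
    print(f"work_mem=16364 -> {work_mem_mib(16364):.2f} MiB")
```

Under those assumptions, 50000 buffers is roughly 390 MiB, just under 10%
of a 4 GB machine, and work_mem of 16364 is about 16 MiB per sort or bitmap
operation.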

Increasing the buffer size and work_mem did have a significant
effect. I can understand it in the case of the heap scan, but I am
still surprised at the index scan. Could pg be serializing the entire
query as a result of insufficient buffers/work_mem to satisfy multiple
concurrent queries?

With both turned up, not only is the heap scan no longer visibly CPU
bound, I am seeing some nice scaling in terms of disk I/O. I have not
yet benchmarked to the point of being able to say whether it's
entirely linear, but it certainly seems to at least be approaching the
ballpark.
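
One way to check how close to linear the scaling is would be to compare
measured throughput against an ideal extrapolated from the smallest
configuration. The sketch below is hypothetical: the function name is made
up, and the numbers in the example are placeholders, not measurements from
this thread.

```python
# Sketch: scaling efficiency relative to perfectly linear scaling.
# 1.0 means throughput grew in proportion to the number of disks.


def scaling_efficiency(throughput_by_disks: dict[int, float]) -> dict[int, float]:
    """Map disk count -> measured throughput / ideal linear throughput.

    The ideal is extrapolated from the smallest disk count measured.
    """
    baseline_disks = min(throughput_by_disks)
    per_disk = throughput_by_disks[baseline_disks] / baseline_disks
    return {
        n: measured / (n * per_disk)
        for n, measured in sorted(throughput_by_disks.items())
    }


if __name__ == "__main__":
    # Placeholder figures (queries per second), NOT real results.
    example = {1: 100.0, 2: 190.0, 4: 360.0}
    for disks, eff in scaling_efficiency(example).items():
        print(f"{disks} disk(s): {eff:.0%} of linear")
```

Values that stay near 1.0 as disks are added would support the "approaching
linear" impression; a steady decline would point to another bottleneck.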

Thank you for the help! I guess I made a bad call in not tweaking
these settings. My thinking was that I explicitly did not want to turn
them up, so that I could benchmark the raw performance of disk I/O rather
than having things cached in memory more than they already would be. But
apparently that had other side effects I did not consider.

Thanks again,

--
/ Peter Schuller

PGP userID: 0xE9758B7D
Web: http://www.scode.org

