Re: [ADMIN] Quad processor options - summary - Mailing list pgsql-performance

From James Thornton
Subject Re: [ADMIN] Quad processor options - summary
Date
Msg-id 40A40670.5030901@jamesthornton.com
In response to Re: [ADMIN] Quad processor options  ("scott.marlowe" <scott.marlowe@ihs.com>)
List pgsql-performance
Hadley Willan wrote:

> To answer question 1, if you use software RAID the chunk size is part of
> the /etc/raidtab file that is used on initial container creation. 4KB is
> the standard, and a LARGE chunk size of 1MB may affect performance if
> you're not writing down to blocks in that size continuously.  If you
> make it too big and you're constantly needing to write out smaller chunks
> of information, then you will find the disk "always" working, which would
> be an inefficient use of the blocks. There is some free info around
> about calculating the ideal chunk size. Look for "Calculating chunk
> size for RAID" on Google.

"Why does the SAME configuration recommend a one megabyte stripe width?
Let’s examine the reasoning behind this choice. Why not use a stripe
depth smaller than one megabyte? Smaller stripe depths can improve disk
throughput for a single process by spreading a single IO across multiple
disks. However IOs that are much smaller than a megabyte can cause seek
time to become a large fraction of the total IO time. Therefore, the
overall efficiency of the storage system is reduced. In some cases it
may be worth trading off some efficiency for the increased throughput
that smaller stripe depths provide. In general it is not necessary to do
this though. Parallel execution at database level achieves high disk
throughput while keeping efficiency high. Also, remember that the degree
of parallelism can be dynamically tuned, whereas the stripe depth is
very costly to change.

Why not use a stripe depth bigger than one megabyte? One megabyte is
large enough that a sequential scan will spend most of its time
transferring data instead of positioning the disk head. A bigger stripe
depth will improve scan efficiency but only modestly. One megabyte is
small enough that a large IO operation will not “hog” a single disk for
very long before moving to the next one. Further, one megabyte is small
enough that Oracle’s asynchronous readahead operations access multiple
disks. One megabyte is also small enough that a single stripe unit will
not become a hot-spot. Any access hot-spot that is smaller than a
megabyte should fit comfortably in the database buffer cache. Therefore
it will not create a hot-spot on disk."
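
The seek-time argument in the quoted passage can be put into rough numbers. The sketch below is only an illustration: the 10 ms positioning time and the 50 MB/s transfer rate are assumed figures, not values from the SAME paper, and transfer_fraction is a made-up helper name.

```python
# Rough model: one IO costs one head positioning plus the data transfer.
# ASSUMPTIONS (not from the paper): 10 ms positioning, 50 MB/s transfer.
MB = 1024 * 1024


def transfer_fraction(io_bytes, positioning_s=0.010, rate_bytes_per_s=50 * MB):
    """Fraction of total IO time spent transferring data, not positioning."""
    transfer_s = io_bytes / rate_bytes_per_s
    return transfer_s / (positioning_s + transfer_s)


if __name__ == "__main__":
    for size in (4 * 1024, 64 * 1024, MB, 4 * MB):
        share = transfer_fraction(size)
        print(f"{size // 1024:>5} KB IO: {share:.1%} of the time transferring")
```

Under these assumed figures a 64 KB IO spends roughly 11% of its time transferring data, a one megabyte IO about two thirds, and a 4 MB IO just under 90%, which matches the paper's claim that going beyond one megabyte helps only modestly.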

The SAME configuration paper says that, to ensure large IO operations
aren't broken up between the DB and the disk, the database's multi-block
read size (Oracle has a param called db_file_multiblock_read_count, does
Postgres?) should match the stripe width, and the OS IO limits should be
at least this size.
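
As a minimal sketch of that sizing rule: the multi-block read count is the stripe width divided by the database block size, and the OS limit has to be at least the stripe width. The 8 KB block size and the 512 KB OS limit below are assumptions for illustration, and both helper names are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def multiblock_read_count(stripe_bytes, block_bytes):
    """Number of database blocks that make up exactly one stripe-width read."""
    if stripe_bytes % block_bytes != 0:
        raise ValueError("stripe width is not a whole number of database blocks")
    return stripe_bytes // block_bytes


def io_stays_whole(stripe_bytes, os_max_io_bytes):
    """True if the OS IO limit is large enough not to split a stripe-width IO."""
    return os_max_io_bytes >= stripe_bytes


if __name__ == "__main__":
    # ASSUMPTION: 8 KB database blocks, 1 MB stripe width, 512 KB OS IO limit.
    print(multiblock_read_count(MB, 8 * KB))
    print(io_stays_whole(MB, 512 * KB))
```

With 8 KB blocks and a one megabyte stripe width the count works out to 128 blocks, and an OS limit of 512 KB would fail the check because it would split each read in two.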

Also, it says, "Ideally we would like to stripe the log files using the
same one megabyte stripe width as the rest of the files. However, the
log files are written sequentially, and many storage systems limit the
maximum size of a single write operation to one megabyte (or even less).
If the maximum write size is limited, then using a one megabyte stripe
width for the log files may not work well. In this case, a smaller
stripe width such as 64K may work better. Caching RAID controllers are
an exception to this. If the storage subsystem can cache write
operations in nonvolatile RAM, then a one megabyte stripe width will
work well for the log files. In this case, the write operation will be
buffered in cache and the next log writes can be issued before the
previous write is destaged to disk."
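
The log-file advice reads like a small decision rule, sketched below. This is my reading of the quoted paragraph, not code from the paper: log_stripe_width is a hypothetical helper, a write limit of one megabyte or less is treated as "limited", and 64K is used only because the paper names it as an example.

```python
KB = 1024
MB = 1024 * KB


def log_stripe_width(max_write_bytes, has_nonvolatile_write_cache):
    """Pick a stripe width for log files following the quoted advice.

    max_write_bytes: largest single write the storage accepts, or None
    if there is no known limit (ASSUMPTION: None means unlimited).
    """
    if has_nonvolatile_write_cache:
        # Writes are buffered in cache, so one megabyte works well.
        return MB
    if max_write_bytes is not None and max_write_bytes <= MB:
        # Write size is limited: fall back to a smaller width "such as 64K".
        return 64 * KB
    return MB


if __name__ == "__main__":
    print(log_stripe_width(256 * KB, False) // KB, "KB")
    print(log_stripe_width(256 * KB, True) // KB, "KB")
```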


--

  James Thornton
______________________________________________________
Internet Business Consultant, http://jamesthornton.com

