Re: New to PostgreSQL, performance considerations - Mailing list pgsql-performance
From | Ron |
---|---|
Subject | Re: New to PostgreSQL, performance considerations |
Date | |
Msg-id | E1GvGJR-0006HJ-OQ@elasmtp-mealy.atl.sa.earthlink.net |
In response to | Re: New to PostgreSQL, performance considerations (Greg Smith <gsmith@gregsmith.com>) |
Responses | Re: New to PostgreSQL, performance considerations |
List | pgsql-performance |
At 09:50 AM 12/15/2006, Greg Smith wrote:
>On Fri, 15 Dec 2006, Merlin Moncure wrote:
>
>>The slower is probably due to the unroll loops switch which can
>>actually hurt code due to the larger footprint (less cache coherency).
>
>The cache issues are so important with current processors that I'd
>suggest throwing -Os (optimize for size) into the mix people
>test. That one may stack usefully with -O2, but probably not with
>-O3 (3 includes optimizations that increase code size).

-Os
Optimize for size. -Os enables all -O2 optimizations that do not
typically increase code size. It also performs further optimizations
designed to reduce code size.

-Os disables the following optimization flags:
-falign-functions -falign-jumps -falign-loops -falign-labels
-freorder-blocks -freorder-blocks-and-partition
-fprefetch-loop-arrays -ftree-vect-loop-version

Hmmm. That list of disabled flags bears thought.

-falign-functions -falign-jumps -falign-loops -falign-labels

1= Most RISC CPUs' performance is very sensitive to misalignment
issues. Not recommended to turn these off.

-freorder-blocks
Reorder basic blocks in the compiled function in order to reduce
number of taken branches and improve code locality.

Enabled at levels -O2, -O3.

-freorder-blocks-and-partition
In addition to reordering basic blocks in the compiled function, in
order to reduce number of taken branches, partitions hot and cold
basic blocks into separate sections of the assembly and .o files, to
improve paging and cache locality performance.

This optimization is automatically turned off in the presence of
exception handling, for link once sections, for functions with a
user-defined section attribute and on any architecture that does not
support named sections.

2= Most RISC CPUs are cranky about branchy code and (lack of) cache
locality. Wouldn't suggest punting these either.

-fprefetch-loop-arrays
If supported by the target machine, generate instructions to prefetch
memory to improve the performance of loops that access large arrays.

This option may generate better or worse code; results are highly
dependent on the structure of loops within the source code.

3= OTOH, this one looks worth experimenting with turning off.

-ftree-vect-loop-version
Perform loop versioning when doing loop vectorization on trees. When
a loop appears to be vectorizable except that data alignment or data
dependence cannot be determined at compile time then vectorized and
non-vectorized versions of the loop are generated along with runtime
checks for alignment or dependence to control which version is
executed. This option is enabled by default except at level -Os where
it is disabled.

4= ...and this one looks like a 50/50 shot.

Ron Peacetree
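
For anyone who wants to try the combinations discussed above, here is a rough sketch of how the test matrix could be enumerated. It assumes the plain -O2, -O3 and -Os levels as baselines, then -Os with the alignment and block-reordering flags from points 1 and 2 switched back on, with the two flags from points 3 and 4 toggled both ways. The helper names (toggle, candidate_cflags) are made up for illustration, the -fno- prefix is GCC's usual negative form of an -f flag, and whether a given GCC version honors every one of these flags on top of -Os is something to verify rather than assume.

```python
from itertools import product

# Flags the manual excerpt lists as disabled by -Os and that points 1 and 2
# argue should stay enabled.
ALIGN = ["-falign-functions", "-falign-jumps", "-falign-loops", "-falign-labels"]
REORDER = ["-freorder-blocks", "-freorder-blocks-and-partition"]

# Flags from points 3 and 4: worth trying both on and off.
EXPERIMENTAL = ["prefetch-loop-arrays", "tree-vect-loop-version"]


def toggle(name, enabled):
    """Return the -f<name> or -fno-<name> spelling of a GCC flag (hypothetical helper)."""
    return ("-f" if enabled else "-fno-") + name


def candidate_cflags():
    """Build the list of flag sets to benchmark (hypothetical helper)."""
    sets = [["-O2"], ["-O3"], ["-Os"]]  # plain baselines
    keep = ["-Os"] + ALIGN + REORDER    # -Os with points 1 and 2 re-enabled
    for states in product([True, False], repeat=len(EXPERIMENTAL)):
        extra = [toggle(n, on) for n, on in zip(EXPERIMENTAL, states)]
        sets.append(keep + extra)
    return sets


if __name__ == "__main__":
    for flags in candidate_cflags():
        print('CFLAGS="' + " ".join(flags) + '"')
```

Each printed line can be placed in front of ./configure for a separate build, and the same benchmark run against each build. Timing results will depend on the compiler version, CPU and workload, so none are suggested here.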