Re: Warm-cache prefetching - Mailing list pgsql-hackers
From | Qingqing Zhou |
---|---|
Subject | Re: Warm-cache prefetching |
Date | |
Msg-id | dnb6jk$ikl$1@news.hub.org |
In response to | Warm-cache prefetching (Qingqing Zhou <zhouqq@cs.toronto.edu>) |
List | pgsql-hackers |
"Luke Lonergan" <llonergan@greenplum.com> wrote
>
>> /* prefetch ahead */
>> __asm__ __volatile__ (
>> "1: prefetchnta 128(%0)\n"
>> : : "r" (s) : "memory" );
>
> I think this kind / grain of prefetch is handled as a compiler optimization
> in the latest GNU compilers, and further there are some memory streaming
> operations for the Pentium 4 ISA that are now part of the standard compiler
> optimizations done by gcc.
>

Is any special gcc optimization flag needed to enable this? I just tried both
2.96 and 4.0.1 with -O2. Unfortunately, sse_clear_page() dumps core when built
with 4.0.1, at this line:

    __asm__ __volatile__ (" movntdq %%xmm0, %0"::"m"(sse_save[0]) );

So I removed this test (sorry ...). There is no non-trivial difference between
the two builds AFAICS. The results are attached.

I will look into the other parts of your thread tomorrow,

Regards,
Qingqing

---

*#ll prefp3-*
-rwx------ 1 zhouqq jmgrp 38k Dec 9 00:49 prefp3-296
-rwx------ 1 zhouqq jmgrp 16k Dec 9 00:49 prefp3-401

*#./prefp3-296
2392.975 MHz
clear_page function 'gcc clear_page()' took 27142 cycles per page (172.7 MB/s)
clear_page function 'normal clear_page()' took 27161 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page() ' took 17293 cycles per page (271.1 MB/s)
clear_page function 'gcc clear_page()' took 27174 cycles per page (172.5 MB/s)
clear_page function 'normal clear_page()' took 27142 cycles per page (172.7 MB/s)
clear_page function 'mmx clear_page() ' took 17291 cycles per page (271.1 MB/s)
copy_page function 'normal copy_page()' took 18552 cycles per page (252.7 MB/s)
copy_page function 'mmx copy_page() ' took 12511 cycles per page (374.6 MB/s)
copy_page function 'sse copy_page() ' took 12318 cycles per page (380.5 MB/s)

*#./prefp3-401
2392.970 MHz
clear_page function 'gcc clear_page()' took 27120 cycles per page (172.8 MB/s)
clear_page function 'normal clear_page()' took 27151 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page() ' took 17295 cycles per page (271.0 MB/s)
clear_page function 'gcc clear_page()' took 27152 cycles per page (172.6 MB/s)
clear_page function 'normal clear_page()' took 27114 cycles per page (172.9 MB/s)
clear_page function 'mmx clear_page() ' took 17296 cycles per page (271.0 MB/s)
copy_page function 'normal copy_page()' took 18586 cycles per page (252.2 MB/s)
copy_page function 'mmx copy_page() ' took 12620 cycles per page (371.4 MB/s)
copy_page function 'sse copy_page() ' took 12698 cycles per page (369.1 MB/s)
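
For comparison with the quoted prefetchnta inline assembly, here is a minimal sketch of the same idea using GCC's __builtin_prefetch, which leaves instruction selection to the compiler. The function name sum_with_prefetch, the PREFETCH_AHEAD constant, and the 64-byte cache-line stride are assumptions made for illustration and do not come from the test program above. With a locality argument of 0 the builtin requests a non-temporal hint, which on x86 targets with SSE typically becomes prefetchnta, though that depends on the compiler version and target flags. On the flag question, newer GCC releases also document a -fprefetch-loop-arrays option; whether a given -O level turns it on varies by version and target, so it is worth checking the manual rather than assuming.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constant: bytes to look ahead, mirroring the 128(%0)
 * offset in the quoted inline assembly. */
#define PREFETCH_AHEAD 128

/* Hypothetical helper, not part of the original benchmark: sums a
 * buffer while hinting the cache about data a fixed distance ahead. */
uint64_t sum_with_prefetch(const unsigned char *s, size_t n)
{
    uint64_t total = 0;

    for (size_t i = 0; i < n; i++) {
        /* Once per assumed 64-byte cache line, issue a read prefetch
         * (rw = 0) with no temporal locality (locality = 0) for the
         * data PREFETCH_AHEAD bytes further on. The bounds check keeps
         * the hinted address inside the buffer. */
        if ((i & 63) == 0 && i + PREFETCH_AHEAD < n)
            __builtin_prefetch(s + i + PREFETCH_AHEAD, 0, 0);

        total += s[i];
    }
    return total;
}
```

The prefetch is only a hint, so the function returns the same sum with or without it; any speed difference would have to be measured, as in the cycle counts above.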