Re: Warm-cache prefetching - Mailing list pgsql-hackers

From Qingqing Zhou
Subject Re: Warm-cache prefetching
Date
Msg-id dnb6jk$ikl$1@news.hub.org
In response to Warm-cache prefetching  (Qingqing Zhou <zhouqq@cs.toronto.edu>)
List pgsql-hackers
"Luke Lonergan" <llonergan@greenplum.com> wrote:
>
>> /* prefetch ahead */
>> __asm__ __volatile__ (
>>     "1: prefetchnta 128(%0)\n"
>>     : : "r" (s) : "memory" );
>
> I think this kind / grain of prefetch is handled as a compiler
> optimization in the latest GNU compilers, and further there are some
> memory streaming operations for the Pentium 4 ISA that are now part of
> the standard compiler optimizations done by gcc.
>
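
For comparison with the inline asm quoted above, here is a minimal sketch of the same hint written with GCC's __builtin_prefetch, which lets the compiler pick the instruction for the target. The function name sum_with_prefetch and the 128-byte look-ahead are illustrative assumptions, not code from the test program. GCC also documents a -fprefetch-loop-arrays option for compiler-generated prefetches; whether it is active at -O2 depends on the GCC version and target, so treat that as something to check rather than a given.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the original benchmark: sums an int
 * array while hinting the cache about data roughly 128 bytes ahead. */
static long sum_with_prefetch(const int *s, size_t n)
{
    /* Look-ahead distance in elements; 128 bytes mirrors the offset in
     * the quoted "prefetchnta 128(%0)" and is an assumption, not a
     * tuned value. */
    const size_t ahead = 128 / sizeof(int);
    long total = 0;

    for (size_t i = 0; i < n; i++) {
        /* rw = 0 means "prefetch for read"; locality = 0 means the data
         * has no temporal locality, which on x86 typically maps to
         * prefetchnta.  The bounds check keeps the address expression
         * inside the array. */
        if (i + ahead < n)
            __builtin_prefetch(&s[i + ahead], 0, 0);
        total += s[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; any speed difference has to be measured on the machine in question.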

Is there a particular gcc optimization flag needed to enable this? I just
tried both 2.96 and 4.0.1 with -O2. Unfortunately, sse_clear_page() dumps
core when built with 4.0.1, at this line:
       __asm__ __volatile__ (" movntdq %%xmm0, %0"::"m"(sse_save[0]) );

So I removed this test (sorry ...). AFAICS there is no significant
difference between the two compilers. The results are attached. I will look
into the other parts of your thread tomorrow.
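
One possible cause of the movntdq crash, not confirmed anywhere in this thread, is operand alignment: movntdq faults when its memory operand is not 16-byte aligned, and a different compiler version may lay out a buffer such as sse_save differently. The sketch below shows a non-temporal page clear that checks alignment first. The names page_buf and clear_page_nt are hypothetical and are not taken from the test program; the 4096-byte page size is also an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define PAGE_SIZE 4096  /* assumed page size */

/* Hypothetical buffer: the explicit 16-byte alignment is what a
 * movntdq memory operand requires. */
static unsigned char page_buf[PAGE_SIZE] __attribute__((aligned(16)));

/* Clears one page with non-temporal stores.
 * Returns 0 on success, -1 if the pointer is not 16-byte aligned. */
static int clear_page_nt(void *page)
{
    if (((uintptr_t) page & 15) != 0)
        return -1;              /* would fault in movntdq; refuse instead */

#if defined(__SSE2__)
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *) page;

    for (size_t i = 0; i < PAGE_SIZE / sizeof(__m128i); i++)
        _mm_stream_si128(p + i, zero);  /* emitted as movntdq */
    _mm_sfence();               /* order the streaming stores */
#else
    memset(page, 0, PAGE_SIZE); /* fallback when SSE2 is unavailable */
#endif
    return 0;
}
```

Using the intrinsic rather than hand-written asm leaves register allocation to the compiler, so there is no need to save and restore xmm0 by hand.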

Regards,
Qingqing

---

*#ll prefp3-*
-rwx------    1 zhouqq   jmgrp         38k Dec  9 00:49 prefp3-296
-rwx------    1 zhouqq   jmgrp         16k Dec  9 00:49 prefp3-401
*#./prefp3-296
2392.975 MHz
clear_page function 'gcc clear_page()'    took 27142 cycles per page (172.7 MB/s)
clear_page function 'normal clear_page()' took 27161 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page()   ' took 17293 cycles per page (271.1 MB/s)
clear_page function 'gcc clear_page()'    took 27174 cycles per page (172.5 MB/s)
clear_page function 'normal clear_page()' took 27142 cycles per page (172.7 MB/s)
clear_page function 'mmx clear_page()   ' took 17291 cycles per page (271.1 MB/s)

copy_page function 'normal copy_page()' took 18552 cycles per page (252.7 MB/s)
copy_page function 'mmx copy_page()   ' took 12511 cycles per page (374.6 MB/s)
copy_page function 'sse copy_page()   ' took 12318 cycles per page (380.5 MB/s)
*#./prefp3-401
2392.970 MHz
clear_page function 'gcc clear_page()'    took 27120 cycles per page (172.8 MB/s)
clear_page function 'normal clear_page()' took 27151 cycles per page (172.6 MB/s)
clear_page function 'mmx clear_page()   ' took 17295 cycles per page (271.0 MB/s)
clear_page function 'gcc clear_page()'    took 27152 cycles per page (172.6 MB/s)
clear_page function 'normal clear_page()' took 27114 cycles per page (172.9 MB/s)
clear_page function 'mmx clear_page()   ' took 17296 cycles per page (271.0 MB/s)

copy_page function 'normal copy_page()' took 18586 cycles per page (252.2 MB/s)
copy_page function 'mmx copy_page()   ' took 12620 cycles per page (371.4 MB/s)
copy_page function 'sse copy_page()   ' took 12698 cycles per page (369.1 MB/s)



