Re: Warm-cache prefetching - Mailing list pgsql-hackers

From Qingqing Zhou
Subject Re: Warm-cache prefetching
Msg-id dnfsv4$1r7m$1@news.hub.org
In response to Warm-cache prefetching  (Qingqing Zhou <zhouqq@cs.toronto.edu>)
List pgsql-hackers
"Simon Riggs" <simon@2ndquadrant.com> wrote
>
> You may be trying to use the memory too early. Prefetched memory takes
> time to arrive in cache, so you may need to issue prefetch calls for N
> +2, N+3 etc rather than simply N+1.
>
> p.6-11 covers this.
>

I actually tried that, but observed no improvement. Also, it may conflict 
with the "try to mix prefetch with computation" suggestion from the manual 
you pointed out. But anyway, that part looks fixable compared to the 
following "prefetch distance" problem. As I read it in the manual, the 
prefetch distance is a key factor in efficiency, which also matches our 
intuition. However, as we process the tuples on a page, the CPU cycles 
needed per tuple can vary quite a bit:

---
for (each tuple on a page)
{
    if (ItemIdIsUsed(lpp))      /* some stop here */
    {
        ...                     /* some involve deeper function calls here */
        valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
        if (valid)
            scan->rs_vistuples[ntup++] = lineoff;
    }
}
---
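
To make the distance problem concrete, here is a minimal sketch (not the 
exact code I tried) of issuing prefetches a fixed number of tuples ahead in 
this loop, assuming GCC's __builtin_prefetch; PREFETCH_DISTANCE is a made-up 
tuning constant:

---
#define PREFETCH_DISTANCE 4     /* hypothetical knob */

for (lineoff = FirstOffsetNumber, lpp = PageGetItemId(dp, lineoff);
     lineoff <= lines;
     lineoff++, lpp++)
{
    if (lineoff + PREFETCH_DISTANCE <= lines)
    {
        ItemId  ahead = PageGetItemId(dp, lineoff + PREFETCH_DISTANCE);

        /* hint the cache about a tuple body we will touch soon */
        __builtin_prefetch((char *) dp + ItemIdGetOffset(ahead), 0, 1);
    }

    if (ItemIdIsUsed(lpp))
    {
        ...
        valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
        if (valid)
            scan->rs_vistuples[ntup++] = lineoff;
    }
}
---

The trouble is choosing PREFETCH_DISTANCE: when HeapTupleSatisfiesVisibility 
takes a fast exit, the prefetched line may not have arrived yet; when it goes 
through the deeper calls, the data may already have been evicted.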

So it is pretty hard to predict the right prefetch distance. The prefetch 
improvements to memcpy/memmove do not have this problem: the prefetch 
distance can be fixed, and it does not change across different-speed CPUs 
of the same processor series.
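
For comparison, here is a self-contained sketch of the fixed-distance case, 
in the style of a prefetching copy loop (copy_with_prefetch and 
PREFETCH_AHEAD are made-up names; __builtin_prefetch is the GCC intrinsic):

---
#include <stddef.h>
#include <string.h>

#define CACHE_LINE     64
#define PREFETCH_AHEAD (4 * CACHE_LINE)     /* fixed distance, tuned once */

static void
copy_with_prefetch(char *dst, const char *src, size_t len)
{
    size_t  i;

    for (i = 0; i + CACHE_LINE <= len; i += CACHE_LINE)
    {
        /* prefetch never faults, so running past the end is harmless */
        __builtin_prefetch(src + i + PREFETCH_AHEAD, 0, 0);
        memcpy(dst + i, src + i, CACHE_LINE);
    }
    if (i < len)
        memcpy(dst + i, src + i, len - i);
}
---

Because each iteration does the same amount of work, the lead time the 
prefetch needs is identical on every pass, so one distance can be tuned per 
processor series.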

Maybe the L2 cache is big enough that we need not worry about fetching too 
far ahead? That seems not to be true, since the idea is vulnerable on a busy 
system: nothing guarantees the prefetched data will stay in L2 for long.

As Luke suggested, the code above the scan operators, such as sort, might be 
a better place to look. I will take a look there.

Regards,
Qingqing