On Thu, Aug 27, 2020 at 8:48 PM Jakub Wartak <Jakub.Wartak@tomtom.com> wrote:
> I've tried to get cache misses ratio via PMCs, apparently on EC2 they are (even on bigger) reporting as not-supported
> or zeros.
I heard some of the counters are only available on their dedicated instance types.
> However interestingly the workload has IPC of 1.40 (instruction bound) which to me is strange as I would expect
> BufTableLookup() to be actually heavy memory bound (?) Maybe I'll try on some different hardware one day.
Hmm, OK now you've made me go and read a bunch of Brendan Gregg blogs
and try some experiments of my own to get a feel for this number and
what it might be telling us about the cache miss counters you can't
see. Since I know how to generate arbitrary cache miss workloads for
quick experiments using hash joins of different sizes, I tried that
and noticed that when LLC misses were at 76% (bad), IPC was at 1.69
which is still higher than what you're seeing. When the hash table
was much smaller and LLC misses were down to 15% (much better), IPC
was at 2.83. I know Gregg said[1] "An IPC < 1.0 likely means memory
bound, and an IPC > 1.0 likely means instruction bound", but that's
not what I'm seeing here, and that rule of thumb is disputed in his
comments section. So I'm not sure your IPC of 1.40 is evidence
against the memory-bound hypothesis on its own.
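
For reference, the two ratios discussed above can be derived from raw
counter values roughly as follows. This is only a minimal sketch: the
function names are hypothetical, and the sample counts are made up to
reproduce ratios of the same magnitude as the ones quoted above; they
are not measurements. The classifier simply encodes the quoted rule of
thumb so it can be compared against the numbers.

```python
def ipc(instructions: int, cycles: int) -> float:
    """Instructions retired per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def llc_miss_ratio(llc_misses: int, llc_references: int) -> float:
    """Fraction of last-level cache references that missed."""
    if llc_references <= 0:
        raise ValueError("llc_references must be positive")
    return llc_misses / llc_references


def rule_of_thumb(ipc_value: float) -> str:
    """The quoted heuristic: IPC < 1.0 vs IPC > 1.0."""
    if ipc_value < 1.0:
        return "likely memory bound"
    if ipc_value > 1.0:
        return "likely instruction bound"
    return "inconclusive"


if __name__ == "__main__":
    # Hypothetical counts, NOT real measurements: chosen so that the
    # ratios come out at IPC 1.69 with 76% LLC misses.
    sample_ipc = ipc(instructions=1_690_000, cycles=1_000_000)
    sample_miss = llc_miss_ratio(llc_misses=76_000, llc_references=100_000)
    print(f"IPC={sample_ipc:.2f} LLC miss ratio={sample_miss:.0%} "
          f"-> {rule_of_thumb(sample_ipc)}")
```

With inputs like these the heuristic reports "likely instruction
bound" even though three quarters of LLC references miss, which is the
mismatch described above.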
[1] http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html