Re: Handing off SLRU fsyncs to the checkpointer - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: Handing off SLRU fsyncs to the checkpointer
Date
Msg-id CA+hUKGLDgaNU2K2Nh4Qsy5vQvbEv+XXPNvGUMPhaSgt_0CJW_g@mail.gmail.com
In response to Re: Handing off SLRU fsyncs to the checkpointer  (Jakub Wartak <Jakub.Wartak@tomtom.com>)
List pgsql-hackers
On Thu, Aug 27, 2020 at 8:48 PM Jakub Wartak <Jakub.Wartak@tomtom.com> wrote:
> I've tried to get cache misses ratio via PMCs, apparently on EC2 they are
> (even on bigger) reporting as not-supported or zeros.
 

I heard some of the counters are only allowed on their dedicated instance types.

> However interestingly the workload has IPC of 1.40 (instruction bound) which
> to me is strange as I would expect BufTableLookup() to be actually heavy
> memory bound (?) Maybe I'll try on some different hardware one day.
 

Hmm, OK now you've made me go and read a bunch of Brendan Gregg blogs
and try some experiments of my own to get a feel for this number and
what it might be telling us about the cache miss counters you can't
see.  Since I know how to generate arbitrary cache miss workloads for
quick experiments using hash joins of different sizes, I tried that
and noticed that when LLC misses were at 76% (bad), IPC was at 1.69
which is still higher than what you're seeing.  When the hash table
was much smaller and LLC misses were down to 15% (much better), IPC
was at 2.83.  I know Gregg said[1] "An IPC < 1.0 likely means memory
bound, and an IPC > 1.0 likely means instruction bound", but that's
not what I'm seeing here, and in his comments section that is
disputed.  So I'm not sure your IPC of 1.40 is evidence against the
hypothesis on its own.
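
To make the comparison concrete, here is a minimal sketch that applies
the quoted rule of thumb to the three IPC figures mentioned in this
thread.  The function names and the labels for each observation are
made up for illustration; only the IPC and LLC miss numbers come from
the messages above.

```python
# Illustration only: function names and labels are hypothetical,
# only the IPC / LLC miss figures are taken from this thread.

def ipc(instructions: int, cycles: int) -> float:
    """Instructions per cycle from two raw counter values."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


def rule_of_thumb(ipc_value: float) -> str:
    """The quoted heuristic: IPC < 1.0 memory bound, > 1.0 instruction bound."""
    if ipc_value < 1.0:
        return "memory bound"
    if ipc_value > 1.0:
        return "instruction bound"
    return "undetermined"


# (label, IPC, LLC miss ratio or None where the counter was unavailable)
observations = [
    ("EC2 workload", 1.40, None),
    ("large hash join", 1.69, 0.76),
    ("small hash join", 2.83, 0.15),
]

verdicts = {name: rule_of_thumb(value) for name, value, _ in observations}

for name, value, miss in observations:
    miss_text = "n/a" if miss is None else f"{miss:.0%}"
    print(f"{name}: IPC={value:.2f} LLC-miss={miss_text} -> {verdicts[name]}")
```

All three cases land on the "instruction bound" side of the 1.0
threshold, including the hash join with 76% LLC misses, which is why
the threshold alone doesn't seem to separate them.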

[1] http://www.brendangregg.com/blog/2017-05-09/cpu-utilization-is-wrong.html


