On Thu, Apr 29, 2021 at 12:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> >> IOW, we've spent over twice as many CPU cycles shipping data to the
> >> standby as we did in applying the WAL on the standby.
>
> > I don't really know how the time calculation works on mac. Is there a
> > chance it includes time spent doing IO?
For comparison, on a modern Linux system I see numbers like this,
while running that 025_stream_rep_regress.pl test I posted in a nearby
thread:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
tmunro 2150863 22.5 0.0 55348 6752 ? Ss 12:59 0:07
postgres: standby_1: startup recovering 00000001000000020000003C
tmunro 2150867 17.5 0.0 55024 6364 ? Ss 12:59 0:05
postgres: standby_1: walreceiver streaming 2/3C675D80
tmunro 2150868 11.7 0.0 55296 7192 ? Ss 12:59 0:04
postgres: primary: walsender tmunro [local] streaming 2/3C675D80
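The ratio I mean can be read straight off the TIME columns; a throwaway sketch, with the figures copied from the ps output above (the helper name is mine, just for illustration):

```python
def mmss_to_seconds(t: str) -> int:
    """Convert a ps-style TIME field (MM:SS) to whole seconds."""
    minutes, seconds = t.split(":")
    return int(minutes) * 60 + int(seconds)

# CPU times copied from the ps output above.
startup = mmss_to_seconds("0:07")      # standby startup: applying WAL
walreceiver = mmss_to_seconds("0:05")  # standby: receiving the stream
walsender = mmss_to_seconds("0:04")    # primary: sending the stream

# Cycles spent shipping WAL vs. cycles spent applying it.
ratio = (walreceiver + walsender) / startup
print(f"shipping/apply ratio: {ratio:.2f}")  # ~1.29, vs >2 reported on the Mac
```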
Those ratios are better but it's still hard work, and perf shows the
CPU time is all in page cache schlep:
22.44% postgres [kernel.kallsyms] [k] copy_user_enhanced_fast_string
20.12% postgres [kernel.kallsyms] [k] __add_to_page_cache_locked
7.30% postgres [kernel.kallsyms] [k] iomap_set_page_dirty
That was with all three patches reverted, so it's nothing new.
Definitely room for improvement... there have been a few discussions
about not using a buffered file for high-frequency data exchange and
about relaxing various timing rules, which we should definitely look
into, but I wouldn't be at all surprised if HFS+ were just much worse
at this.
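To make the "not using a buffered file" idea concrete, the usual Linux move is O_DIRECT, which skips exactly the copy_user/__add_to_page_cache_locked work showing up in the profile above. A minimal sketch of my own, Linux-only and nothing to do with the actual patches; the file path is hypothetical, and the alignment rules are the usual catch:

```python
import mmap
import os

# O_DIRECT bypasses the kernel page cache, but requires the buffer,
# offset, and length to be block-aligned.
BLOCK = 4096
path = "/tmp/direct_demo.bin"  # hypothetical scratch file, not a real WAL path

try:
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_DIRECT, 0o600)
except OSError as e:
    # Some filesystems (notably tmpfs) reject O_DIRECT outright.
    print(f"O_DIRECT unsupported here: {e}")
else:
    try:
        buf = mmap.mmap(-1, BLOCK)  # anonymous mmap gives a page-aligned buffer
        buf.write(b"x" * BLOCK)
        os.pwrite(fd, buf, 0)       # one aligned block, no page-cache copy
        print("wrote one aligned block with O_DIRECT")
    finally:
        os.close(fd)
        os.unlink(path)
```

Whether that's actually a win for WAL shipping is a separate question, of course; losing the page cache means doing your own buffering and read-ahead.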
Thinking more about good old HFS+... I guess it's remotely possible
that there were coherency bugs in it that could be exposed by our
usage pattern, but that doesn't fit too well with the clues I have
from light reading: this is a non-SMP system, and it's said that HFS+
used to serialise pretty much everything on big filesystem locks
anyway.