hash_search_with_hash_value is high in "perf top" on a replica - Mailing list pgsql-hackers

From Dmitry Koterov
Subject hash_search_with_hash_value is high in "perf top" on a replica
Date
Msg-id CA+CZih5R_ZMjcuMHM3Ub10vTNDLEBjW8VsO3h-y2WKb=oK4fWw@mail.gmail.com
Whole thread Raw
List pgsql-hackers
Hi.

Debugging some replication lag on a replica when the master node experiences heavy writes.

PG "startup recovering" eats up a lot of CPU (like 65 %user and 30 %sys), which is a little surprising (what is it doing with all those CPU cycles? it looked like WAL replay should be more IO bound than CPU bound?).

Running "perf top -p <pid>", it shows this:

Samples: 1M of event 'cycles:P', 4000 Hz, Event count (approx.): 18178814660 lost: 0/0 drop: 0/0
Overhead  Shared Object     Symbol
  16.63%  postgres          [.] hash_search_with_hash_value
   5.38%  postgres          [.] __aarch64_ldset4_sync
   4.42%  postgres          [.] __aarch64_cas4_acq_rel
   3.42%  postgres          [.] XLogReadBufferExtended
   2.35%  libc.so.6         [.] 0x0000000000097f4c
   2.04%  postgres          [.] pg_comp_crc32c_armv8
   1.77%  [kernel]          [k] mutex_lock
   1.63%  postgres          [.] XLogPrefetcherReadRecord
   1.56%  postgres          [.] DecodeXLogRecord
   1.55%  postgres          [.] LWLockAcquire

This is more surprising: hash_search_with_hash_value is a hash table lookup function, is it expected that it is #1 in the profiler? (Called mostly from PrefetchSharedBuffer*.)

Configuration:
- PG17
- ZFS, compression is off, empty database (with compression on, most of "startup recovering" CPU was spent in ZFS guts doing compression/decompression according to the profiler)
- full_page_writes=off, recovery_prefetch=on (ZFS supports it, I tested with a small C program), wal_decode_buffer_size=2048kB (it doesn't seem to affect anything though).
- shared_buffers = 25% of RAM
- testing with a giant COPY command basically

My main question is about hash_search_with_hash_value CPU usage. With both recovery_prefetch=on and off, it is the topmost function in "perf top". I see no IO bottleneck on the machine, it's only "startup recovering" maxing out one CPU core.

Maybe it's a red herring though, but it looks pretty suspicious.

pgsql-hackers by date:

Previous
From: Matthias van de Meent
Date:
Subject: Re: why there is not VACUUM FULL CONCURRENTLY?
Next
From: Ashutosh Bapat
Date:
Subject: Re: EvictUnpinnedBuffer and buffer free list