Re: WIP: WAL prefetch (another approach) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: WIP: WAL prefetch (another approach)
Msg-id 20210429001224.cdkef4ybkw7gramv@alap3.anarazel.de
In response to Re: WIP: WAL prefetch (another approach)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: WIP: WAL prefetch (another approach)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

On 2021-04-28 19:24:53 -0400, Tom Lane wrote:
> But I happened to notice the accumulated CPU time for the background
> processes:
> 
> USER       PID  %CPU %MEM      VSZ    RSS   TT  STAT STARTED      TIME COMMAND
> tgl      19048   0.0  4.4   229952  92196   ??  Ss    3:19PM  19:59.19 postgres: startup recovering 000000010000001400000022
> tgl      19051   0.0  0.1   229656   1696   ??  Ss    3:19PM  27:09.14 postgres: walreceiver streaming 14/227D8F14
> tgl      19052   0.0  0.1   229904   2516   ??  Ss    3:19PM  17:38.17 postgres: walsender tgl [local] streaming 14/227D8F14
> 
> IOW, we've spent over twice as many CPU cycles shipping data to the
> standby as we did in applying the WAL on the standby.  Is this
> expected?  I've got wal_consistency_checking = all, which is bloating
> the WAL volume quite a bit, but still it seems like the walsender and
> walreceiver have little excuse for spending more cycles per byte
> than the startup process.

I don't really know how the CPU time accounting works on macOS. Is there
a chance it includes time spent doing IO? On the primary the WAL IO is
done by a lot of backends, but on the standby it all goes through the
walreceiver. And the walreceiver does its fsyncs in a not particularly
efficient manner.
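
To give a sense of how much that pattern can cost, here's a standalone
sketch (my illustration, not walreceiver code; the file name, write size,
and write count are arbitrary) comparing an fsync after every write
against a single fsync over the same data:

/*
 * fsync_demo.c - standalone illustration, not PostgreSQL code.
 * Writes the same amount of data twice: once with an fsync after
 * every write, once with a single fsync at the end.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static double run(int nwrites, int sync_every_write)
{
    char buf[8192];
    struct timespec t0, t1;
    int fd = open("fsync_demo.tmp", O_CREAT | O_TRUNC | O_WRONLY, 0600);

    memset(buf, 'x', sizeof(buf));
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (int i = 0; i < nwrites; i++)
    {
        if (write(fd, buf, sizeof(buf)) != (ssize_t) sizeof(buf))
            perror("write");
        if (sync_every_write)
            fsync(fd);
    }
    fsync(fd);                  /* make the final state durable either way */
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);
    unlink("fsync_demo.tmp");
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    printf("fsync per write: %.3fs\n", run(1000, 1));
    printf("single fsync:    %.3fs\n", run(1000, 0));
    return 0;
}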

FWIW, on my linux workstation no such difference is visible:
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
andres   2910540  9.4  0.0 2237252 126680 ?      Ss   16:55   0:20 postgres: dev assert standby: startup recovering 00000001000000020000003F
andres   2910544  5.2  0.0 2236724 9260 ?        Ss   16:55   0:11 postgres: dev assert standby: walreceiver streaming 2/3FDCF118
andres   2910545  2.1  0.0 2237036 10672 ?       Ss   16:55   0:04 postgres: dev assert: walsender andres [local] streaming 2/3FDCF118

> (This is testing b3ee4c503, so if Thomas' WAL changes improved
> efficiency of the replay process at all, the discrepancy could be
> even worse in HEAD.)

The prefetching isn't enabled by default, so I'd not expect meaningful
differences... And even with prefetching enabled, our normal regression
tests are largely resident in shared_buffers, so there shouldn't be much
prefetching anyway.


Oh! I was about to ask how much shared_buffers your primary / standby
have. And I think I may actually have reproduced a variant of the issue!

I previously had played around with different settings that I thought
might increase the likelihood of reproducing the problem. But this time
I set shared_buffers lower than before, and got:

2021-04-28 17:03:22.174 PDT [2913840][] LOG:  database system was shut down in recovery at 2021-04-28 17:03:11 PDT
2021-04-28 17:03:22.174 PDT [2913840][] LOG:  entering standby mode
2021-04-28 17:03:22.178 PDT [2913840][1/0] LOG:  redo starts at 2/416C6278
2021-04-28 17:03:37.628 PDT [2913840][1/0] LOG:  consistent recovery state reached at 4/7F5C3200
2021-04-28 17:03:37.628 PDT [2913840][1/0] FATAL:  invalid memory alloc request size 3053455757
2021-04-28 17:03:37.628 PDT [2913839][] LOG:  database system is ready to accept read only connections
2021-04-28 17:03:37.636 PDT [2913839][] LOG:  startup process (PID 2913840) exited with exit code 1

This reproduces across restarts. Yay, I guess.
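
For reference, that FATAL comes from PostgreSQL's allocation-size sanity
check: any plain palloc request above MaxAllocSize (0x3fffffff, just
under 1GB) is rejected, and 3053455757 is roughly 2.8GB. A minimal
standalone illustration, with the constants mirrored from
src/include/utils/memutils.h:

#include <stdio.h>
#include <stddef.h>

typedef size_t Size;

/* mirrored from src/include/utils/memutils.h */
#define MaxAllocSize    ((Size) 0x3fffffff)     /* 1 gigabyte - 1 */
#define AllocSizeIsValid(size)  ((Size) (size) <= MaxAllocSize)

int main(void)
{
    Size request = (Size) 3053455757UL;  /* the size from the FATAL above */

    if (!AllocSizeIsValid(request))
        fprintf(stderr, "invalid memory alloc request size %zu\n",
                (size_t) request);
    return 0;
}

So whatever length the startup process computed there is far beyond the
1GB cap that ordinary palloc allows.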

Isn't it odd that we get a "database system is ready to accept read only
connections" even though the startup process has just hit that FATAL?

Greetings,

Andres Freund


