On Mon, Jan 30, 2023 at 05:47:49PM +0000, Mok wrote:
> Hi,
>
> We've started to observe instances of one of our databases stalling for a
> few seconds.
>
> We see a spike in wal write locks then nothing for a few seconds. After
> which we have spike latency as processes waiting to get to the db can do
> so.
>
> There is nothing in the postgres logs that give us any clues to what could
> be happening, no locks, unusually high/long running transactions, just a
> pause and resume.
>
> Could anyone give me any advice as to what to look for when it comes to
> checking the underlying disk that the db is on?

Which version of postgres? Which settings have non-default values? What OS and version? What environment and hardware (VM, image, provider, ...)?
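
In case it helps, here is a minimal sketch of how the version and the non-default settings can be collected from psql. Excluding the 'override' source along with 'default' is just a common convention for trimming noise, not something required; adjust the filter as needed.

```sql
-- Server version string (also shows the build platform).
SELECT version();

-- Settings whose value does not come from the built-in default.
-- Filtering out 'override' as well is an assumption to reduce noise.
SELECT name, setting, unit, source
FROM pg_settings
WHERE source NOT IN ('default', 'override')
ORDER BY name;
```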
Have you enabled logging for vacuum, checkpoints, and locks?
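
For reference, a sketch of one way to turn that logging on without a restart, assuming superuser access and a server new enough to support ALTER SYSTEM (9.4 or later). The threshold of 0 for autovacuum is an illustrative choice that logs every autovacuum run; a higher value may be more appropriate on a busy system.

```sql
-- Log every checkpoint, including how long the write and sync phases took.
ALTER SYSTEM SET log_checkpoints = on;

-- Log sessions that wait on a lock for longer than deadlock_timeout.
ALTER SYSTEM SET log_lock_waits = on;

-- Log all autovacuum actions (0 = no minimum duration; assumed value).
ALTER SYSTEM SET log_autovacuum_min_duration = 0;

-- Apply the changes; these settings do not need a restart.
SELECT pg_reload_conf();
```

If the stalls line up with checkpoint entries in the log, that would point towards checkpoint I/O rather than locking.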
In addition to the previous questions, if possible, run SELECT * FROM pg_stat_activity at the moment of the stall. The most important information is the wait_event column. My guess is the disk, but a snapshot taken at the right moment can confirm that.
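
A narrower variant of that query, as a sketch only: it assumes PostgreSQL 9.6 or later (where wait_event_type and wait_event exist), and the column selection and the 60-character truncation are arbitrary choices for readability.

```sql
-- Snapshot of what each non-idle process is waiting on right now.
-- IS DISTINCT FROM keeps background processes, whose state is NULL.
SELECT now() AS sampled_at,
       pid,
       state,
       wait_event_type,
       wait_event,
       now() - query_start AS query_age,
       left(query, 60) AS query
FROM pg_stat_activity
WHERE state IS DISTINCT FROM 'idle'
ORDER BY query_start;
```

Since the stall only lasts a few seconds, it may be easier to catch by leaving this running in a psql session and repeating it with \watch 1, then looking at the samples taken during the pause.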