Thread: ERROR: invalid spinlock number: 0
Hi, Commit 3b733fcd04 caused the buildfarm member "rorqual" to report the following error and fail the regression test. The cause of this issue is a bug in that commit. ERROR: invalid spinlock number: 0 But while investigating the issue, I found that this error could happen even in the current master (without commit 3b733fcd04). The error can be easily reproduced by reading pg_stat_wal_receiver view before walreceiver starts up, in the server built with --disable-atomics --disable-spinlocks. Furthermore if you try to read pg_stat_wal_receiver again, that gets stuck. This is not good. ISTM that the commit 2c8dd05d6c caused this issue. The commit changed pg_stat_get_wal_receiver() so that it reads "writtenUpto" by using pg_atomic_read_u64(). But since "writtenUpto" is initialized only when walreceiver starts up, reading "writtenUpto" before the startup of walreceiver can cause the error. Also pg_stat_get_wal_receiver() calls pg_atomic_read_u64() while a spinlock is being held. Maybe this may cause the process to get stuck in --disable-atomics case, I guess. Thought? Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi Fujii-san, On Tue, Feb 09, 2021 at 11:17:04PM +0900, Fujii Masao wrote: > ISTM that the commit 2c8dd05d6c caused this issue. The commit changed > pg_stat_get_wal_receiver() so that it reads "writtenUpto" by using > pg_atomic_read_u64(). But since "writtenUpto" is initialized only when > walreceiver starts up, reading "writtenUpto" before the startup of > walreceiver can cause the error. Indeed, that's a problem. We should at least move that out of the spin lock area. I'll try to look at that in details, and that's going to take me a couple of days at least. Sorry for the delay. -- Michael
Attachment
On 2021/02/11 21:55, Michael Paquier wrote: > Hi Fujii-san, > > On Tue, Feb 09, 2021 at 11:17:04PM +0900, Fujii Masao wrote: >> ISTM that the commit 2c8dd05d6c caused this issue. The commit changed >> pg_stat_get_wal_receiver() so that it reads "writtenUpto" by using >> pg_atomic_read_u64(). But since "writtenUpto" is initialized only when >> walreceiver starts up, reading "writtenUpto" before the startup of >> walreceiver can cause the error. > > Indeed, that's a problem. We should at least move that out of the > spin lock area. Yes, so what about the attached patch? We didn't notice this issue long time because no regression test checks pg_stat_wal_receiver. So I included such test in the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On Thu, Feb 11, 2021 at 11:30:13PM +0900, Fujii Masao wrote: > Yes, so what about the attached patch? I see. So the first error triggering the spinlock error would cause a transaction failure because the fallback implementation of atomics uses a spinlock for this variable, and it may not initialized in this code path. > We didn't notice this issue long time because no regression test checks > pg_stat_wal_receiver. So I included such test in the patch. Moving that behind ready_to_display is fine by me seeing where the initialization is done. The test case is a good addition. + * Read "writtenUpto" without holding a spinlock. So it may not be + * consistent with other WAL receiver's shared variables protected by a + * spinlock. This is OK because that variable is used only for + * informational purpose and should not be used for data integrity checks. It seems to me that the first two sentences of this comment should be combined together. -- Michael
Attachment
On Mon, Feb 15, 2021 at 9:27 PM Michael Paquier <michael@paquier.xyz> wrote: > On Thu, Feb 11, 2021 at 11:30:13PM +0900, Fujii Masao wrote: > > Yes, so what about the attached patch? > > I see. So the first error triggering the spinlock error would cause > a transaction failure because the fallback implementation of atomics > uses a spinlock for this variable, and it may not initialized in this > code path. Why not initialise it in WalRcvShmemInit()?
On Mon, Feb 15, 2021 at 10:47:05PM +1300, Thomas Munro wrote: > Why not initialise it in WalRcvShmemInit()? I was thinking about doing that as well, but we have no real need to initialize this stuff in most cases, say standalone deployments. In particular for the fallback implementation of atomics, we would prepare a spinlock for nothing. -- Michael
Attachment
On 2021/02/15 19:45, Michael Paquier wrote: > On Mon, Feb 15, 2021 at 10:47:05PM +1300, Thomas Munro wrote: >> Why not initialise it in WalRcvShmemInit()? > > I was thinking about doing that as well, but we have no real need to > initialize this stuff in most cases, say standalone deployments. In > particular for the fallback implementation of atomics, we would > prepare a spinlock for nothing. But on second thought, if we make WalRceiverMain() call pg_atomic_init_u64(), the variable is initialized (i,e., SpinLockInit() is called in --disable-atomics) every time walreceiver is started. That may be problematic? If so, the variable needs to be initialized in WalRcvShmemInit(), instead. BTW, the recent commit 46d6e5f567 has the similar issue. The variable that commit added is initialized in InitProcess(), but maybe should be done in InitProcGlobal() or elsewhere. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Hi, On 2021-02-15 19:45:21 +0900, Michael Paquier wrote: > On Mon, Feb 15, 2021 at 10:47:05PM +1300, Thomas Munro wrote: > > Why not initialise it in WalRcvShmemInit()? > > I was thinking about doing that as well, but we have no real need to > initialize this stuff in most cases, say standalone deployments. In > particular for the fallback implementation of atomics, we would > prepare a spinlock for nothing. So what? It's just about free to initialize a spinlock, whether it's using the fallback implementation or not. Initializing upon walsender startup adds a lot of complications, because e.g. somebody could already hold the spinlock because the previous walsender just disconnected, and they were looking at the stats. Greetings, Andres Freund
On 2021/02/16 6:28, Andres Freund wrote: > Hi, > > On 2021-02-15 19:45:21 +0900, Michael Paquier wrote: >> On Mon, Feb 15, 2021 at 10:47:05PM +1300, Thomas Munro wrote: >>> Why not initialise it in WalRcvShmemInit()? >> >> I was thinking about doing that as well, but we have no real need to >> initialize this stuff in most cases, say standalone deployments. In >> particular for the fallback implementation of atomics, we would >> prepare a spinlock for nothing. > > So what? It's just about free to initialize a spinlock, whether it's > using the fallback implementation or not. Initializing upon walsender > startup adds a lot of complications, because e.g. somebody could already > hold the spinlock because the previous walsender just disconnected, and > they were looking at the stats. Even if we initialize "writtenUpto" in WalRcvShmemInit(), WalReceiverMain() still needs to initialize (reset to 0) by using pg_atomic_write_u64(). Basically we should not acquire new spinlock while holding another spinlock, to shorten the spinlock duration. Right? If yes, we need to move pg_atomic_read_u64() of "writtenUpto" after the release of spinlock in pg_stat_get_wal_receiver. Attached is the updated version of the patch. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On Tue, Feb 16, 2021 at 12:43:42PM +0900, Fujii Masao wrote: > On 2021/02/16 6:28, Andres Freund wrote: >> So what? It's just about free to initialize a spinlock, whether it's >> using the fallback implementation or not. Initializing upon walsender >> startup adds a lot of complications, because e.g. somebody could already >> hold the spinlock because the previous walsender just disconnected, and >> they were looking at the stats. Okay. > Even if we initialize "writtenUpto" in WalRcvShmemInit(), WalReceiverMain() > still needs to initialize (reset to 0) by using pg_atomic_write_u64(). Yes, you have to do that. > Basically we should not acquire new spinlock while holding another spinlock, > to shorten the spinlock duration. Right? If yes, we need to move > pg_atomic_read_u64() of "writtenUpto" after the release of spinlock in > pg_stat_get_wal_receiver. It would not matter much as a NULL tuple is returned as long as the WAL receiver information is not ready to be displayed. The only reason why all the fields are read before checking for ready_to_display is that we can be sure that everything is consistent with the PID. So reading writtenUpto before or after does not really matter logically. I would just move it after the check, as you did previously. + /* + * Read "writtenUpto" without holding a spinlock. So it may not be + * consistent with other WAL receiver's shared variables protected by a + * spinlock. This is OK because that variable is used only for + * informational purpose and should not be used for data integrity checks. + */ What about the following? "Read "writtenUpto" without holding a spinlock. Note that it may not be consistent with the other shared variables of the WAL receiver protected by a spinlock, but this should not be used for data integrity checks." I agree that what has been done with MyProc->waitStart in 46d6e5f is not safe, and that initialization should happen once at postmaster startup, with a write(0) when starting the backend. There are two of them in proc.c, one in twophase.c. Do you mind if I add an open item for this one? -- Michael
Attachment
On 2021/02/16 15:50, Michael Paquier wrote: > On Tue, Feb 16, 2021 at 12:43:42PM +0900, Fujii Masao wrote: >> On 2021/02/16 6:28, Andres Freund wrote: >>> So what? It's just about free to initialize a spinlock, whether it's >>> using the fallback implementation or not. Initializing upon walsender >>> startup adds a lot of complications, because e.g. somebody could already >>> hold the spinlock because the previous walsender just disconnected, and >>> they were looking at the stats. > > Okay. > >> Even if we initialize "writtenUpto" in WalRcvShmemInit(), WalReceiverMain() >> still needs to initialize (reset to 0) by using pg_atomic_write_u64(). > > Yes, you have to do that. > >> Basically we should not acquire new spinlock while holding another spinlock, >> to shorten the spinlock duration. Right? If yes, we need to move >> pg_atomic_read_u64() of "writtenUpto" after the release of spinlock in >> pg_stat_get_wal_receiver. > > It would not matter much as a NULL tuple is returned as long as the > WAL receiver information is not ready to be displayed. The only > reason why all the fields are read before checking for > ready_to_display is that we can be sure that everything is consistent > with the PID. So reading writtenUpto before or after does not really > matter logically. I would just move it after the check, as you did > previously. OK. > > + /* > + * Read "writtenUpto" without holding a spinlock. So it may not be > + * consistent with other WAL receiver's shared variables protected by a > + * spinlock. This is OK because that variable is used only for > + * informational purpose and should not be used for data integrity checks. > + */ > What about the following? > "Read "writtenUpto" without holding a spinlock. Note that it may not > be consistent with the other shared variables of the WAL receiver > protected by a spinlock, but this should not be used for data > integrity checks." Sounds good. Attached is the updated version of the patch. > > I agree that what has been done with MyProc->waitStart in 46d6e5f is > not safe, and that initialization should happen once at postmaster > startup, with a write(0) when starting the backend. There are two of > them in proc.c, one in twophase.c. Do you mind if I add an open item > for this one? Yeah, please feel free to do that! BTW, I already posted the patch addressing that issue, at [1]. [1] https://postgr.es/m/1df88660-6f08-cc6e-b7e2-f85296a2bdab@oss.nttdata.com Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On Tue, Feb 16, 2021 at 11:47:52PM +0900, Fujii Masao wrote: > On 2021/02/16 15:50, Michael Paquier wrote: >> + /* >> + * Read "writtenUpto" without holding a spinlock. So it may not be >> + * consistent with other WAL receiver's shared variables protected by a >> + * spinlock. This is OK because that variable is used only for >> + * informational purpose and should not be used for data integrity checks. >> + */ >> What about the following? >> "Read "writtenUpto" without holding a spinlock. Note that it may not >> be consistent with the other shared variables of the WAL receiver >> protected by a spinlock, but this should not be used for data >> integrity checks." > > Sounds good. Attached is the updated version of the patch. Thanks, looks good to me. >> I agree that what has been done with MyProc->waitStart in 46d6e5f is >> not safe, and that initialization should happen once at postmaster >> startup, with a write(0) when starting the backend. There are two of >> them in proc.c, one in twophase.c. Do you mind if I add an open item >> for this one? > > Yeah, please feel free to do that! BTW, I already posted the patch > addressing that issue, at [1]. Okay, item added with a link to the original thread. -- Michael
Attachment
On 2021/02/17 13:52, Michael Paquier wrote: > On Tue, Feb 16, 2021 at 11:47:52PM +0900, Fujii Masao wrote: >> On 2021/02/16 15:50, Michael Paquier wrote: >>> + /* >>> + * Read "writtenUpto" without holding a spinlock. So it may not be >>> + * consistent with other WAL receiver's shared variables protected by a >>> + * spinlock. This is OK because that variable is used only for >>> + * informational purpose and should not be used for data integrity checks. >>> + */ >>> What about the following? >>> "Read "writtenUpto" without holding a spinlock. Note that it may not >>> be consistent with the other shared variables of the WAL receiver >>> protected by a spinlock, but this should not be used for data >>> integrity checks." >> >> Sounds good. Attached is the updated version of the patch. > > Thanks, looks good to me. Pushed. Thanks! > >>> I agree that what has been done with MyProc->waitStart in 46d6e5f is >>> not safe, and that initialization should happen once at postmaster >>> startup, with a write(0) when starting the backend. There are two of >>> them in proc.c, one in twophase.c. Do you mind if I add an open item >>> for this one? >> >> Yeah, please feel free to do that! BTW, I already posted the patch >> addressing that issue, at [1]. > > Okay, item added with a link to the original thread. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION