Thread: "ERROR: latch already owned" on gharial
Hi,

A couple of recent isolation test failures reported $SUBJECT. It could be a bug in recent-ish latch refactoring work, though I don't know why it would show up twice just recently.

Just BTW, that animal has shown signs of a flaky toolchain before[1]. I know we have quite a lot of museum exhibits in the 'farm, in terms of hardware, OS, and tool chain. In some cases, they're probably just forgotten/not on anyone's upgrade radar. If they've shown signs of misbehaving, maybe it's time to figure out if they can be upgraded? For example, it'd be nice to be able to rule out problems in GCC 4.6.0 (that's like running PostgreSQL 9.1.0, in terms of vintage, unsupported status, and long list of missing bugfixes from the time when it was supported).

[1] https://www.postgresql.org/message-id/CA+hUKGJK5R0S1LL_W4vEzKxNQGY_xGAQ1XknR-WN9jqQeQtB_w@mail.gmail.com
Hi,

On 2022-05-25 12:45:21 +1200, Thomas Munro wrote:
> A couple of recent isolation test failures reported $SUBJECT.

Was that just on gharial?

> It could be a bug in recent-ish latch refactoring work, though I don't
> know why it would show up twice just recently.

Yea, that's weird.

> Just BTW, that animal has shown signs of a flaky toolchain before[1].
> I know we have quite a lot of museum exhibits in the 'farm, in terms
> of hardware, OS, and tool chain. In some cases, they're probably just
> forgotten/not on anyone's upgrade radar. If they've shown signs of
> misbehaving, maybe it's time to figure out if they can be upgraded?
> For example, it'd be nice to be able to rule out problems in GCC 4.6.0
> (that's like running PostgreSQL 9.1.0, in terms of vintage,
> unsupported status, and long list of missing bugfixes from the time
> when it was supported).

Yea. gcc 4.6.0 is pretty ridiculous - the only thing we gain by testing with a .0 compiler of that vintage is pain. Could it be upgraded?

TBH, I think we should just desupport HPUX. It's makework to support it at this point. 11.31 v3 is about to be old enough to drink in quite a few countries...

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> On 2022-05-25 12:45:21 +1200, Thomas Munro wrote:
>> I know we have quite a lot of museum exhibits in the 'farm, in terms
>> of hardware, OS, and tool chain. In some cases, they're probably just
>> forgotten/not on anyone's upgrade radar. If they've shown signs of
>> misbehaving, maybe it's time to figure out if they can be upgraded?

> TBH, I think we should just desupport HPUX.

I think there's going to be a significant die-off of old BF animals when (if?) we convert over to the meson build system; it's just not going to be worth the trouble to upgrade those platforms to be able to run meson and ninja. I'm inclined to wait until that's over and see what's still standing before we make decisions about officially desupporting things.

			regards, tom lane
On Tue, May 24, 2022 at 06:24:39PM -0700, Andres Freund wrote:
> On 2022-05-25 12:45:21 +1200, Thomas Munro wrote:
> > Just BTW, that animal has shown signs of a flaky toolchain before[1].
> > I know we have quite a lot of museum exhibits in the 'farm, in terms
> > of hardware, OS, and tool chain. In some cases, they're probably just
> > forgotten/not on anyone's upgrade radar. If they've shown signs of
> > misbehaving, maybe it's time to figure out if they can be upgraded?
> > For example, it'd be nice to be able to rule out problems in GCC 4.6.0
> > (that's like running PostgreSQL 9.1.0, in terms of vintage,
> > unsupported status, and long list of missing bugfixes from the time
> > when it was supported).
>
> Yea. gcc 4.6.0 is pretty ridiculous - the only thing we gain by testing with a
> .0 compiler of that vintage is pain. Could it be upgraded?

+1, this is at least the third non-obvious miscompilation from gharial. Installing the latest GCC that builds easily (perhaps GCC 10.3) would make this a good buildfarm member again. If that won't happen, at least add a note to the animal like described in https://postgr.es/m/20211109144021.GD940092@rfd.leadboat.com
Noah Misch <noah@leadboat.com> writes:
> +1, this is at least the third non-obvious miscompilation from gharial.

Is there any evidence that this is a compiler-sourced problem? Maybe it is, but it's sure not obvious to me (he says, eyeing his buildfarm animals with even older gcc versions).

			regards, tom lane
On Thu, May 26, 2022 at 2:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Noah Misch <noah@leadboat.com> writes:
> > +1, this is at least the third non-obvious miscompilation from gharial.
>
> Is there any evidence that this is a compiler-sourced problem?
> Maybe it is, but it's sure not obvious to me (he says, eyeing his
> buildfarm animals with even older gcc versions).

Sorry for the ambiguity -- I have no evidence of miscompilation. My "just BTW" paragraph was a reaction to the memory of the last couple of times Noah and I wasted hours chasing red herrings on this system, which is pretty demotivating when looking into an unexplained failure.

On a more practical note, I don't have access to the BF database right now. Would you mind checking if "latch already owned" has occurred on any other animals?
Thomas Munro <thomas.munro@gmail.com> writes:
> Sorry for the ambiguity -- I have no evidence of miscompilation. My
> "just BTW" paragraph was a reaction to the memory of the last couple
> of times Noah and I wasted hours chasing red herrings on this system,
> which is pretty demotivating when looking into an unexplained failure.

I can't deny that those HPUX animals have produced more than their fair share of problems.

> On a more practical note, I don't have access to the BF database right
> now. Would you mind checking if "latch already owned" has occurred on
> any other animals?

Looking back 6 months, these are the only occurrences of that string in failed tests:

 sysname | branch |      snapshot       |     stage      |                                 l
---------+--------+---------------------+----------------+-------------------------------------------------------------------
 gharial | HEAD   | 2022-04-28 23:37:51 | Check          | 2022-04-28 18:36:26.981 MDT [22642:1] ERROR:  latch already owned
 gharial | HEAD   | 2022-05-06 11:33:11 | IsolationCheck | 2022-05-06 10:10:52.727 MDT [7366:1] ERROR:  latch already owned
 gharial | HEAD   | 2022-05-24 06:31:31 | IsolationCheck | 2022-05-24 02:44:51.850 MDT [13089:1] ERROR:  latch already owned
(3 rows)

			regards, tom lane
On Thu, May 26, 2022 at 2:35 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On a more practical note, I don't have access to the BF database right
> > now. Would you mind checking if "latch already owned" has occurred on
> > any other animals?
>
> Looking back 6 months, these are the only occurrences of that string
> in failed tests:
>
>  sysname | branch |      snapshot       |     stage      |                                 l
> ---------+--------+---------------------+----------------+-------------------------------------------------------------------
>  gharial | HEAD   | 2022-04-28 23:37:51 | Check          | 2022-04-28 18:36:26.981 MDT [22642:1] ERROR:  latch already owned
>  gharial | HEAD   | 2022-05-06 11:33:11 | IsolationCheck | 2022-05-06 10:10:52.727 MDT [7366:1] ERROR:  latch already owned
>  gharial | HEAD   | 2022-05-24 06:31:31 | IsolationCheck | 2022-05-24 02:44:51.850 MDT [13089:1] ERROR:  latch already owned
> (3 rows)

Thanks. Hmm. So far it's always a parallel worker. The best idea I have is to include the ID of the mystery PID in the error message and see if that provides a clue next time.
On Fri, May 27, 2022 at 7:55 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> Thanks. Hmm. So far it's always a parallel worker. The best idea I
> have is to include the ID of the mystery PID in the error message and
> see if that provides a clue next time.

What I'm inclined to do is get gharial and anole removed from the buildfarm. anole was set up by Heikki in 2011. I don't know when gharial was set up, or by whom. I don't think anyone at EDB cares about these machines any more, or has any interest in maintaining them. I think the only reason they're still running is that, just by good fortune, they haven't fallen over and died yet. The hardest part of getting them taken out of the buildfarm is likely to be finding someone who has a working username and password to log into them and take the jobs out of the crontab.

If someone really cares about figuring out what's going on here, it's probably possible to get someone who is an EDB employee access to the box to chase it down. But I'm having a hard time understanding what value we get out of that given that the machines are running an 11-year-old compiler version on discontinued hardware on a discontinued operating system. Even if we find a bug in PostgreSQL, it's likely to be a bug that only matters on systems nobody cares about.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, May 27, 2022 at 7:55 AM Thomas Munro <thomas.munro@gmail.com> wrote:
>> Thanks. Hmm. So far it's always a parallel worker. The best idea I
>> have is to include the ID of the mystery PID in the error message and
>> see if that provides a clue next time.

> ... Even if we find a bug in PostgreSQL,
> it's likely to be a bug that only matters on systems nobody cares
> about.

That's possible, certainly. It's also possible that it's a real bug that so far has only manifested there for (say) timing reasons. The buildfarm is not so large that we can write off single-machine failures as being unlikely to hit in the real world.

What I'd suggest is to promote that failure to elog(PANIC), which would at least give us the PID and if we're lucky a stack trace.

			regards, tom lane
On Fri, May 27, 2022 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> That's possible, certainly. It's also possible that it's a real bug
> that so far has only manifested there for (say) timing reasons.
> The buildfarm is not so large that we can write off single-machine
> failures as being unlikely to hit in the real world.
>
> What I'd suggest is to promote that failure to elog(PANIC), which
> would at least give us the PID and if we're lucky a stack trace.

That proposed change is fine with me.

As to the question of whether it's a real bug, nobody can prove anything unless we actually run it down. It's just a question of what you think the odds are. Noah's PGCon talk a few years back on the long tail of buildfarm failures convinced me (perhaps unintentionally) that low-probability failures that occur only on obscure systems or configurations are likely not worth running down, because while they COULD be real bugs, a lot of them aren't, and the time it would take to figure it out could be spent on other things - for instance, fixing things that we know for certain are bugs. Spending 40 hours of person-time on something with a 10% chance of being a bug in the PostgreSQL code doesn't necessarily make sense to me, because while you are correct that the buildfarm isn't that large, neither is the developer community.

--
Robert Haas
EDB: http://www.enterprisedb.com
Robert Haas <robertmhaas@gmail.com> writes:
> On Fri, May 27, 2022 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> What I'd suggest is to promote that failure to elog(PANIC), which
>> would at least give us the PID and if we're lucky a stack trace.

> That proposed change is fine with me.

> As to the question of whether it's a real bug, nobody can prove
> anything unless we actually run it down.

Agreed, and I'll even grant your point that if it is an HPUX-specific or IA64-specific bug, it is not worth spending huge amounts of time to isolate. The problem is that we don't know that. What we do know so far is that if it can occur elsewhere, it's rare --- so we'd better be prepared to glean as much info as possible if we do get such a failure. Hence my thought of s/ERROR/PANIC/. And I'd be in favor of any other low-effort change we can make to instrument the case better.

			regards, tom lane
On Sat, May 28, 2022 at 8:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
> > On Fri, May 27, 2022 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> What I'd suggest is to promote that failure to elog(PANIC), which
> >> would at least give us the PID and if we're lucky a stack trace.
>
> > That proposed change is fine with me.
>
> > As to the question of whether it's a real bug, nobody can prove
> > anything unless we actually run it down.
>
> Agreed, and I'll even grant your point that if it is an HPUX-specific
> or IA64-specific bug, it is not worth spending huge amounts of time
> to isolate. The problem is that we don't know that. What we do know
> so far is that if it can occur elsewhere, it's rare --- so we'd better
> be prepared to glean as much info as possible if we do get such a
> failure. Hence my thought of s/ERROR/PANIC/. And I'd be in favor of
> any other low-effort change we can make to instrument the case better.

OK, pushed (except I realised that all the PIDs involved were int, not pid_t). Let's see...
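[For readers following along: the pushed change amounts to a few lines in OwnLatch() in src/backend/storage/ipc/latch.c. The sketch below shows the rough shape; the variable name, message wording, and surrounding assertions are paraphrased from memory rather than quoted verbatim from the commit.]

    void
    OwnLatch(Latch *latch)
    {
        int         owner_pid;

        /* Sanity checks */
        Assert(latch->is_shared);

        /*
         * Read the owner into a local variable (a plain int, since
         * latch->owner_pid is an int) so that the value reported is the one
         * the failing check actually saw, and promote the failure to PANIC
         * so the buildfarm captures a core file / stack trace next time.
         */
        owner_pid = latch->owner_pid;
        if (owner_pid != 0)
            elog(PANIC, "latch already owned by PID %d", owner_pid);

        latch->owner_pid = MyProcPid;
    }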
On Sat, May 28, 2022 at 1:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> What I'm inclined to do is get gharial and anole removed from the
> buildfarm. anole was set up by Heikki in 2011. I don't know when
> gharial was set up, or by whom. I don't think anyone at EDB cares
> about these machines any more, or has any interest in maintaining
> them. I think the only reason they're still running is that, just by
> good fortune, they haven't fallen over and died yet. The hardest part
> of getting them taken out of the buildfarm is likely to be finding
> someone who has a working username and password to log into them and
> take the jobs out of the crontab.

FWIW, in a previous investigation, Semab and Sandeep had access:

https://www.postgresql.org/message-id/CABimMB4mRs9N3eivR-%3DqF9M8oWc5E6OX7GywsWF0DXN4P5gNEA%40mail.gmail.com
On Mon, May 30, 2022 at 8:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Sat, May 28, 2022 at 1:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > What I'm inclined to do is get gharial and anole removed from the
> > buildfarm. anole was set up by Heikki in 2011. I don't know when
> > gharial was set up, or by whom. I don't think anyone at EDB cares
> > about these machines any more, or has any interest in maintaining
> > them. I think the only reason they're still running is that, just by
> > good fortune, they haven't fallen over and died yet. The hardest part
> > of getting them taken out of the buildfarm is likely to be finding
> > someone who has a working username and password to log into them and
> > take the jobs out of the crontab.
>
> FWIW, in a previous investigation, Semab and Sandeep had access:
>
> https://www.postgresql.org/message-id/CABimMB4mRs9N3eivR-%3DqF9M8oWc5E6OX7GywsWF0DXN4P5gNEA%40mail.gmail.com

Yeah, I'm in touch with Sandeep but not able to get in yet for some reason. Will try to sort it out.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Tue, May 31, 2022 at 8:20 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, May 30, 2022 at 8:31 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Sat, May 28, 2022 at 1:56 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > > What I'm inclined to do is get gharial and anole removed from the
> > > buildfarm. anole was set up by Heikki in 2011. I don't know when
> > > gharial was set up, or by whom. I don't think anyone at EDB cares
> > > about these machines any more, or has any interest in maintaining
> > > them. I think the only reason they're still running is that, just by
> > > good fortune, they haven't fallen over and died yet. The hardest part
> > > of getting them taken out of the buildfarm is likely to be finding
> > > someone who has a working username and password to log into them and
> > > take the jobs out of the crontab.
> >
> > FWIW, in a previous investigation, Semab and Sandeep had access:
> >
> > https://www.postgresql.org/message-id/CABimMB4mRs9N3eivR-%3DqF9M8oWc5E6OX7GywsWF0DXN4P5gNEA%40mail.gmail.com
>
> Yeah, I'm in touch with Sandeep but not able to get in yet for some
> reason. Will try to sort it out.

OK, I have access to the box now. I guess I might as well leave the crontab jobs enabled until the next time this happens, since Thomas just took steps to improve the logging, but I do think these BF members are overdue to be killed off, and would like to do that as soon as it seems like a reasonable step to take.

--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, Jun 1, 2022 at 12:55 AM Robert Haas <robertmhaas@gmail.com> wrote:
> OK, I have access to the box now. I guess I might as well leave the
> crontab jobs enabled until the next time this happens, since Thomas
> just took steps to improve the logging, but I do think these BF
> members are overdue to be killed off, and would like to do that as
> soon as it seems like a reasonable step to take.

A couple of months later, there has been no repeat of that error. I'd happily forget about that and move on, if you want to decommission these.
On Sun, Jul 3, 2022 at 11:51 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Wed, Jun 1, 2022 at 12:55 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > OK, I have access to the box now. I guess I might as well leave the
> > crontab jobs enabled until the next time this happens, since Thomas
> > just took steps to improve the logging, but I do think these BF
> > members are overdue to be killed off, and would like to do that as
> > soon as it seems like a reasonable step to take.
>
> A couple of months later, there has been no repeat of that error. I'd
> happily forget about that and move on, if you want to decommission
> these.

I have commented out the BF stuff in crontab on that machine.

--
Robert Haas
EDB: http://www.enterprisedb.com
Thanks Robert.

We are receiving the alerts from buildfarm-admins for anole and gharial not reporting. Who can help to stop these? Thanks

On Wed, Jul 6, 2022 at 1:27 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Sun, Jul 3, 2022 at 11:51 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > On Wed, Jun 1, 2022 at 12:55 AM Robert Haas <robertmhaas@gmail.com> wrote:
> > > OK, I have access to the box now. I guess I might as well leave the
> > > crontab jobs enabled until the next time this happens, since Thomas
> > > just took steps to improve the logging, but I do think these BF
> > > members are overdue to be killed off, and would like to do that as
> > > soon as it seems like a reasonable step to take.
> >
> > A couple of months later, there has been no repeat of that error. I'd
> > happily forget about that and move on, if you want to decommission
> > these.
>
> I have commented out the BF stuff in crontab on that machine.
>
> --
> Robert Haas
> EDB: http://www.enterprisedb.com

--
Sandeep Thakkar
On 2022-Jul-13, Sandeep Thakkar wrote:
> Thanks Robert.
>
> We are receiving the alerts from buildfarm-admins for anole and gharial not
> reporting. Who can help to stop these? Thanks

Probably Andrew knows how to set buildsystems.no_alerts for these animals.

--
Álvaro Herrera        PostgreSQL Developer — https://www.EnterpriseDB.com/
"Man never knows what he is capable of until he tries" (C. Dickens)
Hey hackers,

I wanted to report that we have seen this issue (with the procLatch) a few times very sporadically on Greenplum 6X (based on 9.4), with relatively newer versions of GCC. I realize that 9.4 is out of support, so this email is purely to add on to the existing thread, in case the info can help fix/reveal something in supported versions.

Unfortunately, we don't have a core to share as we don't have the benefit of commit [1] in Greenplum 6X, but we do possess commit [2] which gives us an elog ERROR as opposed to PANIC.

Instance 1:

Event 1:
2023-11-13 10:01:31.927168 CET..., pY, ..."LOG","00000","disconnection: session time: ..."

Event 2:
2023-11-13 10:01:32.049135 CET...,pX,,,,,"FATAL","XX000","latch already owned by pid Y (is_set: 0) (pg_latch.c:159)",,,,,,,0,,"pg_latch.c",159,
"Stack trace:
1    0xbde8b8 postgres errstart (elog.c:567)
2    0xbe0768 postgres elog_finish (discriminator 7)
3    0xa08924 postgres <symbol not found> (pg_latch.c:158) <---------- OwnLatch
4    0xa7f179 postgres InitProcess (proc.c:523)
5    0xa94ac3 postgres PostgresMain (postgres.c:4874)
6    0xa1e2ed postgres <symbol not found> (postmaster.c:2860)
7    0xa1f295 postgres PostmasterMain (discriminator 5)
...
"LOG","00000","server process (PID Y) exited with exit code 1",,,,,,,0,,"postmaster.c",3987,

Instance 2 (was reported with (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20)):

Exactly the same as Instance 1 with identical log, ordering of events and stack trace, except this time (is_set: 1) when the ERROR is logged.

A possible ordering of events:

(1) DisownLatch() is called by pid Y during ProcKill() and the write for latch->owner_pid = 0 is NOT yet flushed to shmem.

(2) The PGPROC object for pid Y is returned to the free list.

(3) Pid X sees the same PGPROC object on the free list and grabs it.

(4) Pid X does sanity check inside OwnLatch during InitProcess and still sees the old value of latch->owner_pid = Y (and not = 0), and trips the ERROR.

The above sequence of operations should apply to PG HEAD as well.

Suggestion:

Should we do a pg_memory_barrier() at the end of DisownLatch(), like in ResetLatch(), like the one introduced in [3]? This would ensure that the write latch->owner_pid = 0; is flushed to shmem. The attached patch does this.

I'm not sure why we didn't introduce a memory barrier in DisownLatch() in [3]. I didn't find anything in the associated hackers thread [4] either. Was it the performance impact, or was it just because SetLatch and ResetLatch were more racy and this is way less likely to happen? This is out of my wheelhouse, but would one additional barrier in a process' lifecycle be that bad for performance?

Appendix:

Build details: (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20)

CFLAGS=-Wall -Wmissing-prototypes -Wpointer-arith -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -fno-aggressive-loop-optimizations -Wno-unused-but-set-variable -Wno-address -Werror=implicit-fallthrough=3 -Wno-format-truncation -Wno-stringop-truncation -m64 -O3 -fargument-noalias-global -fno-omit-frame-pointer -g -std=gnu99 -Werror=uninitialized -Werror=implicit-function-declaration

Regards,
Soumyadeep (VMware)

[1] https://github.com/postgres/postgres/commit/12e28aac8e8eb76cab13a4e9b696e3dab17f1c99
[2] https://github.com/greenplum-db/gpdb/commit/81fdd6c5219af865e9dc41f4087e0405d6616050
[3] https://github.com/postgres/postgres/commit/14e8803f101a54d99600683543b0f893a2e3f529
[4] https://www.postgresql.org/message-id/flat/20150112154026.GB2092%40awork2.anarazel.de
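[The attached patch is not reproduced in the archive, but from the description above it amounts to one new line at the end of DisownLatch() in src/backend/storage/ipc/latch.c. A sketch, with the pre-existing function body paraphrased from memory:]

    void
    DisownLatch(Latch *latch)
    {
        Assert(latch->is_shared);
        Assert(latch->owner_pid == MyProcPid);

        latch->owner_pid = 0;

        /*
         * Proposed addition: make the cleared owner_pid visible to any
         * backend that subsequently recycles this PGPROC, mirroring the
         * barrier that ResetLatch() uses for is_set.
         */
        pg_memory_barrier();
    }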
On 08/02/2024 04:08, Soumyadeep Chakraborty wrote:
> A possible ordering of events:
>
> (1) DisownLatch() is called by pid Y during ProcKill() and the write for
> latch->owner_pid = 0 is NOT yet flushed to shmem.
>
> (2) The PGPROC object for pid Y is returned to the free list.
>
> (3) Pid X sees the same PGPROC object on the free list and grabs it.
>
> (4) Pid X does sanity check inside OwnLatch during InitProcess and
> still sees the old value of latch->owner_pid = Y (and not = 0), and
> trips the ERROR.
>
> The above sequence of operations should apply to PG HEAD as well.
>
> Suggestion:
>
> Should we do a pg_memory_barrier() at the end of DisownLatch(), like in
> ResetLatch(), like the one introduced in [3]? This would ensure that the
> write latch->owner_pid = 0; is flushed to shmem. The attached patch does
> this.

Hmm, there is a pair of SpinLockAcquire() and SpinLockRelease() in ProcKill(), before step 3 can happen. Comment in spin.h about SpinLockAcquire/Release:

> * Load and store operations in calling code are guaranteed not to be
> * reordered with respect to these operations, because they include a
> * compiler barrier.  (Before PostgreSQL 9.5, callers needed to use a
> * volatile qualifier to access data protected by spinlocks.)

That talks about a compiler barrier, though, not a memory barrier. But looking at the implementations in s_lock.h, I believe they do act as memory barrier, too.

So you might indeed have that problem on 9.4, but AFAICS not on later versions.

--
Heikki Linnakangas
Neon (https://neon.tech)
Hi,

On 2024-02-08 14:57:47 +0200, Heikki Linnakangas wrote:
> On 08/02/2024 04:08, Soumyadeep Chakraborty wrote:
> > A possible ordering of events:
> >
> > (1) DisownLatch() is called by pid Y during ProcKill() and the write for
> > latch->owner_pid = 0 is NOT yet flushed to shmem.
> >
> > (2) The PGPROC object for pid Y is returned to the free list.
> >
> > (3) Pid X sees the same PGPROC object on the free list and grabs it.
> >
> > (4) Pid X does sanity check inside OwnLatch during InitProcess and
> > still sees the old value of latch->owner_pid = Y (and not = 0), and
> > trips the ERROR.
> >
> > The above sequence of operations should apply to PG HEAD as well.
> >
> > Suggestion:
> >
> > Should we do a pg_memory_barrier() at the end of DisownLatch(), like in
> > ResetLatch(), like the one introduced in [3]? This would ensure that the
> > write latch->owner_pid = 0; is flushed to shmem. The attached patch does
> > this.
>
> Hmm, there is a pair of SpinLockAcquire() and SpinLockRelease() in
> ProcKill(), before step 3 can happen.

Right. I wonder if the issue instead could be something similar to what was fixed in 8fb13dd6ab5b and more generally in 97550c0711972a. If two procs go through proc_exit() for the same process, you can get all kinds of weird mixed up resource ownership. The bug fixed in 8fb13dd6ab5b wouldn't apply, but it's pretty easy to introduce similar bugs in other places, so it seems quite plausible that greenplum might have done so. We also did have more proc_exit()s in signal handlers in older branches, so it might just be an issue that also was present before.

Greetings,

Andres Freund
Hey,

Deeply appreciate both your input!

On Thu, Feb 8, 2024 at 4:57 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
> Hmm, there is a pair of SpinLockAcquire() and SpinLockRelease() in
> ProcKill(), before step 3 can happen. Comment in spin.h about
> SpinLockAcquire/Release:
>
> > * Load and store operations in calling code are guaranteed not to be
> > * reordered with respect to these operations, because they include a
> > * compiler barrier.  (Before PostgreSQL 9.5, callers needed to use a
> > * volatile qualifier to access data protected by spinlocks.)
>
> That talks about a compiler barrier, though, not a memory barrier. But
> looking at the implementations in s_lock.h, I believe they do act as
> memory barrier, too.
>
> So you might indeed have that problem on 9.4, but AFAICS not on later
> versions.

Yes, 9.4 does not have 0709b7ee72e, which I'm assuming you are referring to?

Reading src/backend/storage/lmgr/README.barrier: for x86, to avoid reordering between a load and a store, we need something that prevents both CPU and compiler reordering. pg_memory_barrier() fits the bill. Here we can have pid X's read of latch->owner_pid = Y reordered to precede pid Y's store of latch->owner_pid = 0.

The compiler barrier in S_UNLOCK() will prevent compiler reordering but not CPU reordering of the above:

#define S_UNLOCK(lock) \
	do { __asm__ __volatile__("" : : : "memory");  *(lock) = 0; } while (0)

which is equivalent to a:

#define pg_compiler_barrier_impl()	__asm__ __volatile__("" ::: "memory")

But maybe both CPU and memory reordering will be prevented by the tas() in S_LOCK(), which does a lock and xchgb? Is the above acting as BOTH a compiler and CPU barrier, like the lock; addl stuff in pg_memory_barrier_impl()?

If yes, then the picture would look like this:

Pid Y in DisownLatch(), Pid X in OwnLatch()

Y: LOAD latch->ownerPid
...
Y: STORE latch->ownerPid = 0
...  // returning PGPROC to freeList
Y: S_LOCK(ProcStructLock)    <--- tas() prevents X: LOAD latch->ownerPid from preceding this
...
...                          <--- X: LOAD latch->ownerPid can't get here anyway as spinlock is held
...
Y: S_UNLOCK(ProcStructLock)
...
X: S_LOCK(ProcStructLock)    // to retrieve PGPROC from freeList
...
X: S_UNLOCK(ProcStructLock)
...
X: LOAD latch->ownerPid

And this issue is not caused due to 9.4 missing 0709b7ee72e, which changed S_UNLOCK exclusively.

If no, then we would need the patch that does an explicit pg_memory_barrier() at the end of DisownLatch() for PG HEAD.

On Thu, Feb 8, 2024 at 1:41 PM Andres Freund <andres@anarazel.de> wrote:
> Right. I wonder if the issue instead could be something similar to what was
> fixed in 8fb13dd6ab5b and more generally in 97550c0711972a. If two procs go
> through proc_exit() for the same process, you can get all kinds of weird
> mixed up resource ownership. The bug fixed in 8fb13dd6ab5b wouldn't apply,
> but it's pretty easy to introduce similar bugs in other places, so it seems
> quite plausible that greenplum might have done so. We also did have more
> proc_exit()s in signal handlers in older branches, so it might just be an
> issue that also was present before.

Hmm, the pids X and Y in the example provided upthread don't spawn off any children (like by calling system()) - they are just regular backends. So it's not possible for them to receive TERM and try to proc_exit() w/ the same PGPROC. So that is not the issue, I guess?

Regards,
Soumyadeep (VMware)
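[For reference on the tas() question above: the x86_64 implementation in src/include/storage/s_lock.h is built around a lock-prefixed xchgb, roughly as sketched below (paraphrased from memory; exact asm constraints vary by branch). On x86, xchg with a memory operand is implicitly locked, and locked instructions are full memory barriers, so the S_LOCK() side does order both loads and stores in the way the diagram above assumes:]

    static __inline__ int
    tas(volatile slock_t *lock)
    {
        slock_t     _res = 1;

        /*
         * Atomically exchange 1 with the lock byte and return the old value;
         * a nonzero result means the lock was already held. xchg with a
         * memory operand implies LOCK, i.e. a full memory barrier on x86.
         */
        __asm__ __volatile__(
            "   lock            \n"
            "   xchgb   %0,%1   \n"
            : "+q"(_res), "+m"(*lock)
            : /* no inputs */
            : "memory", "cc");
        return (int) _res;
    }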