Thread: [HACKERS] Should we standardize on a type for signal handler flags?
Hi, At the moment a number of flag variables set in signal handlers have 'volatile bool' as type, others have 'volatile sig_atomic_t'. That's kinda confusing. I think either is safe, but I think we should standardize one of them. Opinions? - Andres
On Mon, Jun 5, 2017 at 8:00 AM, Andres Freund <andres@anarazel.de> wrote: > At the moment a number of flag variables set in signal handlers have > 'volatile bool' as type, others have 'volatile sig_atomic_t'. That's > kinda confusing. I think either is safe, but I think we should > standardize one of them. sig_atomic_t's definition includes a reference to signals, so I would vote for using it instead of bool. -- Michael
Andres Freund <andres@anarazel.de> writes: > At the moment a number of flag variables set in signal handlers have > 'volatile bool' as type, others have 'volatile sig_atomic_t'. That's > kinda confusing. I think either is safe, but I think we should > standardize one of them. sig_atomic_t is more standards-conforming, I should think. I'm not sure if there are any current platforms where a store to a char variable wouldn't be atomic, but why live dangerously? I'd be inclined to let the code continue to treat the variables as if they were bool, ie store "true" and "false" not "1" and "0" into them. That should be perfectly safe. regards, tom lane
On 2017-06-04 19:14:06 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > At the moment a number of flag variables set in signal handlers have > > 'volatile bool' as type, others have 'volatile sig_atomic_t'. That's > > kinda confusing. I think either is safe, but I think we should > > standardize one of them. > > sig_atomic_t is more standards-conforming, I should think. I'm not sure > if there are any current platforms where a store to a char variable > wouldn't be atomic, but why live dangerously? Well, we already have some variables that aren't actually booleans, although I think all of them are only read not manipulated in signal handlers (InterruptHoldoffCount etc). So one could argue that there's no safety benefit in sig_atomic_t, because we're already using in other places. We also already rely on int32 stores being atomic in other parts of the code, although that's between processes not between signal / normal path of execution. > I'd be inclined to let the code continue to treat the variables as > if they were bool, ie store "true" and "false" not "1" and "0" > into them. That should be perfectly safe. Indeed. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2017-06-04 19:14:06 -0400, Tom Lane wrote: >> sig_atomic_t is more standards-conforming, I should think. I'm not sure >> if there are any current platforms where a store to a char variable >> wouldn't be atomic, but why live dangerously? > Well, we already have some variables that aren't actually booleans, > although I think all of them are only read not manipulated in signal > handlers (InterruptHoldoffCount etc). Hm. Well, according to POSIX one may rely on sig_atomic_t being able to hold the values 0..127 on all platforms. So we might be able to get away with converting InterruptHoldoffCount to sig_atomic_t if we needed to. In the absence of evidence that we need to, I wouldn't. But I have no problem with standardizing on using sig_atomic_t for variables that are assigned to by signal handlers. regards, tom lane
On Sun, Jun 4, 2017 at 7:21 PM, Andres Freund <andres@anarazel.de> wrote: > Well, we already have some variables that aren't actually booleans, > although I think all of them are only read not manipulated in signal > handlers (InterruptHoldoffCount etc). So one could argue that there's > no safety benefit in sig_atomic_t, because we're already using in other > places. I think that's a pretty good argument, really. If there exists a platform where only sig_atomic_t is safe to read from a signal handler, then we already don't work on that platform. Even saving and restoring errno isn't safe in that case. And if no such platform exists, then I don't know what the benefit is of worrying about sig_atomic_t at all. If "int" is anyway going to be "volatile int", then why should "bool" be written "sig_atomic_t" rather than "volatile bool"? > We also already rely on int32 stores being atomic in other > parts of the code, although that's between processes not between signal > / normal path of execution. I don't think the issues are much different. Presumably no CPU delivers a signal halfway through a CPU instruction, so if we can rely on a 4 byte store being indivisible from the perspective of some other CPU, it seems fine to also rely on that being true in the signal handler case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-06-06 13:07:57 -0400, Robert Haas wrote: > > We also already rely on int32 stores being atomic in other > > parts of the code, although that's between processes not between signal > > / normal path of execution. > > I don't think the issues are much different. Presumably no CPU > delivers a signal halfway through a CPU instruction, so if we can rely > on a 4 byte store being indivisible from the perspective of some other > CPU, it seems fine to also rely on that being true in the signal > handler case. The signal handler case is the "weaker" one I think - you'll only ever see the result of an entire CPU instruction, whereas cross-cpu concurrency can allow another cpu to see state from the *middle* of an instruction if not atomic. - Andres
Robert Haas <robertmhaas@gmail.com> writes: > On Sun, Jun 4, 2017 at 7:21 PM, Andres Freund <andres@anarazel.de> wrote: >> Well, we already have some variables that aren't actually booleans, >> although I think all of them are only read not manipulated in signal >> handlers (InterruptHoldoffCount etc). So one could argue that there's >> no safety benefit in sig_atomic_t, because we're already using in other >> places. > I think that's a pretty good argument, really. If there exists a > platform where only sig_atomic_t is safe to read from a signal > handler, then we already don't work on that platform. Even saving and > restoring errno isn't safe in that case. That's an argument from false premises. The question here is what types are safe for an interrupt handler to *change*, not what can it read. Having said that, this is a good point: >> We also already rely on int32 stores being atomic in other >> parts of the code, although that's between processes not between signal >> / normal path of execution. > I don't think the issues are much different. That would license us to use int32 communication variables, but it still doesn't mean that "bool" is safe. In practice, the sort of architecture where bool would be a problem is one that doesn't have narrower-than-word-wide memory access instructions, so that changing a char-sized variable involves loading a word, manipulating a byte within the word, and storing it back. I cut my teeth on some machines like that, but I dunno if any still exist in the wild. regards, tom lane
On Tue, Jun 6, 2017 at 1:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I think that's a pretty good argument, really. If there exists a >> platform where only sig_atomic_t is safe to read from a signal >> handler, then we already don't work on that platform. Even saving and >> restoring errno isn't safe in that case. > > That's an argument from false premises. The question here is what types > are safe for an interrupt handler to *change*, not what can it read. OK, but we certainly have code in signal handlers that does: int save_errno = errno; /* stuff */ errno = save_errno; If that's not a signal handler changing an int, color me confused. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 2017-06-06 14:13:29 -0400, Robert Haas wrote: > On Tue, Jun 6, 2017 at 1:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> I think that's a pretty good argument, really. If there exists a > >> platform where only sig_atomic_t is safe to read from a signal > >> handler, then we already don't work on that platform. Even saving and > >> restoring errno isn't safe in that case. > > > > That's an argument from false premises. The question here is what types > > are safe for an interrupt handler to *change*, not what can it read. > > OK, but we certainly have code in signal handlers that does: > > int save_errno = errno; > /* stuff */ > errno = save_errno; > > If that's not a signal handler changing an int, color me confused. Don't think it's actually clear that errno is an integer - might very well be just a sig_atomic_t, which can contain values up to like 127 or so. I think the bigger point Tom was making is that we actually know an int4 is safe - otherwise we'd have crashed and burned a long time ago - but that that might be different for *smaller* datatypes because $platform doesn't really do smaller writes atomically (turning them into read-or-write operations either in microcode or assembly). Alpha, s390, pa-risc appear to have such behaviour cross-cpu - although that doesn't necessarily imply the same is true for handlers as well. A reasonable rule would actually be to only use [u]int32 and sig_atomic_t, never bool. - Andres
On Tue, Jun 6, 2017 at 2:24 PM, Andres Freund <andres@anarazel.de> wrote: > Don't think it's actually clear that errno is an integer - might very > well be just a sig_atomic_t, which can contain values up to like 127 or > so. I think the bigger point Tom was making is that we actually know > an int4 is safe - otherwise we'd have crashed and burned a long time ago > - but that that might be different for *smaller* datatypes because > $platform doesn't really do smaller writes atomically (turning them into > read-or-write operations either in microcode or assembly). Oh, right, I remember hearing about that issue before, but it had slipped my mind completely. > Alpha, > s390, pa-risc appear to have such behaviour cross-cpu - although that > doesn't necessarily imply the same is true for handlers as well. Hmm, OK. We've already decided Alpha is safely dead, but s390 and pa-risc are ostensibly not dead. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jun 6, 2017 at 1:33 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> That's an argument from false premises. The question here is what types >> are safe for an interrupt handler to *change*, not what can it read. > OK, but we certainly have code in signal handlers that does: > int save_errno = errno; > /* stuff */ > errno = save_errno; > If that's not a signal handler changing an int, color me confused. But it puts it back the same way at the end, so there's no visible issue for mainline code. Even if the interrupted code were in the midst of changing adjacent values, nothing bad would happen. Perhaps it would help to give a concrete example of the type of hazard you face with a non-interrupt-safe datatype --- which, for the present purpose, is one that takes more than one CPU instruction to load or store. (Andres is right that cross-process safety is a different and stronger requirement.) Suppose we have a machine with no byte-wide load/store ops, only word-wide, and the compiler has laid out four bool variables a,b,c,d in the same word. If the mainline code is trying to set d to 1, it's going to do something more or less like ldw r2, ...address...ori r2, $1stw r2, ...address... Now, if that sequence gets interrupted by a signal handler that tries to change the value of a, b, or c, the change will be lost when we reach the stw, because that's overwriting more than just the programmer-intended target byte. On the other hand, there's no problem with the handler *reading* any of those values --- it will see a consistent state. It also wouldn't matter if it changed one of them and then changed it back before returning. In practice this would never be an issue for "errno" anyway, because int is surely a datatype that can be loaded and stored in one instruction --- the C standard actually defines int as being the machine's most efficiently manipulatable type, so I think that's a pretty safe assumption. But I'm not convinced the same is universally true for "char". regards, tom lane
Andres Freund <andres@anarazel.de> writes: > A reasonable rule would actually be to only use [u]int32 and > sig_atomic_t, never bool. Yes, I'd agree with that. regards, tom lane
On 2017-06-06 14:45:37 -0400, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > A reasonable rule would actually be to only use [u]int32 and > > sig_atomic_t, never bool. > > Yes, I'd agree with that. Cool. I propose we change, once branched, the existing code using booleans, and codify that practice in sources.sgml already existing section about signal handlers. - Andres