Thread: libpq and psql not on same page about SIGPIPE
libpq compiled with --enable-thread-safety thinks it can set the SIGPIPE signal handler. It thinks once is enough.

psql thinks it can arbitrarily flip the signal handler between SIG_IGN and SIG_DFL. Ergo, after the first use of the pager for output, libpq's SIGPIPE handling will be broken.

I submit that psql is unlikely to be the only program that does this, and therefore that libpq must be considered broken, not psql.

			regards, tom lane
Tom Lane wrote:
> libpq compiled with --enable-thread-safety thinks it can set the SIGPIPE
> signal handler. It thinks once is enough.
>
> psql thinks it can arbitrarily flip the signal handler between SIG_IGN
> and SIG_DFL. Ergo, after the first use of the pager for output, libpq's
> SIGPIPE handling will be broken.
>
> I submit that psql is unlikely to be the only program that does this,
> and therefore that libpq must be considered broken, not psql.

I have researched possible fixes for our threading SIGPIPE handling in libpq. Basically, we need to ignore SIGPIPE during socket send() (and SSL_write) because if the backend dies unexpectedly, the process will die; libpq would rather trap the failure. In 7.4.X we set SIGPIPE to ignore before each write and reset it afterward, but that doesn't work for threading because it affects all threads, not just the thread using libpq. Our current setup is wrong because an application could change SIGPIPE for its own purposes (like psql does) and remove our custom thread handler for SIGPIPE.

The best solution seems to be one suggested by Manfred in November of 2003:

> signal handlers are a process property, not a thread property - that
> code is broken for multi-threaded apps.
> At least that's how I understand the opengroup man page, and a quick
> google confirmed that:
> http://groups.google.de/groups?selm=353662BF.9D70F63A%40brighttiger.com
>
> I haven't found a reliable thread-safe approach yet:
> My first idea was block with pthread_sigmask, after send check if
> pending with sigpending, and then delete with sigwait, and restore
> blocked state. But that breaks if SIGPIPE is blocked and a signal is
> already pending: there is no way to remove our additional SIGPIPE. I
> don't see how we can avoid destroying the realtime signal info.

His idea is pthread_sigmask/send/sigpending/sigwait/restore-mask. It seems we could also check errno for EPIPE rather than calling sigpending().

He has a concern about an application that has already blocked SIGPIPE and has a pending SIGPIPE signal waiting. One idea would be to check sigpending() before the send() and clear the signal only if SIGPIPE wasn't pending before the call. I realize that if our send() also generates a SIGPIPE it would remove the previous realtime signal info, but that seems a minor problem.

Comments? This seems like our only solution.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Manfred Spraul wrote:
> Bruce Momjian wrote:
>> Comments? This seems like our only solution.
>
> This would be a transparent solution. Another approach would be:
> - Use the old 7.3 approach by default. This means perfect backward
>   compatibility for single-threaded apps and broken multithreaded apps.
> - Add a new PQinitDB(int disableSigpipeHandler) initialization function.
>   Document that multithreaded apps must call the function with
>   disableSigpipeHandler=1 and handle SIGPIPE for libpq. Perhaps with a
>   reference implementation in libpq (i.e. a sigpipeMode with 0 for old
>   approach, 1 for do nothing, 2 for install our own handler).
>
> I would prefer that approach:
> It means that the multithreaded libpq apps must be updated [are there
> any?], but the solution is simpler and less fragile than calling 4
> signal handling functions in a row to selectively block SIGPIPE
> per-thread.

I think we can get away with three function calls because we can check errno for EPIPE from the send() call. We already have two function calls in there, so I don't see a third as a huge problem, and it is not worth an API change unless someone tells us it is a performance hit.

One thing I know from the broken 7.4 code is that calling signal() from inside a thread is very slow on Linux, like a 20% performance hit, because I assume it has to adjust all the threads' signal handling or something. Anyway, I assume pthread_sigmask() and sigpending() are not huge hits. In fact, by checking EPIPE from send() we don't need sigpending() at all. We just block and (on EPIPE) clear using sigwait(), then unblock.

I think we can document that if you are blocking SIGPIPE in your application and waiting to handle it later, you can't keep the pending signal through a libpq call. I think that is a much easier requirement than adding a new API call.

The most common case of SIG_IGN/SIG_DFL is not affected by this, only cases where applications define their own signal handler, block the signal, and then try to keep it pending across a libpq call.

Manfred, glad you are around to discuss this. After much research I came up with a method and then found your description that matched it, so I felt I was on the right track.
Bruce Momjian wrote: [... SIGPIPE suppression in libpq ...] Linux also has MSG_NOSIGNAL as a send() flag that might be useful. It suppresses generation of SIGPIPE for just that call. No, it doesn't work for SSL and it's probably not very portable, but it might be a good platform-specific optimization for the common case. -O
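[Editorial note: the flag Oliver mentions can be wrapped like this. This is a sketch, assuming a platform that defines MSG_NOSIGNAL; the helper name is made up, not a libpq function.]

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Sketch of the MSG_NOSIGNAL idea: where the flag exists (Linux and
 * some BSDs), SIGPIPE is suppressed for this one send() and the call
 * simply fails with EPIPE, so no signal-mask juggling is needed. */
static ssize_t send_msg_nosignal(int fd, const void *buf, size_t len)
{
#ifdef MSG_NOSIGNAL
    return send(fd, buf, len, MSG_NOSIGNAL);
#else
    return send(fd, buf, len, 0);   /* caller must suppress SIGPIPE itself */
#endif
}
```

As Oliver notes, this cannot cover SSL_write(), so it could only be a platform-specific fast path, not the whole solution.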
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> His idea is pthread_sigmask/send/sigpending/sigwait/restore-mask. It seems
> we could also check errno for EPIPE rather than calling sigpending().

> He has a concern about an application that has already blocked SIGPIPE and
> has a pending SIGPIPE signal waiting. One idea would be to check
> sigpending() before the send() and clear the signal only if SIGPIPE
> wasn't pending before the call. I realize that if our send() also
> generates a SIGPIPE it would remove the previous realtime signal info,
> but that seems a minor problem.

Supposing that we don't touch the signal handler at all, then it is possible that the application has set it to SIG_IGN, in which case a SIGPIPE would be discarded rather than going into the pending mask. So I think the logic has to be:

	pthread_sigmask to block SIGPIPE and save existing signal mask

	send();

	if (errno == EPIPE)
	{
		if (sigpending indicates SIGPIPE pending)
			use sigwait to clear SIGPIPE;
	}

	pthread_sigmask to restore prior signal mask

The only case that is problematic is where the application had already blocked SIGPIPE and there is a pending SIGPIPE signal when we are entered, *and* we get SIGPIPE ourselves.

If the C library does not support queued signals then our sigwait will clear both our own SIGPIPE and the pending signal. This is annoying but it doesn't seem fatal --- if the app is writing on a closed pipe then it'll probably try it again later and get the signal again.

If the C library does support queued signals then we will read the existing SIGPIPE condition and leave our own signal in the queue. This is no problem to the extent that one pending SIGPIPE looks just like another --- does anyone know of platforms where there is additional info carried by a SIGPIPE event?

This seems workable as long as we document the possible gotchas.

			regards, tom lane
Bruce Momjian wrote:
> Comments? This seems like our only solution.

This would be a transparent solution. Another approach would be:

- Use the old 7.3 approach by default. This means perfect backward compatibility for single-threaded apps and broken multithreaded apps.
- Add a new PQinitDB(int disableSigpipeHandler) initialization function. Document that multithreaded apps must call the function with disableSigpipeHandler=1 and handle SIGPIPE for libpq. Perhaps with a reference implementation in libpq (i.e. a sigpipeMode with 0 for old approach, 1 for do nothing, 2 for install our own handler).

I would prefer that approach: it means that multithreaded libpq apps must be updated [are there any?], but the solution is simpler and less fragile than calling 4 signal handling functions in a row to selectively block SIGPIPE per-thread.

--
    Manfred
Manfred Spraul <manfred@colorfullife.com> writes:
> Is that really worthwhile? There are half a dozen assumptions about the
> C library and kernel internal efficiency of the signal handling
> functions in the proposal. Adding a PQinitLib function is obviously a
> larger change, but it solves the problem.

Not really: it only solves the problem *if you change the application*, which is IMHO not acceptable. In particular, why should a non-threaded app expect to have to change to deal with this issue? But we can't safely build a thread-safe libpq.so for general use if it breaks non-threaded apps that haven't been changed.

As for the efficiency argument, we have been doing two pqsignal()s per send() for years and years; I see no reason to think this is worse.

			regards, tom lane
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> His idea is pthread_sigmask/send/sigpending/sigwait/restore-mask. It seems
>> we could also check errno for EPIPE rather than calling sigpending().
>
>> He has a concern about an application that has already blocked SIGPIPE and
>> has a pending SIGPIPE signal waiting. One idea would be to check
>> sigpending() before the send() and clear the signal only if SIGPIPE
>> wasn't pending before the call. I realize that if our send() also
>> generates a SIGPIPE it would remove the previous realtime signal info,
>> but that seems a minor problem.
>
> Supposing that we don't touch the signal handler at all, then it is
> possible that the application has set it to SIG_IGN, in which case a
> SIGPIPE would be discarded rather than going into the pending mask.

Right.

> So I think the logic has to be:
>
> 	pthread_sigmask to block SIGPIPE and save existing signal mask
>
> 	send();
>
> 	if (errno == EPIPE)
> 	{
> 		if (sigpending indicates SIGPIPE pending)
> 			use sigwait to clear SIGPIPE;
> 	}
>
> 	pthread_sigmask to restore prior signal mask

Yes, that is the logic in my patch, except that I don't check errno; I just call sigpending(). There are a few writes, and it seemed impossible to check them all. Also, we might get a SIGPIPE but, as you mentioned, the SIGPIPE may have been ignored.

> The only case that is problematic is where the application had
> already blocked SIGPIPE and there is a pending SIGPIPE signal when
> we are entered, *and* we get SIGPIPE ourselves.

Right.

> If the C library does not support queued signals then our sigwait will
> clear both our own SIGPIPE and the pending signal. This is annoying but
> it doesn't seem fatal --- if the app is writing on a closed pipe then
> it'll probably try it again later and get the signal again.

Right.

> If the C library does support queued signals then we will read the
> existing SIGPIPE condition and leave our own signal in the queue. This
> is no problem to the extent that one pending SIGPIPE looks just like
> another --- does anyone know of platforms where there is additional info
> carried by a SIGPIPE event?

Right. We could add a sigpending() check before the send() and, if SIGPIPE is already pending, just not clear it afterward, but this adds an additional function call into the mix that seemed unnecessary, though I could be convinced otherwise.

> This seems workable as long as we document the possible gotchas.

My patch documents the limitation. I finished reading the Addison-Wesley "Programming with POSIX Threads" book and it has helped me with this stuff.
Manfred Spraul wrote:
>> This seems workable as long as we document the possible gotchas.
>
> Is that really worthwhile? There are half a dozen assumptions about the
> C library and kernel internal efficiency of the signal handling
> functions in the proposal. Adding a PQinitLib function is obviously a

The main path uses pthread_sigmask() and sigpending(). Are those possible performance problems? I see how signal() would be a thread problem, but not those.

> larger change, but it solves the problem.
> I'm aware of one minor gotcha: PQinSend() is not usable right now: it
> relies on the initialization of pq_thread_in_send, which is only created
> in the middle of the first connectDB(). That makes proper signal
> handling for the first connection impossible.

I think that whole PQinSend() thing is pretty ugly, even if I did write it. My current patch seems like a great improvement.
Tom Lane wrote: > If the C library does support queued signals then we will read the > existing SIGPIPE condition and leave our own signal in the queue. This > is no problem to the extent that one pending SIGPIPE looks just like > another --- does anyone know of platforms where there is additional info > carried by a SIGPIPE event? POSIX.1b / SA_SIGINFO? SIGPIPE does not fill much of siginfo_t, but the 3rd handler arg has the interrupted execution context. -O
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Yes, that is the logic in my patch, except that I don't check errno, I
> just call sigpending().

No, that's wrong: if there is a pending SIGPIPE that belongs to the outer app, you'd clear it.

> There are a few writes and it seemed impossible to check them all.

Hmm? There is only one place this needs to be done, namely pqsecure_write.

BTW, have you posted the patch yet, or are you still working on it? The mail server seems a bit flaky today ...

			regards, tom lane
Tom Lane wrote:
> If the C library does support queued signals then we will read the
> existing SIGPIPE condition and leave our own signal in the queue. This
> is no problem to the extent that one pending SIGPIPE looks just like
> another --- does anyone know of platforms where there is additional info
> carried by a SIGPIPE event?

Linux stores pid/uid together with the signal. pid doesn't matter, and no sane programmer will look at the uid, so it seems to be possible.

> This seems workable as long as we document the possible gotchas.

Is that really worthwhile? There are half a dozen assumptions about the C library and kernel internal efficiency of the signal handling functions in the proposal. Adding a PQinitLib function is obviously a larger change, but it solves the problem.

I'm aware of one minor gotcha: PQinSend() is not usable right now: it relies on the initialization of pq_thread_in_send, which is only created in the middle of the first connectDB(). That makes proper signal handling for the first connection impossible.

--
    Manfred
Manfred Spraul wrote:
> Tom Lane wrote:
>> Not really: it only solves the problem *if you change the application*,
>> which is IMHO not acceptable. In particular, why should a non-threaded
>> app expect to have to change to deal with this issue? But we can't
>> safely build a thread-safe libpq.so for general use if it breaks
>> non-threaded apps that haven't been changed.
>
> No. Non-threaded apps do not need to change. The default is the old, 7.3
> code: change the signal handler around the write calls. Which means that
> non-threaded apps are guaranteed to work without any changes, regardless
> of the libpq thread safety setting.
> Threaded apps would have to change, but how many threaded apps use
> libpq? They check their code anyway - either just add PQinitLib() or
> review (and potentially update) their signal handling code if it matches
> any of the gotchas of the transparent handling.

So without the call we do the old non-threaded version of the code. Yea, we could do that, but I think the most recent patch is clearer.
Manfred Spraul <manfred@colorfullife.com> writes:
> Tom Lane wrote:
>> Not really: it only solves the problem *if you change the application*,
>> which is IMHO not acceptable.
>
> No. Non-threaded apps do not need to change. The default is the old, 7.3
> code: change the signal handler around the write calls. Which means that
> non-threaded apps are guaranteed to work without any changes, regardless
> of the libpq thread safety setting.

Hmm, so you propose that a thread-enabled library must contain both code paths and choose between them on the fly? That seems like the worst of all possible worlds --- it's not backwards compatible, it's therefore unsafe, it's slow, *and* it'll be #ifdef hell to read.

On a platform that has pthread_sigmask, ISTM we can use that unconditionally and it'll work whether the calling app is threaded or not. We only fall back to the pqsignal method if we aren't supporting threads. There's no need for a runtime test nor for an API change.

			regards, tom lane
Tom Lane wrote:
> Hmm, so you propose that a thread-enabled library must contain both code
> paths and choose between them on the fly? That seems like the worst of
> all possible worlds --- it's not backwards compatible, it's therefore
> unsafe, it's slow, *and* it'll be #ifdef hell to read.
>
> On a platform that has pthread_sigmask, ISTM we can use that
> unconditionally and it'll work whether the calling app is threaded or
> not. We only fall back to the pqsignal method if we aren't supporting
> threads. There's no need for a runtime test nor for an API change.

Agreed. Manfred, I guess the big question is why not use something automatic like what I just applied?

Now that the patch is in, would someone run some threaded performance tests and see if those pg_*_sigpipe routines are causing any performance problems?
Tom Lane wrote:
> Not really: it only solves the problem *if you change the application*,
> which is IMHO not acceptable. In particular, why should a non-threaded
> app expect to have to change to deal with this issue? But we can't
> safely build a thread-safe libpq.so for general use if it breaks
> non-threaded apps that haven't been changed.

No. Non-threaded apps do not need to change. The default is the old, 7.3 code: change the signal handler around the write calls. Which means that non-threaded apps are guaranteed to work without any changes, regardless of the libpq thread safety setting.

Threaded apps would have to change, but how many threaded apps use libpq? They check their code anyway - either just add PQinitLib() or review (and potentially update) their signal handling code if it matches any of the gotchas of the transparent handling.

--
    Manfred