Multithreaded SIGPIPE race in libpq on Solaris - Mailing list pgsql-hackers

From	Thomas Munro
Subject	Multithreaded SIGPIPE race in libpq on Solaris
Date	August 29, 2014 00:35:14
Msg-id	CADLWmXVPnLzwcdRCyPnM3QZq6ehVOa16u=bWyxFatxcT0nOaKA@mail.gmail.com
Responses	Re: Multithreaded SIGPIPE race in libpq on Solaris
List	pgsql-hackers

Hello,

A while back someone showed me a program blocking in libpq 9.2 on
Solaris 11, inside sigwait called by pq_reset_sigpipe.  It had
happened a couple of times before during a period of
instability/crashing with a particular DB (a commercial PostgreSQL
derivative, but the client was using regular libpq).  This was a very
busy multithreaded client where each thread had its own connection.
My theory is that if two connections used by different threads are
shut down at around the same time, there is a race: each thread fails
to write to its socket, sees errno == EPIPE and then sees a pending
SIGPIPE via sigpending(), but only one thread returns from sigwait()
because the two signals were merged into a single pending instance.

We never saw the problem again after we made the following change:

--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -450,7 +450,6 @@ void
 pq_reset_sigpipe(sigset_t *osigset, bool sigpipe_pending, bool got_epipe)
 {
        int                     save_errno = SOCK_ERRNO;
-       int                     signo;
        sigset_t        sigset;
 
        /* Clear SIGPIPE only if none was pending */
@@ -460,11 +459,13 @@ pq_reset_sigpipe(sigset_t *osigset, bool sigpipe_pending, bool got_epipe)
                        sigismember(&sigset, SIGPIPE))
                {
                        sigset_t        sigpipe_sigset;
+                       siginfo_t       siginfo;
+                       struct timespec timeout = { 0, 0 };
 
                        sigemptyset(&sigpipe_sigset);
                        sigaddset(&sigpipe_sigset, SIGPIPE);
 
-                       sigwait(&sigpipe_sigset, &signo);
+                       sigtimedwait(&sigpipe_sigset, &siginfo, &timeout);
                }
        }
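
The effect of the change can be sketched in isolation as follows. This
is a simplified stand-alone illustration, not the libpq function: the
helper name and its return convention are hypothetical, and the EINTR
retry loop is an addition that the patch above does not have. With a
zero timeout, sigtimedwait() acts as a poll: it either dequeues a
pending SIGPIPE or fails at once with EAGAIN, so it cannot hang the way
sigwait() does when another thread has already consumed the signal.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <time.h>

/*
 * Hypothetical helper, not part of libpq: consume one pending SIGPIPE
 * without ever blocking.  The caller must already have SIGPIPE blocked.
 * Returns 1 if a SIGPIPE was consumed, 0 if none was pending,
 * and -1 on any other error.
 */
int
consume_pending_sigpipe(void)
{
    sigset_t        sigpipe_set;
    siginfo_t       siginfo;
    struct timespec timeout = {0, 0};   /* zero timeout: poll, never wait */
    int             rc;

    sigemptyset(&sigpipe_set);
    sigaddset(&sigpipe_set, SIGPIPE);

    /* Retry only if interrupted by some unrelated signal handler */
    do
    {
        rc = sigtimedwait(&sigpipe_set, &siginfo, &timeout);
    } while (rc == -1 && errno == EINTR);

    if (rc == SIGPIPE)
        return 1;
    if (rc == -1 && errno == EAGAIN)
        return 0;               /* nothing pending: someone else took it */
    return -1;
}
```

The "nothing pending" case is the one that matters for the hang: the
caller gets 0 back and carries on to restore its signal mask instead of
waiting for a signal that will never arrive.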

Does this make any sense?

Best regards,
Thomas Munro
