Thread: Unsafe threading in syslogger on Windows

Unsafe threading in syslogger on Windows

From
Heikki Linnakangas
Date:
On Windows, syslogger uses two threads. The main thread loops and polls
if any SIGHUPs have been received or if the log file needs to be
rotated. Another thread, "pipe thread", does ReadFile() on the pipe that
other processes send their log messages to. ReadFile() blocks, and
whenever new data arrives, it is processed in the pipe thread.

Both threads use palloc()/pfree(), which are not thread-safe :-(.

It's hard to trigger a crash because the main thread mostly just sleeps,
and the pipe thread only uses palloc()/pfree() when it receives chunked
messages, larger than 512 bytes. Browsing the CVS history, this was made
visibly broken by the patch that introduced the message chunking. Before
that the pipe thread just read from the pipe and wrote to the log file,
which was safe. It has always used ereport() to report read errors,
though, which can do palloc(), but there shouldn't normally be any read
errors.

I chatted with Magnus about this, and he suggested using a Windows
critical section to make sure that only one of the threads is active at
a time. That seems suitable for back-porting, but I'd like to get rid of
this threading in CVS head, it seems too error-prone.

The reason it uses threads like this on Windows is explained in the
comments:
> /*
>  * Worker thread to transfer data from the pipe to the current logfile.
>  *
>  * We need this because on Windows, WaitForSingleObject does not work on
>  * unnamed pipes: it always reports "signaled", so the blocking ReadFile won't
>  * allow for SIGHUP; and select is for sockets only.
>  */

But Magnus pointed out that our pgpipe() implementation on Windows
actually creates a pair of sockets instead of pipes, for exactly that
reason, so that you can use select() on the returned file descriptor.
For some reason syslogger explicitly doesn't use pgpipe() on Windows,
though, but calls CreatePipe(). I don't see any explanation why.

I'm going to see what happens if I remove all the #ifdef WIN32 blocks in
syslogger, and let it use pgpipe() and select() instead of the extra thread.

Thoughts?

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Unsafe threading in syslogger on Windows

From
Andrew Dunstan
Date:

Heikki Linnakangas wrote:
> On Windows, syslogger uses two threads. The main thread loops and polls
> if any SIGHUPs have been received or if the log file needs to be
> rotated. Another thread, "pipe thread", does ReadFile() on the pipe that
> other processes send their log messages to. ReadFile() blocks, and
> whenever new data arrives, it is processed in the pipe thread.
>
> Both threads use palloc()/pfree(), which are not thread-safe :-(.
>
> It's hard to trigger a crash because the main thread mostly just sleeps,
> and the pipe thread only uses palloc()/pfree() when it receives chunked
> messages, larger than 512 bytes. Browsing the CVS history, this was made
> visibly broken by the patch that introduced the message chunking. Before
> that the pipe thread just read from the pipe and wrote to the log file,
> which was safe. It has always used ereport() to report read errors,
> though, which can do palloc(), but there shouldn't normally be any read
> errors.
>
> I chatted with Magnus about this, and he suggested using a Windows
> critical section to make sure that only one of the threads is active at
> a time. That seems suitable for back-porting, but I'd like to get rid of
> this threading in CVS head, it seems too error-prone.
>
> The reason it uses threads like this on Windows is explained in the
> comments:
>   
>> /*
>>  * Worker thread to transfer data from the pipe to the current logfile.
>>  *
>>  * We need this because on Windows, WaitForSingleObject does not work on
>>  * unnamed pipes: it always reports "signaled", so the blocking ReadFile won't
>>  * allow for SIGHUP; and select is for sockets only.
>>  */
>>     
>
> But Magnus pointed out that our pgpipe() implementation on Windows
> actually creates a pair of sockets instead of pipes, for exactly that
> reason, so that you can use select() on the returned file descriptor.
> For some reason syslogger explicitly doesn't use pgpipe() on Windows,
> though, but calls CreatePipe(). I don't see any explanation why.
>
> I'm going to see what happens if I remove all the #ifdef WIN32 blocks in
> syslogger, and let it use pgpipe() and select() instead of the extra thread.
>
>
>   


Sounds reasonable. Let's see how big the changes are on HEAD. I'm not 
sure it's worth creating a different smaller fix for the back branches.

cheers

andrew




Re: Unsafe threading in syslogger on Windows

From
Heikki Linnakangas
Date:
Andrew Dunstan wrote:
> Heikki Linnakangas wrote:
>> I'm going to see what happens if I remove all the #ifdef WIN32 blocks in
>> syslogger, and let it use pgpipe() and select() instead of the extra
>> thread.
>
> Sounds reasonable. Let's see how big the changes are on HEAD. I'm not
> sure it's worth creating a different smaller fix for the back branches.

I tried that, and got a crash somewhere in the code that inherits the
syslogger pipe/socket to the child process. I don't understand why, and
I don't feel like debugging any deeper into that right now. If you or
someone else wants to give it a shot, that would be good. If not, I
might try again some other day after sleeping over it.

Anyway, here's the patch I had in mind for back-branches.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index 22cf5f2..ee449ac 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -117,7 +117,7 @@ HANDLE        syslogPipe[2] = {0, 0};

 #ifdef WIN32
 static HANDLE threadHandle = 0;
-static CRITICAL_SECTION sysfileSection;
+static CRITICAL_SECTION sysloggerSection;
 #endif

 /*
@@ -268,7 +268,8 @@ SysLoggerMain(int argc, char *argv[])

 #ifdef WIN32
     /* Fire up separate data transfer thread */
-    InitializeCriticalSection(&sysfileSection);
+    InitializeCriticalSection(&sysloggerSection);
+    EnterCriticalSection(&sysloggerSection);

     threadHandle = (HANDLE) _beginthreadex(NULL, 0, pipeThread, NULL, 0, NULL);
     if (threadHandle == 0)
@@ -423,8 +424,16 @@ SysLoggerMain(int argc, char *argv[])
          * On Windows we leave it to a separate thread to transfer data and
          * detect pipe EOF.  The main thread just wakes up once a second to
          * check for SIGHUP and rotation conditions.
+         *
+         * Server code isn't generally thread-safe, so we ensure that only
+         * one of the threads is active at a time by entering the critical
+         * section whenever we're not sleeping.
          */
+        LeaveCriticalSection(&sysloggerSection);
+
         pg_usleep(1000000L);
+
+        EnterCriticalSection(&sysloggerSection);
 #endif   /* WIN32 */

         if (pipe_eof_seen)
@@ -911,17 +920,9 @@ write_syslogger_file(const char *buffer, int count, int destination)
     if (destination == LOG_DESTINATION_CSVLOG && csvlogFile == NULL)
         open_csvlogfile();

-#ifdef WIN32
-    EnterCriticalSection(&sysfileSection);
-#endif
-
     logfile = destination == LOG_DESTINATION_CSVLOG ? csvlogFile : syslogFile;
     rc = fwrite(buffer, 1, count, logfile);

-#ifdef WIN32
-    LeaveCriticalSection(&sysfileSection);
-#endif
-
     /* can't use ereport here because of possible recursion */
     if (rc != count)
         write_stderr("could not write to log file: %s\n", strerror(errno));
@@ -945,11 +946,21 @@ pipeThread(void *arg)
     for (;;)
     {
         DWORD        bytesRead;
+        BOOL        result;
+
+        result = ReadFile(syslogPipe[0],
+                          logbuffer + bytes_in_logbuffer,
+                          sizeof(logbuffer) - bytes_in_logbuffer,
+                          &bytesRead, 0);

-        if (!ReadFile(syslogPipe[0],
-                      logbuffer + bytes_in_logbuffer,
-                      sizeof(logbuffer) - bytes_in_logbuffer,
-                      &bytesRead, 0))
+        /*
+         * Enter critical section before doing anything that might touch
+         * global state shared by the main thread. Anything that uses
+         * palloc()/pfree() in particular are not safe outside the critical
+         * section.
+         */
+        EnterCriticalSection(&sysloggerSection);
+        if (!result)
         {
             DWORD        error = GetLastError();

@@ -966,6 +977,7 @@ pipeThread(void *arg)
             bytes_in_logbuffer += bytesRead;
             process_pipe_input(logbuffer, &bytes_in_logbuffer);
         }
+        LeaveCriticalSection(&sysloggerSection);
     }

     /* We exit the above loop only upon detecting pipe EOF */
@@ -974,6 +986,7 @@ pipeThread(void *arg)
     /* if there's any data left then force it out now */
     flush_pipe_input(logbuffer, &bytes_in_logbuffer);

+    LeaveCriticalSection(&sysloggerSection);
     _endthread();
     return 0;
 }
@@ -1097,18 +1110,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
         _setmode(_fileno(fh), _O_TEXT); /* use CRLF line endings on Windows */
 #endif

-        /* On Windows, need to interlock against data-transfer thread */
-#ifdef WIN32
-        EnterCriticalSection(&sysfileSection);
-#endif
-
         fclose(syslogFile);
         syslogFile = fh;

-#ifdef WIN32
-        LeaveCriticalSection(&sysfileSection);
-#endif
-
         /* instead of pfree'ing filename, remember it for next time */
         if (last_file_name != NULL)
             pfree(last_file_name);
@@ -1164,18 +1168,9 @@ logfile_rotate(bool time_based_rotation, int size_rotation_for)
         _setmode(_fileno(fh), _O_TEXT); /* use CRLF line endings on Windows */
 #endif

-        /* On Windows, need to interlock against data-transfer thread */
-#ifdef WIN32
-        EnterCriticalSection(&sysfileSection);
-#endif
-
         fclose(csvlogFile);
         csvlogFile = fh;

-#ifdef WIN32
-        LeaveCriticalSection(&sysfileSection);
-#endif
-
         /* instead of pfree'ing filename, remember it for next time */
         if (last_csv_file_name != NULL)
             pfree(last_csv_file_name);

Re: Unsafe threading in syslogger on Windows

From
Heikki Linnakangas
Date:
Heikki Linnakangas wrote:
> Andrew Dunstan wrote:
>> Heikki Linnakangas wrote:
>>> I'm going to see what happens if I remove all the #ifdef WIN32 blocks in
>>> syslogger, and let it use pgpipe() and select() instead of the extra
>>> thread.
>> Sounds reasonable. Let's see how big the changes are on HEAD. I'm not
>> sure it's worth creating a different smaller fix for the back branches.
> 
> I tried that, and got a crash somewhere in the code that inherits the
> syslogger pipe/socket to the child process. I don't understand why, and
> I don't feel like debugging any deeper into that right now. If you or
> someone else wants to give it a shot, that would be good. If not, I
> might try again some other day after sleeping over it.
> 
> Anyway, here's the patch I had in mind for back-branches.

I've committed that patch to all branches now. In the long run it would
be good to get rid of the extra thread, but I'll leave that to someone
else with more Windows skills.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com