Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check) - Mailing list pgsql-hackers

From Christian Ullrich
Subject Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check)
Date
Msg-id ca4b5af0-c81d-e1b7-5ea6-7d8e53bed560@chrullrich.net
Whole thread Raw
In response to Re: [HACKERS] Buildfarm failures on woodlouse (in ecpg-check)  (Andrew Dunstan <andrew.dunstan@2ndquadrant.com>)
List pgsql-hackers
* I wrote:

> If there is interest in fixing this issue that is apparently limited
> to VS 2013, I will try and produce a proper patch.

Patch.

Perhaps surprisingly, the bug is in the failing test cases themselves, 
not ecpg. The CRT has two modes for its locale implementation: When 
called in a "global" mode thread, setlocale() affects all global mode 
threads; when called from a per-thread mode thread, it affects that 
thread only. [1]

In the threaded test cases, many global mode threads call setlocale() 
simultaneously, getting in each other's way. The fix is to switch the 
worker threads to per-thread mode first.

Without this patch, I see crashes in approximately 25 percent of runs 
(four crashes in 16 cycles of vcregress ecpgcheck, ~10 in 50 runs of 
thread.exe alone). With the patch, I have no crashes in 100 ecpgcheck runs.

On the other hand, this affects only the buildfarm VM I mentioned 
earlier. I have another VM where it does not ever crash even without the 
patch -- such are the joys of multithreading. Both of them should have 
plenty of cores, both physical and virtual.


There are some alternatives to fixing it this way, but I think this is 
the best approach:

- Selecting per-thread mode in ecpglib takes the choice away from the
   developer who might want shared locale.

- Adding locking around setlocale() is difficult because ecpglib already
   uses a wrapper around the CRT function, provided by libpgport.

   These test cases are the only consumers of the wrapper that have any
   concept of multithreading. Supporting it in libpgport for the sole
   benefit of threaded ECPG applications on Windows does not seem to
   be a good idea, and re-wrapping the wrapper in ecpglib is not only
   beyond my abilities to write, but is also going to be unmaintainable.

- Adding locking around every setlocale() call in ecpglib is just ugly.


While working on this, I also noticed that there seem to be two separate 
partial implementations of a pthread synchronization emulation for 
Win32. One is in ecpglib, uses mutexes and provides 
PTHREAD_MUTEX_INITIALIZER and pthread_once(), the other has the header 
in src/port and the implementation in libpq, uses critical sections and 
does not cover either feature.

Should the two be merged at some point?


[1] https://msdn.microsoft.com/en-us/library/ms235302(v=vs.120).aspx

-- 
Christian

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [HACKERS] subscription worker signalling wal writer too much
Next
From: Justin Pryzby
Date:
Subject: Re: [HACKERS] memory fields from getrusage()