Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid. - Mailing list pgsql-hackers

From Yugo Nagata
Subject Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.
Date
Msg-id 20180720175813.154db441.nagata@sraoss.co.jp
Responses Re: Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.  (Michael Paquier <michael@paquier.xyz>)
Re: Fw: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi,

Recently, one of our clients reported that Windows 10 sometimes
(approximately once in 300 tries) hangs during OS startup while the PostgreSQL
9.3.x service is starting. My co-worker analyzed this and found that
PostgreSQL's auxiliary process and the Windows logon processes are in a
deadlock.

Although this problem has so far been found only with PostgreSQL 9.3.x on
Windows 10 in our client's environment, the same problem may occur with other
versions of PostgreSQL.

He reported this problem to the pgsql-general list, as shown below. He also
created a patch that adds a build-time option to insert a delay of 0.5 or 3.0
seconds after each subprocess starts. The attached patch is the same one. Our
client confirmed that this patch resolves the deadlock problem. Would it be
acceptable to add this option to PostgreSQL? Any comments would be appreciated.

Regards,




Begin forwarded message:

Date: Fri, 29 Jun 2018 15:03:10 +0900
From: TAKATSUKA Haruka <harukat@sraoss.co.jp>
To: pgsql-general@postgresql.org
Subject: Windows 10 got stuck with PostgreSQL at starting up. Adding delay lets it avoid.


I ran into trouble with PostgreSQL 9.3.x on Windows 10.
I would like to add new delay code as an official build option.

Windows 10 sometimes (approximately once in 300 tries) hung
during OS startup. The logs say it happened while the PostgreSQL
service was starting. When the OS stopped, some postgres auxiliary
processes had started and some had not started yet.

The Windows dump says that some threads of the postgres auxiliary process
are waiting on OS-level locks and that the logon processes' threads are
also waiting on a lock. The MS help desk said that PostgreSQL's OS-level
deadlock caused the OS freeze. I think that is a strange story. But,
in fact, the hang did not happen in repeated tests once I removed
PostgreSQL from the services that start automatically at boot.

I tweaked PostgreSQL 9.3.x (the newest from the repository) to add
a delay of 0.5 or 3.0 seconds after each subprocess starts,
and the hang went away. This test patch is attached.
It is implemented only for Windows. Also, I did not use the existing
pg_usleep because it contains locking code (e.g. WaitForSingleObject
and Enter/LeaveCriticalSection).
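
For readers without the attachment, the idea can be sketched roughly as
follows. This is not the attached patch: the macro name
SUBPROCESS_START_DELAY_MS and both function names are invented for this
illustration, and using Sleep() on Windows is an assumption, since this
message does not say which call the patch uses instead of pg_usleep. The
non-Windows branch is there only so the sketch compiles elsewhere.

```c
/* Expose nanosleep() on POSIX systems when compiling in strict ISO C mode. */
#if !defined(_WIN32) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 199309L
#endif

#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/*
 * Hypothetical build-time switch (name invented for this sketch).
 * Build with e.g. -DSUBPROCESS_START_DELAY_MS=500 or =3000 to enable it;
 * left undefined, the delay is disabled.
 */
#ifndef SUBPROCESS_START_DELAY_MS
#define SUBPROCESS_START_DELAY_MS 0
#endif

/* Sleep for ms milliseconds. Returns 0 on success. */
int
sleep_ms(int ms)
{
#ifdef _WIN32
	Sleep((DWORD) ms);		/* assumption: plain Win32 Sleep() */
	return 0;
#else
	struct timespec ts;

	ts.tv_sec = ms / 1000;
	ts.tv_nsec = (long) (ms % 1000) * 1000000L;
	return nanosleep(&ts, NULL);
#endif
}

/*
 * Hypothetical hook to call right after a subprocess has been started.
 * Returns the delay that was applied, in milliseconds (0 when disabled).
 */
int
delay_after_subprocess_start(void)
{
#if SUBPROCESS_START_DELAY_MS > 0
	(void) sleep_ms(SUBPROCESS_START_DELAY_MS);
#endif
	return SUBPROCESS_START_DELAY_MS;
}
```

In a default build the hook does nothing, so the startup path is unchanged
unless the macro is set at compile time.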

Although the Windows OS may have some problems of its own, I think we should
have a means of avoiding this. Could PostgreSQL accept such delay code
as a build-time option controlled by preprocessor variables?


Thanks,
Takatsuka Haruka


-- 
Yugo Nagata <nagata@sraoss.co.jp>

Attachment
