RE: Intermittent pg_ctl failures on Windows - Mailing list pgsql-hackers

From Badrul Chowdhury
Subject RE: Intermittent pg_ctl failures on Windows
Msg-id DM5PR2101MB0984D3DB39BCCCF189ECD15ED1D30@DM5PR2101MB0984.namprd21.prod.outlook.com
In response to Intermittent pg_ctl failures on Windows  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Intermittent pg_ctl failures on Windows  (r.zharkov@postgrespro.ru)
List pgsql-hackers
Hi Tom,

This is a great catch. I am looking into it: I will start by reproducing the error as you suggested.

Thanks,
Badrul

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
Sent: Saturday, March 10, 2018 2:48 PM
To: pgsql-hackers@lists.postgresql.org
Subject: Intermittent pg_ctl failures on Windows

The buildfarm's Windows members occasionally show weird pg_ctl failures, for instance this recent case:


https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2018-03-10%2020%3A30%3A20

### Restarting node "master"
# Running: pg_ctl -D G:/prog/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/t_006_logical_decoding_master_data/pgdata -l G:/prog/bf/root/HEAD/pgsql.build/src/test/recovery/tmp_check/log/006_logical_decoding_master.log restart
waiting for server to shut down.... done
server stopped
waiting for server to start....The process cannot access the file because it is being used by another process.
 stopped waiting
pg_ctl: could not start server
Examine the log output.
Bail out!  system pg_ctl failed

or this one:


https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2017-12-29%2023%3A30%3A24

### Stopping node "subscriber" using mode fast
# Running: pg_ctl -D c:/prog/bf/root/HEAD/pgsql.build/src/test/subscription/tmp_check/t_001_rep_changes_subscriber_data/pgdata -m fast stop
waiting for server to shut down....pg_ctl: could not open PID file "c:/prog/bf/root/HEAD/pgsql.build/src/test/subscription/tmp_check/t_001_rep_changes_subscriber_data/pgdata/postmaster.pid": Permission denied
Bail out!  system pg_ctl failed

I'd been writing these off as Microsoft randomness and/or antivirus interference, but it suddenly occurred to me that
there might be a consistent explanation: since commit f13ea95f9, when pg_ctl is waiting for server start/stop, it is
trying to read postmaster.pid more-or-less concurrently with the postmaster writing to that file.  On Unix that's not
much of a problem, but I believe that on Windows you have to specifically open the file with sharing enabled, or you get
error messages like these.

The postmaster should be enabling sharing, because port.h redirects open/fopen to pgwin32_open/pgwin32_fopen which
enable the sharing flags.  But it only does that #ifndef FRONTEND.  So pg_ctl is just using naked open(), which could
explain these failures.
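
For illustration only, here is a minimal sketch of what "opening the file with sharing enabled" can look like on Windows. It is not the PostgreSQL implementation: the helper name open_shared_readonly is hypothetical, and the non-Windows branch simply falls back to a plain open() so the sketch compiles elsewhere. The share-mode flags passed to CreateFileA are the standard Win32 ones.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <io.h>
#else
#include <unistd.h>
#endif

/*
 * Hypothetical helper (not PostgreSQL code): open "path" read-only while
 * allowing other processes to read, write, or delete/rename the file at
 * the same time.  Returns a C file descriptor, or -1 on failure.
 */
int
open_shared_readonly(const char *path)
{
    assert(path != NULL);

#ifdef _WIN32
    /* Request all three share modes so a concurrent writer is not blocked. */
    HANDLE      h = CreateFileA(path,
                                GENERIC_READ,
                                FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                                NULL,
                                OPEN_EXISTING,
                                FILE_ATTRIBUTE_NORMAL,
                                NULL);
    int         fd;

    if (h == INVALID_HANDLE_VALUE)
        return -1;

    /* Wrap the Win32 handle in a CRT file descriptor. */
    fd = _open_osfhandle((intptr_t) h, _O_RDONLY);
    if (fd < 0)
        CloseHandle(h);
    return fd;
#else
    /* POSIX open() places no sharing restrictions on other processes. */
    return open(path, O_RDONLY);
#endif
}
```

A caller would use the returned descriptor with the usual read() and close() calls (_read() and _close() on Windows).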

If this theory is accurate, it should be pretty easy to replicate the problem if you modify the postmaster to hold
postmaster.pid open longer when rewriting it, e.g. stick fractional-second sleeps into CreateLockFile and
AddToDataDirLockFile.
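
As a rough, standalone illustration of that reproduction idea (not a patch to CreateLockFile or AddToDataDirLockFile), the sketch below rewrites a file and deliberately keeps it open for a fraction of a second before closing it, which widens the window in which a concurrent reader can collide with the writer. The names sleep_ms and rewrite_file_slowly and the hold_ms parameter are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L     /* for nanosleep() on POSIX systems */

#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
#include <time.h>
#endif

/* Sleep for roughly "ms" milliseconds. */
void
sleep_ms(long ms)
{
#ifdef _WIN32
    Sleep((DWORD) ms);
#else
    struct timespec ts;

    ts.tv_sec = ms / 1000;
    ts.tv_nsec = (ms % 1000) * 1000000L;
    nanosleep(&ts, NULL);
#endif
}

/*
 * Hypothetical stand-in for a lock-file rewrite: write "contents" to
 * "path", then hold the file open for hold_ms milliseconds before closing.
 * Returns 0 on success, -1 on failure.
 */
int
rewrite_file_slowly(const char *path, const char *contents, long hold_ms)
{
    FILE       *fp;

    assert(path != NULL && contents != NULL);

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fputs(contents, fp) == EOF || fflush(fp) != 0)
    {
        fclose(fp);
        return -1;
    }

    /* Artificial delay: the file stays open, as in the suggested experiment. */
    sleep_ms(hold_ms);

    return (fclose(fp) == 0) ? 0 : -1;
}
```

Running such a writer in a loop alongside a process that repeatedly opens the same file without sharing flags would be one way to check whether the errors above appear on Windows.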

I'm not in a position to investigate this in detail nor test a fix, but I think somebody should.

            regards, tom lane


