Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) - Mailing list pgsql-bugs
From | Tom Lane |
---|---|
Subject | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) |
Date | |
Msg-id | 32374.1576770248@sss.pgh.pa.us Whole thread Raw |
In response to | Re: BUG #16161: pg_ctl stop fails sometimes (on Windows) (Alexander Lakhin <exclusion@gmail.com>) |
Responses |
Re: BUG #16161: pg_ctl stop fails sometimes (on Windows)
(Alexander Lakhin <exclusion@gmail.com>)
|
List | pgsql-bugs |
Alexander Lakhin <exclusion@gmail.com> writes: > Please look at the patch that implements (2). It makes `vcregress > recoverycheck` pass (and my demo restart test still passes too). I think we could be a little smarter here. If the problem is STATUS_DELETE_PENDING, doesn't that affect stat() as well? That is, if stat() succeeds, we needn't wait, independently of whether it says S_ISDIR or not? This seems like a noticeable improvement if true, because it would mean that ordinary file-permission failures also need not wait. We'd still get confused if we get a permission failure on a containing directory rather than the file itself, but that I'm prepared to dismiss as an uncommon case. Entirely-untested patch along this line attached. BTW ... it's likely that stat() here is actually going to invoke pgwin32_safestat(), which has its own opinions about this, and indeed seems to think we'll get ERROR_DELETE_PENDING. I think that's harmless here, but it makes me wonder if we should use a native Windows API instead of going through stat(). > As open(dir) is getting a little more expensive than before, maybe it's > still worthwhile to patch fsync*(..., isdir,...). I can prepare such a > patch if so. I think we should leave that for later, so that the buildfarm can actually test whatever logic we put in here. regards, tom lane diff --git a/src/port/open.c b/src/port/open.c index 5165276..24af6f4 100644 --- a/src/port/open.c +++ b/src/port/open.c @@ -21,6 +21,7 @@ #include <fcntl.h> #include <assert.h> +#include <sys/stat.h> static int @@ -137,18 +138,33 @@ pgwin32_open(const char *fileName, int fileFlags,...) } /* - * ERROR_ACCESS_DENIED can be returned if the file is deleted but not - * yet gone (Windows NT status code is STATUS_DELETE_PENDING). Wait a - * bit and try again, giving up after 1 second (since this condition - * should never persist very long). + * ERROR_ACCESS_DENIED is returned if the file is deleted but not yet + * gone (Windows NT status code is STATUS_DELETE_PENDING). In that + * case we want to wait a bit and try again, giving up after 1 second + * (since this condition should never persist very long). However, + * there are other commonly-hit cases that return ERROR_ACCESS_DENIED, + * so care is needed. In particular that happens if we try to open a + * directory, or of course if there's an actual file-permissions + * problem. To distinguish these cases, try a stat(). In the + * delete-pending case, it will either also get STATUS_DELETE_PENDING, + * or it will see the file as gone and fail with ENOENT. In other + * cases it will usually succeed. The only somewhat-likely case where + * this coding will uselessly wait is if there's a permissions problem + * with a containing directory, which we hope will never happen in any + * performance-critical code paths. */ if (err == ERROR_ACCESS_DENIED) { if (loops < 10) { - pg_usleep(100000); - loops++; - continue; + struct stat st; + + if (stat(fileName, &st) != 0) + { + pg_usleep(100000); + loops++; + continue; + } } }
pgsql-bugs by date: