Thread: Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

Re: [COMMITTERS] pgsql: On Windows, when a file is deleted and another process still has

From
Heikki Linnakangas
Date:
(moving to pgsql-hackers)

Tom Lane wrote:
> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>> A completely different approach would be to treat any failure on all
>> platforms as non-fatal. We shouldn't really cut the checkpoint short if
>> recycling a WAL file fails, whatever the reason. That seems like a more
>> robust approach than trying to guess which error codes are OK to ignore.
>
> I could live with that, as long as it gets logged.

Here's a patch implementing that, and changing pgrename() to check for
ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
does, instead of ERROR_ACCESS_DENIED.

I wonder if we should reduce the timeout in pgrename(). It's 30 s at the
moment, but apparently it hasn't been working correctly, failing
immediately instead if the file is locked. And no-one has complained
about that. But if we sleep in InstallXLogFileSegment(), we're holding
ControlFileLock, which can force other backends to wait, and that might
cause more harm than just failing outright. Something like 5 seconds
might be more appropriate, giving anti-virus and similar software some
time to give up the lock, but not too much to cause long delays. 5
seconds should be enough for anti-virus or backup software to process a
file under normal circumstances.

OTOH, pgwin32_open() uses 30 s, with the same potential for lockups, and
no-one has complained about that either. The bottom line is that if
another program keeps a file locked for any extended period of time,
you're going to have trouble one way or another.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***************
*** 2262,2273 **** XLogFileInit(uint32 log, uint32 seg,
                                  *use_existent, &max_advance,
                                  use_lock))
      {
!         /* No need for any more future segments... */
          unlink(tmppath);
      }

-     elog(DEBUG2, "done creating and filling new WAL file");
-
      /* Set flag to tell caller there was no existent file */
      *use_existent = false;

--- 2262,2275 ----
                                  *use_existent, &max_advance,
                                  use_lock))
      {
!         /*
!          * No need for any more future segments, or InstallXLogFileSegment()
!          * failed to rename the file into place. If the rename failed, opening
!          * the file below will fail.
!          */
          unlink(tmppath);
      }

      /* Set flag to tell caller there was no existent file */
      *use_existent = false;

***************
*** 2280,2285 **** XLogFileInit(uint32 log, uint32 seg,
--- 2282,2289 ----
             errmsg("could not open file \"%s\" (log file %u, segment %u): %m",
                    path, log, seg)));

+     elog(DEBUG2, "done creating and filling new WAL file");
+
      return fd;
  }

***************
*** 2409,2418 **** XLogFileCopy(uint32 log, uint32 seg,
   * place.  This should be TRUE except during bootstrap log creation.  The
   * caller must *not* hold the lock at call.
   *
!  * Returns TRUE if file installed, FALSE if not installed because of
!  * exceeding max_advance limit.  On Windows, we also return FALSE if we
!  * can't rename the file into place because someone's got it open.
!  * (Any other kind of failure causes ereport().)
   */
  static bool
  InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
--- 2413,2421 ----
   * place.  This should be TRUE except during bootstrap log creation.  The
   * caller must *not* hold the lock at call.
   *
!  * Returns TRUE if the file was installed successfully. FALSE indicates that
!  * max_advance limit was exceeded, or an error occurred while renaming the
!  * file into place.
   */
  static bool
  InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
***************
*** 2460,2490 **** InstallXLogFileSegment(uint32 *log, uint32 *seg, char *tmppath,
       */
  #if HAVE_WORKING_LINK
      if (link(tmppath, path) < 0)
!         ereport(ERROR,
                  (errcode_for_file_access(),
                   errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
                          tmppath, path, *log, *seg)));
      unlink(tmppath);
  #else
      if (rename(tmppath, path) < 0)
      {
! #ifdef WIN32
! #if !defined(__CYGWIN__)
!         if (GetLastError() == ERROR_ACCESS_DENIED)
! #else
!         if (errno == EACCES)
! #endif
!         {
!             if (use_lock)
!                 LWLockRelease(ControlFileLock);
!             return false;
!         }
! #endif   /* WIN32 */
!
!         ereport(ERROR,
                  (errcode_for_file_access(),
                   errmsg("could not rename file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
                          tmppath, path, *log, *seg)));
      }
  #endif

--- 2463,2488 ----
       */
  #if HAVE_WORKING_LINK
      if (link(tmppath, path) < 0)
!     {
!         if (use_lock)
!             LWLockRelease(ControlFileLock);
!         ereport(LOG,
                  (errcode_for_file_access(),
                   errmsg("could not link file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
                          tmppath, path, *log, *seg)));
+         return false;
+     }
      unlink(tmppath);
  #else
      if (rename(tmppath, path) < 0)
      {
!         if (use_lock)
!             LWLockRelease(ControlFileLock);
!         ereport(LOG,
                  (errcode_for_file_access(),
                   errmsg("could not rename file \"%s\" to \"%s\" (initialization of log file %u, segment %u): %m",
                          tmppath, path, *log, *seg)));
+         return false;
      }
  #endif

***************
*** 3128,3146 **** RemoveOldXlogFiles(uint32 log, uint32 seg, XLogRecPtr endptr)
                       */
                      snprintf(newpath, MAXPGPATH, "%s.deleted", path);
                      if (rename(path, newpath) != 0)
!                         ereport(ERROR,
                                  (errcode_for_file_access(),
!                                  errmsg("could not rename old transaction log file \"%s\"",
                                          path)));
                      rc = unlink(newpath);
  #else
                      rc = unlink(path);
  #endif
                      if (rc != 0)
!                         ereport(ERROR,
                                  (errcode_for_file_access(),
                                   errmsg("could not remove old transaction log file \"%s\": %m",
                                          path)));
                      CheckpointStats.ckpt_segs_removed++;
                  }

--- 3126,3150 ----
                       */
                      snprintf(newpath, MAXPGPATH, "%s.deleted", path);
                      if (rename(path, newpath) != 0)
!                     {
!                         ereport(LOG,
                                  (errcode_for_file_access(),
!                                  errmsg("could not rename old transaction log file \"%s\": %m",
                                          path)));
+                         continue;
+                     }
                      rc = unlink(newpath);
  #else
                      rc = unlink(path);
  #endif
                      if (rc != 0)
!                     {
!                         ereport(LOG,
                                  (errcode_for_file_access(),
                                   errmsg("could not remove old transaction log file \"%s\": %m",
                                          path)));
+                         continue;
+                     }
                      CheckpointStats.ckpt_segs_removed++;
                  }

*** a/src/port/dirmod.c
--- b/src/port/dirmod.c
***************
*** 129,139 **** pgrename(const char *from, const char *to)
  #endif
      {
  #if defined(WIN32) && !defined(__CYGWIN__)
!         if (GetLastError() != ERROR_ACCESS_DENIED)
  #else
          if (errno != EACCES)
  #endif
-             /* set errno? */
              return -1;
          if (++loops > 300)        /* time out after 30 sec */
              return -1;
--- 129,143 ----
  #endif
      {
  #if defined(WIN32) && !defined(__CYGWIN__)
!         DWORD        err = GetLastError();
!
!         _dosmaperr(err);
!
!         if (err != ERROR_SHARING_VIOLATION &&
!             err != ERROR_LOCK_VIOLATION)
  #else
          if (errno != EACCES)
  #endif
              return -1;
          if (++loops > 300)        /* time out after 30 sec */
              return -1;

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
> Here's a patch implementing that, and changing pgrename() to check for
> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
> does, instead of ERROR_ACCESS_DENIED.

This looks sane in a quick once-over, though I haven't tested it.
One tiny stylistic suggestion:
        if (err != ERROR_SHARING_VIOLATION &&            err != ERROR_LOCK_VIOLATION)#else        if (errno !=
EACCES)#endif           return -1;        if (++loops > 300)        /* time out after 30 sec */            return -1;
 

This is overly cute and will probably confuse both pgindent and ordinary
editors.  It's worth one extra line to keep each part of the #if
syntactically independent, ie
        if (err != ERROR_SHARING_VIOLATION &&            err != ERROR_LOCK_VIOLATION)            return -1;#else
if(errno != EACCES)            return -1;#endif        if (++loops > 300)        /* time out after 30 sec */
return-1;
 


> I wonder if we should reduce the timeout in pgrename(). It's 30 s at the
> moment, but apparently it hasn't been working correctly, failing
> immediately instead if the file is locked.

I have a vague recollection that there was a specific reason for having
such a long timeout --- you might want to check the archives to see the
discussion before that code got committed.  However, if nothing turns
up, I wouldn't object to reducing it to 5 or 10 sec.
        regards, tom lane


On Fri, Sep 11, 2009 at 10:44, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> (moving to pgsql-hackers)
>
> Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> A completely different approach would be to treat any failure on all
>>> platforms as non-fatal. We shouldn't really cut the checkpoint short if
>>> recycling a WAL file fails, whatever the reason. That seems like a more
>>> robust approach than trying to guess which error codes are OK to ignore.
>>
>> I could live with that, as long as it gets logged.
>
> Here's a patch implementing that, and changing pgrename() to check for
> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
> does, instead of ERROR_ACCESS_DENIED.

I have definitely seen AV programs return access deniderather than
sharing violation more than once for temporary errors. How about we
keep the access denied one as well? If we actually don't have
permissions in pg_xlog, we most likely never even got here...


-- Magnus HaganderMe: http://www.hagander.net/Work: http://www.redpill-linpro.com/


Magnus Hagander <magnus@hagander.net> writes:
> On Fri, Sep 11, 2009 at 10:44, Heikki Linnakangas
>> Here's a patch implementing that, and changing pgrename() to check for
>> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
>> does, instead of ERROR_ACCESS_DENIED.

> I have definitely seen AV programs return access deniderather than
> sharing violation more than once for temporary errors. How about we
> keep the access denied one as well?

+1 ... presumably the original coding was tested in *some* environment.
        regards, tom lane


Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
>> On Fri, Sep 11, 2009 at 10:44, Heikki Linnakangas
>>> Here's a patch implementing that, and changing pgrename() to check for
>>> ERROR_SHARING_VIOLATION and ERROR_LOCK_VIOLATION like pgwin32_open()
>>> does, instead of ERROR_ACCESS_DENIED.
> 
>> I have definitely seen AV programs return access deniderather than
>> sharing violation more than once for temporary errors. How about we
>> keep the access denied one as well?
> 
> +1 ... presumably the original coding was tested in *some* environment.

Ok, I've committed that. Per quick discussion with Magnus, I also
lowered the timeout to 10s.

Luke, although your immediate problem was solved by the previous patch
already, this touched the same pieces of code, so you might want to
fetch the latest sources and retest if you want to be sure. (I did test
it myself..)

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com