Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving. - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.
Msg-id 4AA8CA7B.4020608@enterprisedb.com
In response to Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: BUG #5038: WAL file is pending deletion in pg_xlog folder, this interferes with WAL archiving.  (Luke Koops <luke.koops@entrust.com>)
List pgsql-bugs
Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
>>> No, it's a backend that's holding the file open, with FILE_SHARE_DELETE.
>> If that's the only case we care about covering, then rename might be
>> enough.  I was just wondering what it would take to solve the more
>> general problem of something holding it open with the wrong flags
>> at the time we want to get rid of it.
>
> Yes, that's a separate problem, and I think we should address that too.
> That's what I thought was going on in OP's case at first, the patch I
> posted in my first reply should address that.
>
> I'll try to reproduce that case too, and verify that the patch fixes it.

OK, I've committed a patch along those lines. On Windows, the file is now
renamed before unlinking, and the return codes of rename() and unlink()
are checked, so that we don't delete the .done file if the WAL file
deletion failed. This fixes both scenarios: the one the OP reported, with
another backend keeping the file open, and the one where a different
process keeps the file open without FILE_SHARE_DELETE.
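
For illustration only, here is a minimal sketch of the order of operations
described above: rename first, then delete, and touch the .done file only
if both steps succeeded. This is not the committed code. The function name
remove_old_segment(), the ".deleted" suffix and the fixed buffer size are
assumptions made up for the example, and standard C remove() stands in for
unlink() so that the sketch compiles without platform headers.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not the committed PostgreSQL code).
 *
 * Renames the segment to "<seg_path>.deleted", deletes the renamed file,
 * and removes the ".done" status file only if both steps succeeded.
 * Returns 0 on success, -1 on the first failure.
 */
int
remove_old_segment(const char *seg_path, const char *done_path)
{
    char        tmp_path[1024];
    int         len;

    len = snprintf(tmp_path, sizeof(tmp_path), "%s.deleted", seg_path);
    if (len < 0 || (size_t) len >= sizeof(tmp_path))
    {
        fprintf(stderr, "path too long: \"%s\"\n", seg_path);
        return -1;
    }

    /* Step 1: move the segment to a temporary name first. */
    if (rename(seg_path, tmp_path) != 0)
    {
        fprintf(stderr, "could not rename \"%s\": %s\n",
                seg_path, strerror(errno));
        return -1;              /* the .done file is left alone */
    }

    /* Step 2: delete the renamed file; remove() stands in for unlink(). */
    if (remove(tmp_path) != 0)
    {
        fprintf(stderr, "could not remove \"%s\": %s\n",
                tmp_path, strerror(errno));
        return -1;              /* the .done file is left alone */
    }

    /* Step 3: only now drop the archive status file. */
    if (remove(done_path) != 0 && errno != ENOENT)
    {
        fprintf(stderr, "could not remove \"%s\": %s\n",
                done_path, strerror(errno));
        return -1;
    }

    return 0;
}
```

The point of the early returns is that a failed rename or delete leaves
the .done file in place, so the segment is not forgotten and the cleanup
can be retried later.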

I considered making a failure to rename or delete a WARNING instead of an
ERROR, so that RemoveOldXLogFiles() would still clean up any other old
WAL files. However, when a file is recycled, we throw an error anyway if
the rename fails in InstallXLogFileSegment(), so it doesn't seem like it
would buy us much.

BTW, it seems that errno is not set on Windows when rename fails, but we
still try to print the OS error message in InstallXLogFileSegment().
When I tested the case where another process is keeping the file locked,
for example, I got this:

ERROR:  could not rename file "pg_xlog/000000010000000100000073" to
"pg_xlog/000000010000000100000092" (initialization of log file 1,
segment 146): No such file or directory

even though the file clearly exists; it's just locked. I'm not sure
where errno is coming from in that case, or whether we should do
something about it, but that exceeds my appetite for fixing Windows
issues right now.
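
To make the suspected failure mode concrete: if a rename implementation
fails without setting errno, the caller prints whatever value an earlier
call left behind. The sketch below is not a proposed fix for
InstallXLogFileSegment(). It is a small illustration in plain C, with the
hypothetical helpers rename_reporting_errno() and describe_rename_error(),
of clearing errno before the call so that a stale value cannot be mistaken
for the cause.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical wrapper: clears errno before calling rename(), so the
 * value handed back can only come from this call. On failure,
 * *reported_errno is 0 if the library left errno untouched.
 * Returns 0 on success, -1 on failure.
 */
int
rename_reporting_errno(const char *from, const char *to, int *reported_errno)
{
    errno = 0;                  /* discard any stale value */
    if (rename(from, to) == 0)
    {
        *reported_errno = 0;
        return 0;
    }
    *reported_errno = errno;    /* may still be 0 if errno was not set */
    return -1;
}

/*
 * Hypothetical helper: picks a message that does not pretend to know
 * the cause when errno was never set.
 */
const char *
describe_rename_error(int reported_errno)
{
    if (reported_errno == 0)
        return "unknown error (errno was not set)";
    return strerror(reported_errno);
}
```

With a wrapper like this, a locked file would be reported as an unknown
error rather than as "No such file or directory", assuming the underlying
call really does leave errno untouched in that case.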

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
