Thread: Re: BUG #16927: Postgres can`t access WAL files

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Wed, Mar 17, 2021 at 10:34:05AM +0200, Ярослав Пашинский wrote:
> I am able to run compiled binaries + patch on my servers, that's not a
> problem. Or I could also compile them myself if you could tell me briefly
> how to do that, because it's a really useful skill :)

There is some documentation to do that with Visual Studio:
https://www.postgresql.org/docs/devel/install-windows.html
In my case, I just use a command prompt to launch those commands and
do the work.  I can send you links to download custom builds, of
course.  My guess is that these should be able to work on your host,
as Windows is good in terms of backward-compatibility.

> I attached 2 files: files_list.txt, the content of the pg_wal directory,
> and file_option.png, the properties of one WAL file that postgres can't
> access. The strange thing is that even as domain admin or local admin I
> can't see the permission properties for this file.

Thanks.  The .deleted files come from RemoveXlogFile() where a file
gets removed.  This means that a rename before doing an unlink()
fails.  What we are looking for here is what is holding those files
back.
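The rename-then-unlink sequence described above can be sketched roughly as follows. This is only an illustration in Python, not the PostgreSQL source: the helper name `remove_segment` and its return values are assumptions made for the example, and only the `.deleted` suffix is taken from this thread.

```python
import os
import tempfile


def remove_segment(path):
    """Rename `path` to `path + ".deleted"`, then unlink the renamed file.

    Hypothetical helper, for illustration only. Returns the step that
    failed ("rename" or "unlink"), or None when both steps succeeded.
    """
    deleted_path = path + ".deleted"
    try:
        os.rename(path, deleted_path)
    except OSError as exc:
        print(f'could not rename file "{path}": {exc.strerror}')
        return "rename"
    try:
        os.unlink(deleted_path)
    except OSError as exc:
        # Reached only if the rename succeeded but the unlink did not.
        print(f'could not remove file "{deleted_path}": {exc.strerror}')
        return "unlink"
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wal_dir:
        segment = os.path.join(wal_dir, "00000001000000650000008F")
        open(segment, "wb").close()
        print(remove_segment(segment))  # idle file: both steps should succeed
        print(remove_segment(segment))  # file is gone, so the rename step fails
```

Reporting which step failed makes it easier to tell a file that could not be renamed apart from one that was renamed but is still held by another process.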
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
Okay, I'll try to build from source today following the instructions that you sent me. Another question: how do I apply the patch, or will you send me links to a build with the patch applied?
By the way, unfortunately, yesterday another 3 postgres instances started "complaining" in their logs about access to WAL files, so that's why I'm interested in fixing this ASAP :)

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Thu, Mar 18, 2021 at 10:21:53AM +0200, Ярослав Пашинский wrote:
> Okay, I'll try to build from source today following the instructions that
> you sent me. Another question: how do I apply the patch, or will you send
> me links to a build with the patch applied?

The "patch" command would be enough.  Please note that I have
generated some builds of 13.2 unpatched and 13.2 patched that you
could directly reuse, so that may make your life easier.  I'll send
you the links in a couple of minutes in a separate email.

> By the way, unfortunately, yesterday another 3 postgres instances started
> "complaining" in their logs about access to WAL files, so that's why I'm
> interested in fixing this ASAP :)

:(
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
The strange thing is that one server works fine on unpatched binaries while the second one requires a patched version to get rid of the pg_wal access error.
UPD: just now on the developer server I got an error: "2021-03-18 14:11:39.444 EET [892] LOG:  could not rename file "pg_wal/00000001000009130000006F": Permission denied"
I will switch to patched binaries and tell you later.
P.S.: Sorry that I didn't include pgsql-bugs in my reply; is it OK now?

On Thu, Mar 18, 2021 at 13:15, Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Mar 18, 2021 at 12:44:29PM +0200, Ярослав Пашинский wrote:
> So, I started a test on the Windows server that we use for a replica, on
> an instance which I copied from master. The binaries were the unpatched
> ones that you sent me. The system is: Windows Server 2016, OS build 14393.4283.
> To emulate load I used pgbench with these parameters: -t 10000 -c 50 -j 20.
> After a couple of test runs I found almost the same errors in the log file:
> "2021-03-18 11:27:14.322 EET [3748] LOG:  could not rename temporary
> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
> Permission denied
> 2021-03-18 11:27:18.928 EET [692] LOG:  using stale statistics instead of
> current ones because stats collector is not responding"
> ...and
> "2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file
> "pg_wal/00000001000000650000008F": Permission denied"
> So I decided to switch to patched binaries and sometimes get only this one
> error:
> "2021-03-18 12:27:14.571 EET [4840] LOG:  could not rename temporary
> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
> Permission denied
> 2021-03-18 12:27:19.178 EET [7556] LOG:  using stale statistics instead of
> current ones because stats collector is not responding"
> Which is not very critical, so it's OK.

Okay, so that looks like very good news to me.  With the patched
binaries you are not seeing the renaming problem with the WAL files
anymore.

> On the other hand, on the developer server (Windows Server 2016 (version
> 1607, OS build 14393.4225)) with real load and unpatched binaries I have so
> far got no errors about pg_wal and have seen this error only twice:
> "2021-03-18 12:07:26.153 EET [2956] LOG:  could not rename temporary
> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
> Permission denied"
>
> So, I keep testing. That's strange for now. I am thinking about changing
> the binaries on the prod server, but that will be possible on Saturday.

Yes, I think that it would be good to do more tests, as it may be
possible that what you are seeing does not repeat.  What you are
reporting is encouraging though.  Thanks!

By the way, it is very important to report that to the community
mailing lists.  Could you add pgsql-bugs when replying please?
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Thu, Mar 18, 2021 at 02:16:21PM +0200, Ярослав Пашинский wrote:
> чт, 18 мар. 2021 г. в 13:15, Michael Paquier <michael@paquier.xyz>:
>> On Thu, Mar 18, 2021 at 12:44:29PM +0200, Ярослав Пашинский wrote:
>>> So, I started a test on the Windows server that we use for a replica, on
>>> an instance which I copied from master. The binaries were the unpatched
>>> ones that you sent me. The system is: Windows Server 2016, OS build 14393.4283.
>>> To emulate load I used pgbench with these parameters: -t 10000 -c 50 -j 20.
>>> After a couple of test runs I found almost the same errors in the log file:
>>> "2021-03-18 11:27:14.322 EET [3748] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 11:27:18.928 EET [692] LOG:  using stale statistics instead of
>>> current ones because stats collector is not responding"
>>> ...and
>>> "2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file
>>> "pg_wal/00000001000000650000008F": Permission denied"
>>> So I decided to switch to patched binaries and sometimes get only this
>>> one error:
>>> "2021-03-18 12:27:14.571 EET [4840] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 12:27:19.178 EET [7556] LOG:  using stale statistics instead
>>> of current ones because stats collector is not responding"
>>> Which is not very critical, so it's OK.
>>
>> Okay, so that looks like very good news to me.  With the patched
>> binaries you are not seeing the renaming problem with the WAL files
>> anymore.
>>
>>> On the other hand, on the developer server (Windows Server 2016 (version
>>> 1607, OS build 14393.4225)) with real load and unpatched binaries I have
>>> so far got no errors about pg_wal and have seen this error only twice:
>>> "2021-03-18 12:07:26.153 EET [2956] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied"
>>>
>>> So, I keep testing. That's strange for now. I am thinking about changing
>>> the binaries on the prod server, but that will be possible on Saturday.
>>
>> Yes, I think that it would be good to do more tests, as it may be
>> possible that what you are seeing does not repeat.  What you are
>> reporting is encouraging though.  Thanks!
>>
>> By the way, it is very important to report that to the community
>> mailing lists.  Could you add pgsql-bugs when replying please?
>
> The strange thing is that one server works fine on unpatched binaries while
> the second one requires a patched version to get rid of the pg_wal access
> error.
>
> UPD: just now on the developer server I got an error: "2021-03-18
> 14:11:39.444 EET [892] LOG:  could not rename file
> "pg_wal/00000001000009130000006F": Permission denied"
> I will switch to patched binaries and tell you later.

The issue seems to depend on timing and the load your cluster is
facing, so that is not surprising to hear that this does not show up
100% of the time.  I am actually glad to hear that you have not seen
the issue anymore with the patched builds, while the unpatched builds
have shown the problem at least once.  It would be a problem if the
patched builds begin to complain about the renaming of the WAL
segments though as we would have to consider a different theory.
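Since the failures are intermittent, one way to compare patched and unpatched builds is to tally the rename errors per hour from the server log. The following is a minimal sketch, assuming each log entry sits on one line and uses the layout seen in the excerpts in this thread; the function name `tally_rename_failures` and the "wal"/"other" labels are made up for the example.

```python
import re
from collections import Counter

# Assumes the log line layout seen in the excerpts in this thread:
# <date> <time> <tz> [<pid>] LOG:  could not rename ... file "<path>" ...
RENAME_FAILURE = re.compile(
    r'^(?P<hour>\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}\.\d+ \S+ \[\d+\] LOG:\s+'
    r'could not rename (?:temporary statistics )?file "(?P<path>[^"]+)"'
)


def tally_rename_failures(lines):
    """Count rename failures per (hour, kind); kind is "wal" or "other"."""
    counts = Counter()
    for line in lines:
        match = RENAME_FAILURE.match(line)
        if match:
            kind = "wal" if match.group("path").startswith("pg_wal/") else "other"
            counts[(match.group("hour"), kind)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '2021-03-18 11:27:14.322 EET [3748] LOG:  could not rename temporary '
        'statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": '
        'Permission denied',
        '2021-03-18 11:27:18.928 EET [692] LOG:  using stale statistics instead '
        'of current ones because stats collector is not responding',
        '2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file '
        '"pg_wal/00000001000000650000008F": Permission denied',
    ]
    for key, count in sorted(tally_rename_failures(sample).items()):
        print(key, count)
```

Keeping the WAL failures separate from the statistics-file failures matters here, because the two come from different code paths.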

> P.S.: Sorry that I didn't include pgsql-bugs in my reply; is it OK now?

That's fine.  Thanks :)

I have added to this email the last things we discussed, for
transparency.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Andres Freund
Date:
Hi,

On 2021-03-18 14:16:21 +0200, Ярослав Пашинский wrote:
> The strange thing is that one server works fine on unpatched binaries while
> the second one requires a patched version to get rid of the pg_wal access
> error.
> UPD: just now on the developer server I got an error: "2021-03-18
> 14:11:39.444 EET [892] LOG:  could not rename file
> "pg_wal/00000001000009130000006F": Permission denied"
> I will switch to patched binaries and tell you later.

Could you use
https://docs.microsoft.com/en-us/sysinternals/downloads/findlinks on one
of the files that can't be renamed? Or even better, on all the WAL
files?
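For readers without FindLinks at hand, the number of hard links per file can also be read through the standard library. This is a rough sketch, not a replacement for the Sysinternals tool: the function name `hard_link_counts` is made up for the example, it only reports the link count (not the other names of the file), and it assumes the count returned by `os.lstat()` is meaningful on the platform in question.

```python
import os
import stat
import tempfile


def hard_link_counts(directory):
    """Return {file name: hard-link count} for regular files in `directory`.

    A value of None means the file could not be examined (for example
    because access was denied).
    """
    counts = {}
    for name in sorted(os.listdir(directory)):
        try:
            info = os.lstat(os.path.join(directory, name))
        except OSError:
            counts[name] = None
            continue
        if stat.S_ISREG(info.st_mode):
            counts[name] = info.st_nlink
    return counts


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wal_dir:
        first = os.path.join(wal_dir, "segment_a")
        open(first, "wb").close()
        os.link(first, os.path.join(wal_dir, "segment_b"))  # second name, same file
        print(hard_link_counts(wal_dir))
```

A count above 1 for a WAL segment would indicate that a second name for the same file still exists.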

Greetings,

Andres Freund



Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
Yes, at this moment I have 4 clusters on the developer server and 1 cluster that I'm testing on my own, and there has been no error related to access to WAL files for about the last 17-20 hours. I'll run my prod clusters and will also tell you. If I don't send you any new message, the problem is gone on the prod server as well. Thanks in advance!
P.S.: the error "2021-03-18 12:22:26.096 EET [4840] LOG:  could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied" is still there; as far as I know it's not very
P.P.S.: will this patch be available for everyone to download?

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Fri, Mar 19, 2021 at 10:08:36AM +0200, Ярослав Пашинский wrote:
> Yes, at this moment I have 4 clusters on the developer server and 1 cluster
> that I'm testing on my own, and there has been no error related to access
> to WAL files for about the last 17-20 hours. I'll run my prod clusters and
> will also tell you. If I don't send you any new message, the problem is
> gone on the prod server as well. Thanks in advance!

Cool.  So, it really looks like we have found the issue based on what
you are saying here, and that we had better consider as a first step a
revert of aaaef7a on HEAD and REL_13_STABLE.

So, what do others think?  Would people agree to revert aaaef7a for
now?

> P.S.: the error "2021-03-18 12:22:26.096 EET [4840] LOG:  could not rename
> temporary statistics file "pg_stat_tmp/global.tmp" to
> "pg_stat_tmp/global.stat": Permission denied" is still there; as far as I
> know it's not very

This one is in a different code path.

> P.P.S.: will this patch be available for everyone to download?

Well, if a different committer or myself is able to get a patch
committed, it will be available to everyone once 13.3 gets released.
This would happen in May based on the existing roadmap:
https://www.postgresql.org/developer/roadmap/

How much did you test the unpatched builds by the way?
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
Hello, sure. I just found a recent log entry for one WAL file, and here is the output of the FindLinks program:
".\FindLinks64.exe "D:\DB_update\PSQL_4\pg_wal\000000010000429E00000077"

Findlinks v1.1 - Locate file hard links
Copyright (C) 2011-2016 Mark Russinovich
Sysinternals - www.sysinternals.com

Error opening d:\db_update\psql_4\pg_wal\000000010000429e00000077:
Access is denied. "
P.S.: I ran the program with admin rights.

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
"and that we had better consider as a first step a
revert of aaaef7a on HEAD and REL_13_STABLE.

So, what do others think?  Would people agree to revert aaaef7a for
now?"
I didn't quite understand you here: does "aaaef7a" stand for a patch name?

" How much did you test the unpatched builds by the way?  "
On the cluster where load was generated by pgbench I hit the WAL file access error after about 1 hour, but on the developer server with real load (usually up to 100 connections) the error occurred after 2.5-3 hours.

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Fri, Mar 19, 2021 at 11:04:14AM +0200, Ярослав Пашинский wrote:
> I didn't quite understand you here: does "aaaef7a" stand for a patch name?

There was a typo in one of my previous messages.  What I was referring
to is aaa3aedd.  That's a commit in the Postgres code tree; if you are
not familiar with git, here is a link to the code change:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=aaa3aedd

>> " How much did you test the unpatched builds by the way?  "
> On the cluster where load was generated by pgbench I hit the WAL file
> access error after about 1 hour, but on the developer server with real
> load (usually up to 100 connections) the error occurred after 2.5-3 hours.

OK, thanks.  My environments are not that sensitive to the issue,
unfortunately.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
I don't know what the appearance of this issue depends on (honestly, I tried to find out before mailing pgsql-bugs). For example, yesterday the log file was full of these messages, but today I had to wait for the error to come back in order to check the WAL file links with the FindLinks program. At least I'm confident that this issue can break replication and that it is not limited to one machine. Anyway, I'm happy that your patch seems to be the key to solving this problem.

Re: BUG #16927: Postgres can`t access WAL files

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> There was a typo in one of my previous messages.  What I was referring
> to is aaa3aedd.

Ah, I was just about to ask what the heck aaaef7a referred to.

Given the evidence that there's a problem, I agree with reverting
that.  I'd suggest keeping the cosmetic rename of the function,
but we have to put back the Windows-doesn't-HAVE_WORKING_LINK logic.

Grepping in the v12 branch, I find a second use of HAVE_WORKING_LINK
in contrib/pg_standby.  But that seems to be in a non-WIN32 code path,
so I don't think putting that back is necessary.

            regards, tom lane



Re: BUG #16927: Postgres can`t access WAL files

From
Magnus Hagander
Date:
On Fri, Mar 19, 2021 at 4:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Michael Paquier <michael@paquier.xyz> writes:
> > There was a typo in one of my previous messages.  What I was referring
> > to is aaa3aedd.
>
> Ah, I was just about to ask what the heck aaaef7a referred to.
>
> Given the evidence that there's a problem, I agree with reverting
> that.  I'd suggest keeping the cosmetic rename of the function,
> but we have to put back the Windows-doesn't-HAVE_WORKING_LINK logic.

+1. I think the indications are definitely clear enough that this has
to go back in.


> Grepping in the v12 branch, I find a second use of HAVE_WORKING_LINK
> in contrib/pg_standby.  But that seems to be in a non-WIN32 code path,
> so I don't think putting that back is necessary.

.. and apart from that I *really* doubt that one has many users,
especially on Windows :)

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/



Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Fri, Mar 19, 2021 at 04:19:50PM +0100, Magnus Hagander wrote:
> On Fri, Mar 19, 2021 at 4:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Given the evidence that there's a problem, I agree with reverting
>> that.  I'd suggest keeping the cosmetic rename of the function,
>> but we have to put back the Windows-doesn't-HAVE_WORKING_LINK logic.
>
> +1. I think the indications are definitely clear enough that this has
> to go back in.

No problem from me to keep the rename, and so this leads to the simple
patch attached, then.  Any comments?

>> Grepping in the v12 branch, I find a second use of HAVE_WORKING_LINK
>> in contrib/pg_standby.  But that seems to be in a non-WIN32 code path,
>> so I don't think putting that back is necessary.
>
> .. and apart from that I *really* doubt that one has many users,
> especially on Windows :)

Yeah, agreed.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Andres Freund
Date:
Hi,

On 2021-03-20 07:32:43 +0900, Michael Paquier wrote:
> No problem from me to keep the rename, and so this leads to the simple
> patch attached, then.  Any comments?

- I think there needs to be a reference to the problem, otherwise we'll
  just redo this a couple years down the line
- I'd not add the XXX, because the whole idea of durable_rename_excl
  seems wrong to me, and it might motivate people to come up with
  patches...

Greetings,

Andres Freund



Re: BUG #16927: Postgres can`t access WAL files

From
Tom Lane
Date:
Michael Paquier <michael@paquier.xyz> writes:
> No problem from me to keep the rename, and so this leads to the simple
> patch attached, then.  Any comments?

LGTM.

            regards, tom lane



Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Fri, Mar 19, 2021 at 04:56:45PM -0700, Andres Freund wrote:
> - I think there needs to be a reference to the problem, otherwise we'll
>   just redo this a couple years down the line
> - I'd not add the XXX, because the whole idea of durable_rename_excl
>   seems wrong to me, and it might motivate people to come up with
>   patches...

Fine by me to remove that :)

What about replacing the XXX comment by a small note, say:
"On Windows, using a hard link followed by an unlink() causes
concurrency issues with code paths interacting with those files, a
rename does not cause that."
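The two strategies contrasted in this note can be sketched as follows. This is a simplified illustration in Python, not the committed patch: `rename_excl` and its `use_hard_link` flag are hypothetical names, and a real implementation would also have to make the change durable on disk.

```python
import os
import tempfile


def rename_excl(oldfile, newfile, use_hard_link):
    """Move `oldfile` to `newfile` using one of the two strategies."""
    if use_hard_link:
        # link() fails with FileExistsError if newfile already exists,
        # which is what makes this variant "exclusive". Between the two
        # calls the file is reachable under both names.
        os.link(oldfile, newfile)
        os.unlink(oldfile)
    else:
        # Single step: the file never has two names at once.
        os.rename(oldfile, newfile)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as wal_dir:
        old = os.path.join(wal_dir, "xlogtemp.1234")
        new = os.path.join(wal_dir, "00000001000000650000008F")
        open(old, "wb").close()
        # Assumption for the demo: avoid the hard-link variant on Windows.
        rename_excl(old, new, use_hard_link=(os.name != "nt"))
        print(os.listdir(wal_dir))
```

The hard-link variant refuses to overwrite an existing target but briefly leaves two names for the same file; the plain rename avoids that window at the cost of the exclusivity check.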

If you have a better idea, which I am sure you do, please feel free to
send suggestions.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Fri, Mar 19, 2021 at 10:38:51AM +0200, Ярослав Пашинский wrote:
> Error opening d:\db_update\psql_4\pg_wal\000000010000429e00000077:
> Access is denied. "
>  P.S.: I ran the program with admin rights.

Hmm.  I recall that EACCES would happen on files marked as pending for
deletion.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Sat, Mar 20, 2021 at 09:31:07AM +0900, Michael Paquier wrote:
> Fine by me to remove that :)

Applied this stuff as of 909b449 for 13.3~ to address the bug, but I
would not mind tweaking the area and its comments further if people have
more ideas.  Yaroslav, things should be good with the next minor
release of Postgres.
--
Michael

Re: BUG #16927: Postgres can`t access WAL files

From
Ярослав Пашинский
Date:
The error is completely gone and replication works fine. Your patch definitely works: my 8 prod & dev clusters + 4 replication slots work fine on 13.2 now. Only one error still occurs sometimes (as I showed before): "2021-03-23 11:44:52.932 EET [9272] LOG:  could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied"
I hope it will be fixed in one of the next releases.
Best wishes, Yaroslav!

Re: BUG #16927: Postgres can`t access WAL files

From
Michael Paquier
Date:
On Tue, Mar 23, 2021 at 11:49:32AM +0200, Ярослав Пашинский wrote:
> The error has completely gone and replication works fine.

Glad to hear that.  Until 13.3 is out, you could use the binaries I
have provided, but these have been stripped of most of their build
options to make them portable for the tests: no OpenSSL, no ICU, and
more things missing.

> Your patch definitely works: my 8 prod & dev clusters + 4
> replication slots work fine on 13.2 now. Only one error still
> occurs sometimes (as I showed before): "2021-03-23
> 11:44:52.932 EET [9272] LOG:  could not rename temporary
> statistics file "pg_stat_tmp/global.tmp" to
> "pg_stat_tmp/global.stat": Permission denied"
> I hope it will be fixed in one of the next releases.

This one is a separate issue, and I don't think I'll be able to look
at that in more detail until the commit fest finishes (development of
14 finishes in two weeks so the activity is high).  Is this something
new to 13 or did you see that in past versions as well on your servers?
--
Michael
