Re: BUG #16927: Postgres can`t access WAL files - Mailing list pgsql-bugs

From Ярослав Пашинский
Subject Re: BUG #16927: Postgres can`t access WAL files
Date
Msg-id CADLmTo+GoZ1tuyZxx5uprqSkwqqw3yajry36o-+=N=Jqe=0oZA@mail.gmail.com
In response to Re: BUG #16927: Postgres can`t access WAL files  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
Yes. At the moment I have 4 clusters on the developer server and 1 cluster that I'm testing on my own, and there have been no errors related to access to WAL files for roughly the last 17-20 hours. I will run my prod clusters as well and let you know. If I don't send you a new message, the problem is gone on the prod server too. Thanks in advance!
P.S.: the error "2021-03-18 12:22:26.096 EET [4840] LOG:  could not rename temporary statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": Permission denied" still occurs; as far as I know it's not very critical.
P.P.S.: Will this patch be available for everyone to download?

On Thu, 18 Mar 2021 at 23:56, Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Mar 18, 2021 at 02:16:21PM +0200, Ярослав Пашинский wrote:
> On Thu, 18 Mar 2021 at 13:15, Michael Paquier <michael@paquier.xyz> wrote:
>> On Thu, Mar 18, 2021 at 12:44:29PM +0200, Ярослав Пашинский wrote:
>>> So, I started a test on the Windows server that we use for the replica,
>>> on an instance that I copied from the master. I used the unpatched
>>> binaries that you sent me here. The system is: Windows Server 2016,
>>> OS build 14393.4283.
>>> To emulate load I used pgbench with the parameters -t 10000 -c 50 -j 20.
>>> After a couple of test runs I found almost the same errors in the log file:
>>> "2021-03-18 11:27:14.322 EET [3748] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 11:27:18.928 EET [692] LOG:  using stale statistics instead of
>>> current ones because stats collector is not responding"
>>> ...and
>>> "2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file
>>> "pg_wal/00000001000000650000008F": Permission denied"
>>> So I decided to switch to the patched binaries, and now I only sometimes
>>> get this one error:
>>> "2021-03-18 12:27:14.571 EET [4840] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 12:27:19.178 EET [7556] LOG:  using stale statistics instead
>>> of current ones because stats collector is not responding"
>>> Which is not very critical, so it's ok.
>>
>> Okay, so that looks like very good news to me.  With the patched
>> binaries you are not seeing the renaming problem with the WAL files
>> anymore.
>>
>>> On the other hand, on the developer server (Windows Server 2016 (version
>>> 1607, OS build 14393.4225)) with real load and unpatched binaries, I have
>>> so far got no errors about pg_wal and only got this error twice:
>>> "2021-03-18 12:07:26.153 EET [2956] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied"
>>>
>>> So, I will keep testing. That's strange for now. I am thinking about
>>> changing the binaries on the prod server, but that will only be possible
>>> on Saturday.
>>
>> Yes, I think that it would be good to do more tests, as it may be
>> possible that what you are seeing does not repeat.  What you are
>> reporting is encouraging though.  Thanks!
>>
>> By the way, it is very important to report that to the community
>> mailing lists.  Could you add pgsql-bugs when replying please?
>
> The strange thing is that one server works fine on the unpatched binaries
> while the second one requires the patched version to get rid of the pg_wal
> access error.
>
> UPD: just now on the developer server I got an error: "2021-03-18
> 14:11:39.444 EET [892] LOG:  could not rename file
> "pg_wal/00000001000009130000006F": Permission denied"
> I will switch to the patched binaries and tell you later.

The issue seems to depend on timing and the load your cluster is
facing, so it is not surprising to hear that this does not show up
100% of the time.  I am actually glad to hear that you have not seen
the issue anymore with the patched builds, while the unpatched builds
have shown the problem at least once.  It would be a problem if the
patched builds begin to complain about the renaming of the WAL
segments though as we would have to consider a different theory.
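
To compare how often each message shows up under the patched and the
unpatched builds, the log lines can be tallied. What follows is only a
minimal Python sketch, not part of the patch: the function name
count_rename_errors and the regular expression are illustrative
assumptions, and it assumes that each log message sits on a single line
in the server log (the sample lines are the ones quoted in this thread,
unwrapped).

```python
import re
from collections import Counter

# Matches the "could not rename ..." LOG lines quoted in this thread.
# Assumption: each message occupies one line in the server log.
RENAME_ERROR = re.compile(
    r'could not rename (?P<temp>temporary statistics )?file "(?P<path>[^"]+)"'
)


def count_rename_errors(lines):
    """Count rename failures, split into WAL segments and statistics files."""
    counts = Counter()
    for line in lines:
        match = RENAME_ERROR.search(line)
        if match is None:
            continue  # not a rename failure
        if match.group("path").startswith("pg_wal/"):
            counts["wal"] += 1
        elif match.group("temp"):
            counts["stats"] += 1
        else:
            counts["other"] += 1
    return counts


sample = [
    '2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file '
    '"pg_wal/00000001000000650000008F": Permission denied',
    '2021-03-18 12:27:14.571 EET [4840] LOG:  could not rename temporary '
    'statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat": '
    'Permission denied',
]
print(count_rename_errors(sample))
```

A non-zero "wal" count on a patched build would be the case that calls
for a different theory.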

> P.S.: Sorry that I didn't include pgsql-bugs in my reply. Is it ok now?

That's fine.  Thanks :)

I have added to this email the last things we discussed, for
transparency.
--
Michael
