Re: BUG #16927: Postgres can`t access WAL files - Mailing list pgsql-bugs

From Michael Paquier
Subject Re: BUG #16927: Postgres can`t access WAL files
Msg-id YFPMl3TUPXbN127E@paquier.xyz
In response to Re: BUG #16927: Postgres can`t access WAL files  (Ярослав Пашинский <yarik97.6@gmail.com>)
Responses Re: BUG #16927: Postgres can`t access WAL files
List pgsql-bugs
On Thu, Mar 18, 2021 at 02:16:21PM +0200, Ярослав Пашинский wrote:
> On Thu, Mar 18, 2021 at 13:15, Michael Paquier <michael@paquier.xyz> wrote:
>> On Thu, Mar 18, 2021 at 12:44:29PM +0200, Ярослав Пашинский wrote:
>>> So, I started a test on the Windows server that we use for the replica,
>>> on an instance which I copied from the master. The binaries were the
>>> unpatched ones that you sent me here. The system is Windows Server 2016,
>>> OS build 14393.4283. To emulate load I used pgbench with these
>>> parameters: -t 10000 -c 50 -j 20. After a couple of test runs I found
>>> almost the same errors in the log file:
>>> "2021-03-18 11:27:14.322 EET [3748] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 11:27:18.928 EET [692] LOG:  using stale statistics instead of
>>> current ones because stats collector is not responding"
>>> ...and
>>> "2021-03-18 11:48:49.630 EET [6476] LOG:  could not rename file
>>> "pg_wal/00000001000000650000008F": Permission denied"
>>> So I decided to switch to the patched binaries, and now I only
>>> sometimes get this one error:
>>> "2021-03-18 12:27:14.571 EET [4840] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied
>>> 2021-03-18 12:27:19.178 EET [7556] LOG:  using stale statistics instead
>>> of current ones because stats collector is not responding"
>>> Which is not very critical, so it's OK.
>>
>> Okay, so it looks like very good news to me.  With the patched
>> binaries you are not seeing the renaming problem with the WAL files
>> anymore.
>>
>>> On the other hand, on the developer server (Windows Server 2016 (version
>>> 1607, OS build 14393.4225)) with real load and unpatched binaries, I have
>>> so far got no errors about pg_wal and got this error only twice:
>>> "2021-03-18 12:07:26.153 EET [2956] LOG:  could not rename temporary
>>> statistics file "pg_stat_tmp/global.tmp" to "pg_stat_tmp/global.stat":
>>> Permission denied"
>>>
>>> So, I will keep testing. It looks strange for now. I am thinking about
>>> changing the binaries on the prod server, but that will only be possible
>>> on Saturday.
>>
>> Yes, I think that it would be good to do more tests, as it may be
>> possible that what you are seeing does not repeat.  What you are
>> reporting is encouraging though.  Thanks!
>>
>> By the way, it is very important to report that to the community
>> mailing lists.  Could you add pgsql-bugs when replying please?
>
> The strange thing is that one server works fine on unpatched binaries while
> the second one requires a patched version to get rid of the pg_wal access
> error.
>
> Update: just now on the developer server I got an error: "2021-03-18
> 14:11:39.444 EET [892] LOG:  could not rename file
> "pg_wal/00000001000009130000006F": Permission denied"
> Will switch to patched binaries and tell you later.

The issue seems to depend on timing and the load your cluster is
facing, so it is not surprising to hear that this does not show up
100% of the time.  I am actually glad to hear that you have not seen
the issue anymore with the patched builds, while the unpatched builds
have shown the problem at least once.  It would be a problem if the
patched builds begin to complain about the renaming of the WAL
segments though as we would have to consider a different theory.
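
To compare patched and unpatched builds over a longer test window, the
rename failures can be tallied from the server log. The following is only
a minimal sketch and not part of any patch: the function name
count_rename_failures and the regular expression are illustrative
assumptions, modeled solely on the two log messages quoted above.

```python
import re
from collections import Counter

# Matches the two log shapes quoted in this thread:
#   LOG:  could not rename file "pg_wal/...": Permission denied
#   LOG:  could not rename temporary statistics file "a" to "b": Permission denied
RENAME_RE = re.compile(
    r'LOG:\s+could not rename (?:temporary statistics )?file "([^"]+)"'
)


def count_rename_failures(log_text):
    """Count rename failures per top-level directory (pg_wal, pg_stat_tmp, ...)."""
    # Log excerpts pasted into mail are often line-wrapped, so collapse all
    # whitespace before matching. This is fine for excerpts; a large log
    # would be better processed line by line.
    flat = " ".join(log_text.split())
    counts = Counter()
    for match in RENAME_RE.finditer(flat):
        # Group 1 is the source path of the rename; keep its first component.
        counts[match.group(1).split("/")[0]] += 1
    return counts
```

If the theory holds, a log from a patched build should show no entries
under pg_wal, while pg_stat_tmp entries may still appear.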

> P.S.: Sorry that I didn't include pgsql-bugs in my reply. Is it OK now?

That's fine.  Thanks :)

I have added to this email the last things we discussed, for
transparency.
--
Michael
