Re: logical replication: could not create file "state.tmp": File exists - Mailing list pgsql-bugs

From Grigory Smolkin
Subject Re: logical replication: could not create file "state.tmp": File exists
Date
Msg-id c051f18b-fc8f-50d4-53e5-83e750265417@postgrespro.ru
In response to Re: logical replication: could not create file "state.tmp": File exists  (Andres Freund <andres@anarazel.de>)
Responses Re: logical replication: could not create file "state.tmp": File exists
List pgsql-bugs
On 12/2/19 7:12 PM, Andres Freund wrote:
> Hi,
>
> On 2019-11-30 15:09:39 +0300, Grigory Smolkin wrote:
>> One of my colleagues encountered an out of space condition, which broke his
>> logical replication setup.
>> It`s manifested with the following errors:
>>
>> ERROR:  could not receive data from WAL stream: ERROR:  could not create
>> file "pg_replslot/some_sub/state.tmp": File exists
> Hm. What was the log output leading to this state? Some cases of this
> would end up in a PANIC, which'd remove the .tmp file during
> recovery. But there's some where we won't - it seems the right fix for
> this would be to unlink the tmp file in that case?
>
>
>> I`ve digged a bit into this problem, and it`s turned out that in
>> SaveSlotToPath() temp file for replication slot is opened with 'O_CREAT |
>> O_EXCL' flags, which makes this routine as not very reentrant.
>>
>> Since an exclusive lock is taken before temp file creation, I think it
>> should be safe to replace O_EXCL with O_TRUNC.
> I'm very doubtful about this. I think it's a good safety measure to
> ensure that there's no previous state file that we're somehow
> overwriting.
Is that possible when an exclusive lock is taken beforehand?
>
>
>> Script to reproduce and patch are attached.
> Well:
>
>> # Imitate out_of_space/write_operation_error
>> touch ${PGDATA_PUB}/pg_replslot/mysub/state.tmp
> Doesn't really replicate how we got into this state...

But it replicates exactly the same state we would get if write() to the
temp file had failed with an out-of-space error.
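
To make the two positions concrete, here is a minimal standalone sketch of
the open() behavior under discussion. It is not PostgreSQL code: the function
names create_exclusive() and create_truncating() and the demo file name are
hypothetical, chosen only for illustration. It shows that a leftover temp
file makes an 'O_CREAT | O_EXCL' open fail with EEXIST, that O_TRUNC would
silently reuse the file, and that unlinking the leftover (the alternative
suggested above) lets the exclusive create succeed again.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Create a temp file exclusively, as the thread describes for
 * SaveSlotToPath(). Returns 0 on success or the errno value on failure;
 * a leftover file from an earlier failed attempt yields EEXIST.
 */
int
create_exclusive(const char *path)
{
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}

/*
 * The proposed alternative: O_TRUNC reuses and empties any existing file,
 * so a leftover temp file no longer causes an error (and is no longer
 * detected either).
 */
int
create_truncating(const char *path)
{
    int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

    if (fd < 0)
        return errno;
    close(fd);
    return 0;
}
```

Calling create_exclusive() twice on the same path returns 0 and then EEXIST,
which matches the "File exists" error in the report; calling unlink() on the
path in between makes the second call succeed while keeping the O_EXCL
safety check.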


>
> Greetings,
>
> Andres Freund
>
>
-- 
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company