On 1/28/20 8:02 PM, Kyotaro Horiguchi wrote:
> At Tue, 28 Jan 2020 19:13:32 +0300, Pavel Suderevsky
>> Regarding influence: the issue is not the large number of WALs to apply
>> but the search for non-existing WALs on the remote storage; each such
>> search can take 5-10 seconds while obtaining an existing WAL takes
>> milliseconds.
>
> Wow. I didn't know of a file system that takes that many seconds to
> try non-existent files. Although I still think this is not a bug,
> avoiding that actually leads to a big win on such systems.
I have not tested this case but I can imagine it would be slow in
practice. It's axiomatic that it is hard to prove a negative. With
multi-region replication it might well take some time to be sure that
the file *really* doesn't exist and hasn't just been lost in a single
region.
> After some thought, I think it's safe and practical to let
> XLogFileReadAnyTLI() refrain from trying WAL segments of too-high
> TLIs. Some garbage archive files out of the range of a timeline might
> be seen, for example, after reusing an archive directory without
> clearing its files. However, fetching such garbage just to fail doesn't
> contribute to durability or reliability at all, I think.
The patch seems sane; the trick will be testing it.
Pavel, do you have an environment where you can ensure this is a
performance benefit?
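
For reference, the idea of skipping too-high TLIs can be sketched roughly
as below. This is a minimal, self-contained illustration and not the patch
itself: TimelineEntry, segment_may_exist_on_tli(), count_probes() and the
example history are all hypothetical names and data made up for the
example, and the real XLogFileReadAnyTLI() works on the server's own
timeline history structures. The assumption is that each timeline records
the first segment that can belong to it, so a segment before that point
never needs to be requested from the archive for that timeline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one timeline history entry. */
typedef struct
{
    uint32_t tli;       /* timeline ID */
    uint64_t begin_seg; /* first WAL segment number that can belong to it */
} TimelineEntry;

/*
 * Return true if it is worth asking the archive for segment "segno" on
 * timeline "tli". A segment that precedes the point where the timeline
 * started cannot belong to it, so the lookup is skipped. Timelines that
 * are not in the history are still probed, to stay conservative.
 */
static bool
segment_may_exist_on_tli(const TimelineEntry *history, size_t n,
                         uint32_t tli, uint64_t segno)
{
    assert(history != NULL || n == 0);

    for (size_t i = 0; i < n; i++)
    {
        if (history[i].tli == tli)
            return segno >= history[i].begin_seg;
    }
    return true;
}

/* Count how many archive lookups would be attempted for "segno". */
static size_t
count_probes(const TimelineEntry *history, size_t n, uint64_t segno)
{
    size_t probes = 0;

    for (size_t i = 0; i < n; i++)
    {
        if (segment_may_exist_on_tli(history, n, history[i].tli, segno))
            probes++;
    }
    return probes;
}

/* Made-up history, newest first: TLI 3 starts at segment 120, TLI 2 at 50. */
static const TimelineEntry example_history[] = {
    {3, 120},
    {2, 50},
    {1, 0},
};
```

With this made-up history, a request for segment 10 would touch the
archive once (TLI 1 only) instead of three times, which is where the
saving would come from if each miss costs several seconds.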
Regards,
--
-David
david@pgmasters.net