On 4/2/26 06:00, Alexander Lakhin wrote:
> Hello Tom and Tomas,
>
> 01.04.2026 23:20, Tom Lane wrote:
>> Alexander Lakhin <exclusion@gmail.com> writes:
>>> I think this can explain slow CommitTransactionCommand() and why it
>>> happens not every time. Regarding other animals, I guess they can
>>> experience the same bumps but not exceeding 5 seconds (50 tries). Thus,
>>> from my understanding, for the failure to happen, we need to have slow
>>> storage and initialize_worker_spi() -> CommitTransactionCommand() reaching
>>> XLogFileClose().
>> So, it remains not very clear why only widowbird is showing this
>> failure, but I think we can safely take away the bottom-line
>> conclusion that hard-wiring a maximum wait of 5s in
>> CountOtherDBBackends() was not a great idea.
>
> There also were two failures from jay: [1], [2], but yes, widowbird is
> getting more and more consistent in that aspect: [3], probably because
> of the storage (SD card?) degradation.
>

Jay is a regular machine, a 2-core VM hosted at a university, so not very
powerful, but it has been running fine for years and seems to be healthy.

> Tomas, maybe you could check if the write speed is more or less acceptable
> there?
>

Will do. I'll wait for the tests to complete on widowbird, and then do
some testing on the storage (it's running from a flash drive, not an SD
card, and I saw stuff in dmesg when it was dying in the past - but now
it's clean).

regards
--
Tomas Vondra