Re: [HACKERS] Continuous buildfarm failures on hamster with bin-check - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: [HACKERS] Continuous buildfarm failures on hamster with bin-check
Date
Msg-id CAB7nPqR6g3HFjtAy2_YJx5yTS45_CuJJRmBGZWYo0JeYNakjhw@mail.gmail.com
In response to Re: [HACKERS] Continuous buildfarm failures on hamster with bin-check  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Tue, Apr 18, 2017 at 4:15 PM, Andres Freund <andres@anarazel.de> wrote:
> Hi,
>
> On 2017-04-18 16:07:38 +0900, Michael Paquier wrote:
>> Some of you may have noticed that hamster is heavily red on the
>> buildfarm. I have done a bit of investigation, and I am able to
>> reproduce the failure manually. But actually after looking at the logs
>> the error has obviously showed up:
>> 2017-04-16 05:07:19.650 JST [18282] LOG:  database system is ready to
>> accept connections
>> 2017-04-16 05:08:36.725 JST [18296] LOG:  using stale statistics
>> instead of current ones because stats collector is not responding
>> 2017-04-16 05:10:22.207 JST [18303] t/010_pg_basebackup.pl LOG:
>> terminating walsender process due to replication timeout
>> 2017-04-16 05:10:30.180 JST [18306] LOG:  using stale statistics
>> instead of current ones because stats collector is not responding
>>
>> Stale regressions means that the system is just constrained so much
>> that things are timing out.
>>
>> In order to avoid such failures with normal regression tests, I have
>> set up extra_config so as stats_temp_directory goes to a tmpfs to
>> avoid stale statistics
>
> How high do you need to make the hardcoded limit for this to succeed
> without a tmpfs?

Increasing wal_sender_timeout visibly reduces the failure rate. Over
10 attempts I saw at least 3 failures before the change, and none
after it.
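
One rough way to compare runs before and after such a change is to count
the two symptoms from the log excerpt quoted above. The following is only
a sketch: the helper name count_symptoms and the sample list are made up
for illustration, and the matched strings are copied from the log lines
in this thread (unwrapped to one line each, as they would appear in a
server log).

```python
# Message fragments taken from the log excerpt quoted in this thread.
SYMPTOMS = {
    "walsender_timeout": "terminating walsender process due to replication timeout",
    "stale_stats": "using stale statistics instead of current ones",
}


def count_symptoms(log_lines):
    """Count the log lines containing each symptom (hypothetical helper)."""
    counts = {name: 0 for name in SYMPTOMS}
    for line in log_lines:
        for name, fragment in SYMPTOMS.items():
            if fragment in line:
                counts[name] += 1
    return counts


# Sample input: the quoted log lines, joined back into single lines.
sample = [
    "2017-04-16 05:07:19.650 JST [18282] LOG:  database system is ready to accept connections",
    "2017-04-16 05:08:36.725 JST [18296] LOG:  using stale statistics instead of current ones because stats collector is not responding",
    "2017-04-16 05:10:22.207 JST [18303] t/010_pg_basebackup.pl LOG:  terminating walsender process due to replication timeout",
    "2017-04-16 05:10:30.180 JST [18306] LOG:  using stale statistics instead of current ones because stats collector is not responding",
]

print(count_symptoms(sample))
```

Running this over the log of each attempt would show whether the
walsender timeouts disappear while the stale-statistics messages remain.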

> If hamster fails this regularly I think we have to do
> something about it, rather than paper over it.  What's the storage
> situation currently like?

The SD card of this RPI is half-empty.
-- 
Michael


