Re: Intermittent buildfarm failures on wrasse - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Intermittent buildfarm failures on wrasse
Date
Msg-id 20220415025019.GA863781@rfd.leadboat.com
In response to Re: Intermittent buildfarm failures on wrasse  (Noah Misch <noah@leadboat.com>)
Responses Re: Intermittent buildfarm failures on wrasse  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Apr 14, 2022 at 07:45:15PM -0700, Noah Misch wrote:
> On Thu, Apr 14, 2022 at 06:52:49PM -0700, Andres Freund wrote:
> > On 2022-04-14 21:32:27 -0400, Tom Lane wrote:
> > > Peter Geoghegan <pg@bowt.ie> writes:
> > > > Are you aware of Andres' commit 02fea8fd? That work prevented exactly
> > > > the same set of symptoms (the same index-only scan create_index
> > > > regressions),
> > > 
> > > Hm.  I'm starting to get the feeling that the real problem here is
> > > we've "optimized" the system to the point where repeatable results
> > > from VACUUM are impossible :-(
> > 
> > The synchronous_commit issue is an old one. It might actually be worth
> > addressing it by flushing out pending async commits instead. It just
> > started to be noticeable when tenk1 load and vacuum were moved closer.
> > 
> > 
> > What do you think about applying a polished version of what I posted in
> > https://postgr.es/m/20220414164830.63rk5zqsvtqqk7qz%40alap3.anarazel.de
> > ? That'd tell us a bit more about the horizon etc.
> 
> No objection.
> 
> > It's also interesting that it only happens in the installcheck cases,
> > afaics, not the check ones. Although that might just be because there's
> > more of them...
> 
> I suspect the failure is somehow impossible in "check".  Yesterday, I cranked
> up the number of locales, so there are now a lot more installcheck runs.  Before
> that, each farm run had one "check" and two "installcheck".  Those days saw
> ten installcheck failures, zero check failures.
> 
> Like Tom, I'm failing to reproduce this outside the buildfarm client.  I wrote
> a shell script to closely resemble the buildfarm installcheck sequence, but
> it's lasted a dozen runs without failing.

But 24s after that email, it did reproduce the problem.  Same symptoms as the
last buildfarm runs, including visfrac=0.  I'm attaching my script.  (It has
various references to my home directory, so it's not self-contained.)

