Re: Several buildfarm animals fail tests because of shared memory error - Mailing list pgsql-hackers

From Robins Tharakan
Subject Re: Several buildfarm animals fail tests because of shared memory error
Date
Msg-id CAEP4nAyrnviyeRPb2OLB1o_F8pMw=E4OVhkjYeSA0Rdek+AYHg@mail.gmail.com
In response to Several buildfarm animals fail tests because of shared memory error  (Alexander Lakhin <exclusion@gmail.com>)
List pgsql-hackers
Hi Alexander,

Thanks for collating this list.
I'll try to add as much as I know, in hopes that it helps.

On Sun, 22 Dec 2024 at 16:30, Alexander Lakhin <exclusion@gmail.com> wrote:
> I'd like to bring your attention to multiple buildfarm failures, which
> occurred this month, on master only, caused by "could not open shared
> memory segment ...: No such file or directory" errors.


- I am unsure how batta is set up, but until late last week none of my instances had REMOVEIPC set correctly. I am sorry, I didn't know about this until Thomas pointed it out to me in another thread. If that's a key reason here, things should probably settle down by this time next week. I've begun setting it correctly (2 done, with a few more to go), although since some of the machines are at work, I'll try to get to them this coming week.
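
For reference, a minimal sketch of how the setting could be checked on a host. It assumes the option in question is systemd-logind's `RemoveIPC` (written REMOVEIPC above), which the PostgreSQL documentation recommends setting to `no`, and that it lives in the `[Login]` section of `/etc/systemd/logind.conf`. The helper name `remove_ipc_setting` is hypothetical, and drop-in files under `logind.conf.d` are not considered here.

```python
from pathlib import Path
from typing import Optional


def remove_ipc_setting(conf_text: str) -> Optional[str]:
    """Return the last uncommented RemoveIPC value in [Login], or None if unset."""
    value = None
    section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith(("#", ";")):
            continue
        # Track which section we are in.
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1]
            continue
        if section == "Login" and "=" in line:
            key, _, val = line.partition("=")
            if key.strip() == "RemoveIPC":
                # Later assignments override earlier ones.
                value = val.strip().lower()
    return value


if __name__ == "__main__":
    # Assumed default location; adjust if the host uses a different path.
    conf = Path("/etc/systemd/logind.conf")
    if conf.exists():
        setting = remove_ipc_setting(conf.read_text())
        print("RemoveIPC in logind.conf:", setting if setting is not None else "not set explicitly")
    else:
        print("logind.conf not found; this host may not use systemd-logind")
```

A result of `None` only means the main file does not set the option explicitly; the effective value then depends on the distribution's built-in default and any drop-in files.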



> But still why master only?

+1. It is interesting, though, that master is affected more often. This may be statistical: master gets more commits and therefore more test runs. Unsure.

Also:
- I recently (~2 days back) switched parula to the gcc-experimental nightly, after which I see 4 of the recent errors, although the most recent run is green.
- The only info about leafhopper that may be relevant is that it's one of the newest machines (Graviton4), so it comes with recent hardware and kernel, and stock gcc 11.4.1.

> Unfortunately I'm unable to reproduce such failures locally, so I'm sorry
> for such raw information, but I see no way to investigate this further
> without assistance. Perhaps owners of these animals could shed some light
> on this...

Since the instances are created with work accounts, it isn't trivial to share access, but I could get back to you with any outputs / captures if that would help here.

Lastly, alligator has been on gcc nightly for a few months and is on x86_64, so if alligator is still stuttering by this time next week, I'm pretty sure there's more than just aarch64, gcc, or IPC config to blame here.

-
robins
