Andres Freund <andres@anarazel.de> writes:
> On 2019-05-10 11:38:57 -0400, Tom Lane wrote:
>> I am wondering if, somehow, the stack depth limit seen by the postmaster
>> sometimes doesn't apply to its children. That would be pretty wacko
>> kernel behavior, especially if it's only intermittently true.
>> But we're running out of other explanations.
> I wonder if this is a SIGSEGV that actually signals an OOM
> situation. Linux, if it can't actually extend the stack on-demand due to
> OOM, sends a SIGSEGV. The signal has that information, but
> unfortunately the buildfarm code doesn't print it. p $_siginfo would
> show us some of that...
> Mark, how tight is the memory on that machine? Does dmesg have any other
> information (often segfaults are logged by the kernel with the code
> IIRC).
It does sort of smell like a resource exhaustion problem, especially
if all these buildfarm animals are VMs running on the same underlying
platform. But why would that manifest as "you can't have a measly two
megabytes of stack" and not as any other sort of OOM symptom?
Mark, if you don't mind modding your local copies of the buildfarm
script, I think what Andres is asking for is a pretty trivial addition
in PGBuild/Utils.pm's sub get_stack_trace:
    my $cmdfile = "./gdbcmd";
    my $handle;
    open($handle, '>', $cmdfile) || die "opening $cmdfile: $!";
    print $handle "bt\n";
+   print $handle "p \$_siginfo\n";
    close($handle);
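For anyone wanting to try this outside the buildfarm script, here is a minimal standalone sketch of the same idea. The helper name write_gdb_cmdfile is made up for illustration and is not part of PGBuild/Utils.pm. One thing to watch: inside a double-quoted Perl string, $_siginfo would be treated as a Perl variable, so the gdb convenience variable has to be protected, here by single-quoting it. Whether gdb can actually print $_siginfo depends on the platform and on what the core file contains.

```perl
use strict;
use warnings;

# Hypothetical helper (not the buildfarm's actual code): write a gdb
# command file that asks for a backtrace plus the signal details.
sub write_gdb_cmdfile
{
    my ($cmdfile) = @_;
    open(my $handle, '>', $cmdfile) || die "opening $cmdfile: $!";
    print $handle "bt\n";
    # Single quotes stop Perl from interpolating $_siginfo, which is a
    # gdb convenience variable, not a Perl one.
    print $handle 'p $_siginfo', "\n";
    close($handle) || die "closing $cmdfile: $!";
    return $cmdfile;
}

# Write the file, then echo it back so the contents can be checked by eye.
my $cmdfile = write_gdb_cmdfile("./gdbcmd");
open(my $in, '<', $cmdfile) || die "opening $cmdfile: $!";
print while <$in>;
close($in);
```

The resulting file should contain two lines, "bt" and "p $_siginfo", which is what gdb needs to see.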
regards, tom lane