occasional ECPG failures on dikkop (FreeBSD) - Mailing list pgsql-hackers
| From | Tomas Vondra |
|---|---|
| Subject | occasional ECPG failures on dikkop (FreeBSD) |
| Date | |
| Msg-id | 07a279bd-dcd4-4e6a-a5bb-2bc184f6017d@vondra.me |
| Responses | Re: occasional ECPG failures on dikkop (FreeBSD) |
| List | pgsql-hackers |
Hi,

about a month ago dikkop started reporting occasional failures in ECPG tests. I'm not very familiar with ecpg, and I've been unable to figure this out so far, so I wonder if others might know ...

The failures seem to happen in maybe ~5% of the runs, but only when the tests are run through the buildfarm client. I've been unable to reproduce the issue manually, even when trying to use exactly the same options etc.

Two failures from master:

* https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2026-04-07%2011%3A00%3A39
* https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dikkop&dt=2026-05-04%2010%3A00%3A10

However, it seems to affect older branches too, all the way back to REL_14_STABLE.

The failures started to appear ~30 days ago, which aligns with the machine being upgraded from FreeBSD 14.1 to 14.4. (It might have been running 14.3, not sure.)

The failures look like this:

    ok 64 - thread/prep 732 ms
    not ok 65 - thread/alloc 65 ms
    # (test process was terminated by signal 11: Segmentation fault)
    ok 66 - thread/descriptor 136 ms

so the problem seems to be in thread/alloc. But the log says this:

    $ grep 'signal 11' /var/log/messages
    Apr 1 21:53:30 generic kernel: pid 27622 (thread_implicit), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 1 23:54:19 generic kernel: pid 50594 (alloc), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 2 20:19:57 generic kernel: pid 53415 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 4 02:07:58 generic kernel: pid 48615 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 7 12:58:20 generic kernel: pid 17092 (alloc), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 9 13:21:47 generic kernel: pid 65784 (alloc), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 10 18:20:17 generic kernel: pid 67540 (thread_implicit), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 22 16:29:29 generic kernel: pid 10941 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 22 20:29:47 generic kernel: pid 32964 (thread_implicit), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 22 23:34:54 generic kernel: pid 43109 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 29 19:24:49 generic kernel: pid 81996 (thread), jid 0, uid 1001: exited on signal 11 (core dumped)
    Apr 30 10:58:42 generic kernel: pid 65438 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    May 3 22:15:57 generic kernel: pid 21640 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    May 4 12:08:15 generic kernel: pid 98832 (alloc), jid 0, uid 1001: exited on signal 11 (core dumped)
    May 5 12:04:33 generic kernel: pid 65140 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)
    May 5 13:05:45 generic kernel: pid 12122 (prep), jid 0, uid 1001: exited on signal 11 (core dumped)

So there are plenty of segfaults in the other ecpg tests too (prep, thread_implicit, thread), not just in alloc, it seems.

Sadly, I haven't found any core files. I'll try to look again after the next failure.

Any ideas? I don't see similar failures on other machines.

regards

--
Tomas Vondra
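To see which test binaries crash most often, the kernel log entries quoted above can be tallied per program name. The following is only a minimal sketch: the helper name `count_segfaults` and the regular expression are illustrative assumptions, and the pattern is written against the message format shown in this post, which may differ on other FreeBSD versions or syslog configurations.

```python
import re
from collections import Counter

# Assumed message format, taken from the log lines quoted in this post:
#   "... kernel: pid 50594 (alloc), jid 0, uid 1001: exited on signal 11 (core dumped)"
# The program name is the parenthesized token right after the pid.
CRASH_RE = re.compile(r"kernel: pid \d+ \(([^)]+)\), .*exited on signal 11\b")


def count_segfaults(lines):
    """Count signal-11 exits per program name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = CRASH_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

For example, `count_segfaults(open("/var/log/messages", errors="replace")).most_common()` would list the crashing programs from most to least frequent, which makes it easier to judge whether the crashes cluster in the thread/* tests.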