
ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3 - Mailing list pgsql-hackers

From	Alexander Lakhin
Subject	ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3
Date	August 30, 2025 23:00:00
Msg-id	c2de415e-3110-4f26-a32f-5990bc3de489@gmail.com
Responses	Re: ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3
List	pgsql-hackers

Hello hackers,

I've noticed that several timeout failures occurred this month on
sidewinder: [1] [2] [3].
All three hangs happened in the thread/alloc test (the regression output
stops just before it):
...
ok 60        - thread/thread                              95 ms
ok 61        - thread/thread_implicit                     89 ms
ok 62        - thread/prep                               305 ms

I've installed NetBSD 9.3 (from [4]) and reproduced the issue with:
$ printf 'test: thread/alloc\n%.0s' {1..100} > src/interfaces/ecpg/test/ecpg_schedule
$ gmake -s check -C src/interfaces/ecpg/
...
ok 44        - thread/alloc                              133 ms
ok 45        - thread/alloc                              180 ms
ok 46        - thread/alloc                              129 ms
--- hang ---

ps shows:
  1283 pts/0 Is   0:00.42 | |   `-- -bash
18059 pts/0 I+   0:00.01 | |     `-- gmake -s check -C src/interfaces/ecpg/
23360 pts/0 I+   0:00.01 | |       `-- gmake -C test check
22349 pts/0 S+   0:00.27 | |         `-- ./pg_regress --expecteddir=. --dbname=ecpg1_regression,ecpg2_regression --create-role=regress_ecpg_user1,regress_ecpg_
15449 pts/0 S+   0:01.06 | |           |-- postgres -D /home/vagrant/postgresql/src/interfaces/ecpg/test/tmp_check/data -F -c listen_addresses= -k /tmp/pg_regr
  1959 ?     Is   0:00.01 | |           | |-- postgres: io worker 1
  5608 ?     Is   0:00.01 | |           | |-- postgres: autovacuum launcher
  7218 ?     Is   0:00.07 | |           | |-- postgres: io worker 0
15867 ?     Is   0:00.01 | |           | |-- postgres: io worker 2
21071 ?     Is   0:00.00 | |           | |-- postgres: logical replication launcher
22122 ?     Ss   0:00.18 | |           | |-- postgres: walwriter
22159 ?     Is   0:00.00 | |           | |-- postgres: checkpointer
24606 ?     Ss   0:00.02 | |           | `-- postgres: background writer
24407 pts/0 Sl+  0:00.03 | |           `-- /home/vagrant/postgresql/src/interfaces/ecpg/test/thread/alloc

$ gdb -p 24407

(gdb) bt
#0  0x000077122aca23fa in ___lwp_park60 () from /usr/lib/libc.so.12
#1  0x000077122b209c66 in ?? () from /usr/lib/libpthread.so.1
#2  0x000077122ad0fcb9 in je_malloc_mutex_lock_slow () from /usr/lib/libc.so.12
#3  0x000077122ad087c1 in je_arena_choose_hard () from /usr/lib/libc.so.12
#4  0x000077122acb1915 in je_tsd_tcache_data_init () from /usr/lib/libc.so.12
#5  0x000077122acb1b44 in je_tsd_tcache_enabled_data_init () from /usr/lib/libc.so.12
#6  0x000077122acaeda4 in je_tsd_fetch_slow () from /usr/lib/libc.so.12
#7  0x000077122ad08b1c in malloc () from /usr/lib/libc.so.12
#8  0x000077122be0af6b in ECPGget_sqlca () from /home/vagrant/postgresql/tmp_install/usr/local/pgsql/lib/libecpg.so.6
#9  0x000077122be0328f in ECPGconnect () from /home/vagrant/postgresql/tmp_install/usr/local/pgsql/lib/libecpg.so.6

The stack is always the same, so maybe it's an issue with jemalloc,
similar to [5].
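
To make the suspected pattern concrete: every frame above malloc() belongs to jemalloc's per-thread setup, which runs on a thread's first allocation. The following is only a minimal sketch of that pattern, assuming the trigger is several freshly created threads making their first malloc() call at about the same time. It is not the thread/alloc test itself, the names first_malloc_worker and stress_first_malloc are hypothetical, and it has not been confirmed to reproduce the hang.

```c
#include <pthread.h>
#include <stdlib.h>

/*
 * Worker: performs the thread's first malloc() immediately after start.
 * In the backtrace above, this is the call under which the thread parks.
 * The 64-byte size is an arbitrary assumption; any small request would do.
 */
static void *
first_malloc_worker(void *arg)
{
    int  *ok = arg;
    void *p = malloc(64);

    *ok = (p != NULL);
    free(p);
    return NULL;
}

/*
 * Hypothetical helper: start nthreads workers, join them all, and return
 * the number of successful allocations, or -1 on a setup failure.
 */
int
stress_first_malloc(int nthreads)
{
    pthread_t *threads;
    int       *ok;
    int        started = 0;
    int        succeeded = 0;
    int        failed = 0;

    if (nthreads <= 0)
        return 0;

    threads = calloc((size_t) nthreads, sizeof(pthread_t));
    ok = calloc((size_t) nthreads, sizeof(int));
    if (threads == NULL || ok == NULL)
    {
        free(threads);
        free(ok);
        return -1;
    }

    for (int i = 0; i < nthreads; i++)
    {
        if (pthread_create(&threads[i], NULL, first_malloc_worker, &ok[i]) != 0)
        {
            failed = 1;
            break;
        }
        started++;
    }

    /* A hang of the reported kind would appear as a join that never returns. */
    for (int i = 0; i < started; i++)
    {
        pthread_join(threads[i], NULL);
        succeeded += ok[i];
    }

    free(threads);
    free(ok);
    return failed ? -1 : succeeded;
}
```

Calling stress_first_malloc() repeatedly from a small main(), much like the repeated schedule entries above, might help tell a libc-level problem apart from anything specific to libecpg, but that is an untested assumption.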

After detaching gdb, the test continues running.

Upgrading to NetBSD 9.4 fixed the issue for me.

The hang reproduces even on a commit from 2017-01-01, so it's not clear why
there were no such timeouts before...

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-22%2011%3A29%3A27
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-27%2016%3A35%3A01
[3] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-29%2012%3A35%3A01
[4] https://portal.cloud.hashicorp.com/vagrant/discover/generic/netbsd9
[5] https://github.com/jemalloc/jemalloc/issues/2402

Best regards,
Alexander
