ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3 - Mailing list pgsql-hackers
From | Alexander Lakhin |
---|---|
Subject | ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3 |
Date | |
Msg-id | c2de415e-3110-4f26-a32f-5990bc3de489@gmail.com |
Responses |
Re: ecpg test thread/alloc hangs on sidewinder running NetBSD 9.3
|
List | pgsql-hackers |
Hello hackers,

I've noticed several timeout failures occurred during this month on sidewinder: [1] [2] [3]. All three hangs happened at test thread/alloc:
...
ok 60 - thread/thread 95 ms
ok 61 - thread/thread_implicit 89 ms
ok 62 - thread/prep 305 ms

I've installed NetBSD 9.3 (from [4]) and reproduced the issue with:
$ printf 'test: thread/alloc\n%.0s' {1..100} > src/interfaces/ecpg/test/ecpg_schedule
$ gmake -s check -C src/interfaces/ecpg/
...
ok 44 - thread/alloc 133 ms
ok 45 - thread/alloc 180 ms
ok 46 - thread/alloc 129 ms
--- hang ---

ps shows:
 1283 pts/0 Is 0:00.42 | | `-- -bash
18059 pts/0 I+ 0:00.01 | | `-- gmake -s check -C src/interfaces/ecpg/
23360 pts/0 I+ 0:00.01 | | `-- gmake -C test check
22349 pts/0 S+ 0:00.27 | | `-- ./pg_regress --expecteddir=. --dbname=ecpg1_regression,ecpg2_regression --create-role=regress_ecpg_user1,regress_ecpg_
15449 pts/0 S+ 0:01.06 | | |-- postgres -D /home/vagrant/postgresql/src/interfaces/ecpg/test/tmp_check/data -F -c listen_addresses= -k /tmp/pg_regr
 1959 ? Is 0:00.01 | | | |-- postgres: io worker 1
 5608 ? Is 0:00.01 | | | |-- postgres: autovacuum launcher
 7218 ? Is 0:00.07 | | | |-- postgres: io worker 0
15867 ? Is 0:00.01 | | | |-- postgres: io worker 2
21071 ? Is 0:00.00 | | | |-- postgres: logical replication launcher
22122 ? Ss 0:00.18 | | | |-- postgres: walwriter
22159 ? Is 0:00.00 | | | |-- postgres: checkpointer
24606 ? Ss 0:00.02 | | | `-- postgres: background writer
24407 pts/0 Sl+ 0:00.03 | | `-- /home/vagrant/postgresql/src/interfaces/ecpg/test/thread/alloc

$ gdb -p 24407
(gdb) bt
#0 0x000077122aca23fa in ___lwp_park60 () from /usr/lib/libc.so.12
#1 0x000077122b209c66 in ?? () from /usr/lib/libpthread.so.1
#2 0x000077122ad0fcb9 in je_malloc_mutex_lock_slow () from /usr/lib/libc.so.12
#3 0x000077122ad087c1 in je_arena_choose_hard () from /usr/lib/libc.so.12
#4 0x000077122acb1915 in je_tsd_tcache_data_init () from /usr/lib/libc.so.12
#5 0x000077122acb1b44 in je_tsd_tcache_enabled_data_init () from /usr/lib/libc.so.12
#6 0x000077122acaeda4 in je_tsd_fetch_slow () from /usr/lib/libc.so.12
#7 0x000077122ad08b1c in malloc () from /usr/lib/libc.so.12
#8 0x000077122be0af6b in ECPGget_sqlca () from /home/vagrant/postgresql/tmp_install/usr/local/pgsql/lib/libecpg.so.6
#9 0x000077122be0328f in ECPGconnect () from /home/vagrant/postgresql/tmp_install/usr/local/pgsql/lib/libecpg.so.6

The stack is always the same, so maybe it's an issue with jemalloc, similar to [5].

The test continues running after gdb detach.

Upgrade to NetBSD 9.4 fixed the issue for me.

It reproduced even on a commit from 2017-01-01, so it's not clear why there were no such timeouts before...

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-22%2011%3A29%3A27
[2] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-27%2016%3A35%3A01
[3] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sidewinder&dt=2025-08-29%2012%3A35%3A01
[4] https://portal.cloud.hashicorp.com/vagrant/discover/generic/netbsd9
[5] https://github.com/jemalloc/jemalloc/issues/2402

Best regards,
Alexander
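
For anyone repeating the reproduction above unattended, a rough sketch of a wrapper that runs a command repeatedly under a time limit and reports the first run that does not finish. The function name run_until_hang and its arguments are hypothetical, not part of the PostgreSQL tree, and the sketch assumes a timeout(1) utility that exits with status 124 on expiry (as the GNU coreutils one does). Note that timeout kills the stuck process, so this only detects a hang; it does not leave the process around for gdb.

```shell
#!/bin/sh
# run_until_hang RUNS SECS CMD [ARGS...]   (hypothetical helper)
# Runs CMD up to RUNS times, each under a SECS-second limit.
# Returns 0 if every run exits with status 0; returns 1 at the first
# run that times out or fails.
run_until_hang() {
    runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        timeout "$limit" "$@"
        status=$?
        if [ "$status" -eq 124 ]; then
            # Assumption: 124 is timeout(1)'s "time limit reached" status.
            echo "run $i: no exit within ${limit}s (possible hang)"
            return 1
        elif [ "$status" -ne 0 ]; then
            echo "run $i: exited with status $status"
            return 1
        fi
        i=$((i + 1))
    done
    echo "all $runs runs completed"
    return 0
}
```

With the 100-line ecpg_schedule from the message in place, one possible invocation would be `run_until_hang 1 600 gmake -s check -C src/interfaces/ecpg/`; the 600-second limit is an arbitrary choice, and child processes left behind after a timeout may need to be cleaned up by hand.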