05.05.2025 10:52, Bertrand Drouvot wrote:
> Hi hackers,
>
> While working on wait event related stuff I observed a failed assertion:
>
> "
> TRAP: failed Assert("node->next == 0 && node->prev == 0"), File: "../../../../src/include/storage/proclist.h", Line: 91
> "
>
> during pg_regress/database.
>
> To reproduce, add an ereport(LOG,..) or CHECK_FOR_INTERRUPTS() or whatever
> would trigger ConditionVariableBroadcast() in pgstat_report_wait_end():
Interestingly, a colleague of ours ran into the same problem recently [1]. It
happened because he tried to write an overly complex timeout (SIGALRM) handler.
His solution was a bit different, though [2].
[1] https://postgr.es/m/076eb7bd-52e6-4a51-ba00-c744d027b15c@postgrespro.ru
[2]
https://postgr.es/m/attachment/175030/0001-CV-correctly-handle-cv_sleep_target-change.patch
And I believe his solution is more elegant, isn't it?
But first of all, I doubt there should be anything that cancels a condition
variable sleep during WaitLatch. Most probably you are doing something wrong.
We convinced our colleague to rework his code so that it does not trigger the
issue in the first place.
--
regards
Yura Sokolov aka funny-falcon