Some additional observations and food for thought. Our app uses connection caching (Apache::DBI). By disabling Apache::DBI and forcing the client to reconnect for every HTTP request processed, I eliminated the stall. User CPU usage jumped (mostly because prepared SQL queries are no longer available, plus some additional overhead from reconnecting), but there has not been a single case of the high-sys-CPU stall.
I cannot completely rule out the possibility that some leftovers (an unfinished transaction?) remain after an HTTP request is served. In the absence of connection caching, these are discarded for sure...
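
To make the "leftovers" hypothesis concrete, here is a minimal sketch of how an unfinished transaction can survive on a cached connection, and how an end-of-request cleanup step would discard it. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module as a stand-in for the real database, and `cached_connect`, `sloppy_request`, `end_of_request_cleanup`, and the `hits` table are hypothetical names, not part of Apache::DBI or of our app.

```python
import sqlite3

# Hypothetical per-process connection cache, keyed by DSN.
# This only mimics the idea of connection caching; it is not Apache::DBI.
_cache = {}


def cached_connect(dsn):
    """Return the cached connection for this DSN, opening it on first use."""
    conn = _cache.get(dsn)
    if conn is None:
        conn = _cache[dsn] = sqlite3.connect(dsn)
    return conn


def sloppy_request(conn):
    """Simulate a request that opens a transaction and never finishes it."""
    # With sqlite3's default settings, an INSERT implicitly begins a transaction.
    conn.execute("INSERT INTO hits (n) VALUES (1)")
    # No commit() or rollback() here: this is the "leftover".


def end_of_request_cleanup(conn):
    """Discard anything the request left open before the connection is reused."""
    if conn.in_transaction:
        conn.rollback()


# Set up the stand-in database.
conn = cached_connect(":memory:")
conn.execute("CREATE TABLE hits (n INTEGER)")
conn.commit()

# First "request" leaves a transaction open on the cached connection.
sloppy_request(conn)

# The next "request" gets the very same connection, leftover included.
reused = cached_connect(":memory:")
leftover_seen = (reused is conn) and reused.in_transaction

# An explicit cleanup step removes the leftover without reconnecting.
end_of_request_cleanup(reused)
leftover_cleared = not reused.in_transaction
```

If leftovers of this kind were the cause, a cleanup step like this at the end of each request might give the same effect as reconnecting while keeping the cached connection. Whether that applies to our setup is untested.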