On pg 9.6.* we have sometimes strange problem with some sessions. We experienced it from bash scripts, golang program and node.js applications. Environment are Google Compute Engine instances with Debian 8.
Sometimes some session goes into some kind of "drowsy" state and runs incredibly slowly. It causes some very low disk IO and runs like 20x slower then usual. This happens mainly when database is under heavy load and it happens randomly to different tasks. It happens randomly to both connections from local instance (cronjob running on instance with PG) and remote connections from other instances.
When we restart task it runs normally. I cannot see any changes in process priority in OS, process switches between sleeping and running state and I do not see any locks or waits in pg_stat_activity or pg_locks. Usually all other processes finish OK and this "drowsy" one runs "for ever..."
We implemented timeouts for PG tasks into our programs to restart tasks but it very annoying and we have to use long enough timeouts anyway so it delays tasks.... It happens from time to time also to dblink connections inside pg function so we had to rewrite our programs which were running without any problems on PG 9.5. Problems started when we switched to PG 9.6 and we experienced them on all minor versions - we currently run 9.6.3 and problem is still present...
Do you please have any advice what to check or what setting might help?
it can be spin lock issue. Try to use "perf top". Install postgresql-server debug symbols before.