Touch wood, but I think I found the problem thanks to these pointers. I checked
vm.zone_reclaim_mode and mine was already set to 0. However, just before the locking starts I can see many of my CPUs flash red and jump to a high percentage of sys usage. When I look at top, it's the kernel migration tasks that seem to trigger it.

Here is what I set via sysctl:
# Postgres Kernel Tweaks
kernel.sched_migration_cost = 5000000
# kernel.sched_autogroup_enabled = 0
The second recommended setting, 'sched_autogroup_enabled', is not available on the kernel I'm running, but that doesn't seem to be a problem.
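For anyone who wants to check which of these knobs their own kernel exposes before editing sysctl settings, here is a minimal sketch. It assumes a Linux box where sysctl keys are visible under /proc/sys; the helper name read_sysctl is made up for illustration and is not part of any standard tool. Key names can differ between kernel versions, so a key reported as missing may simply live under another name.

```python
from pathlib import Path


def read_sysctl(name, root="/proc/sys"):
    """Return a sysctl value as a string, or None if the key is not exposed.

    Hypothetical helper: maps a dotted key such as
    'kernel.sched_migration_cost' to <root>/kernel/sched_migration_cost.
    """
    path = Path(root) / name.replace(".", "/")
    try:
        return path.read_text().strip()
    except OSError:
        # Key missing on this kernel, or not readable by this user.
        return None


if __name__ == "__main__":
    keys = (
        "vm.zone_reclaim_mode",
        "kernel.sched_migration_cost",
        "kernel.sched_autogroup_enabled",
    )
    for key in keys:
        value = read_sysctl(key)
        shown = value if value is not None else "not available on this kernel"
        print(f"{key} = {shown}")
```

Run as-is it only reads values and changes nothing, so it is safe to try before and after applying the settings.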
Again, thanks for the help. It was seriously appreciated. Long night was long.
If things change and the problem pops up again, I'll update you guys.
Cheers,
Armand