Apparently it is waiting for locks, cant the check be make in a "non-blocking" way, so if it ends up waiting for a lock then it just assumes non-visible and moves onto the next non-blocking?
Not only, it's a reason but you can get the same slow down with only one client and a bigger insert.
The issue is that a btree search for one value degenerate to a linear search other 1000 or more tuples.
As a matter of fact you get the same slow down after a rollback until autovacuum, and if autovacuum can't keep up...
Have you experimentally verified the last part? btree indices have some special kill-tuple code which should remove aborted tuples from the index the first time they are encountered, without need for a vacuum.