Apparently it is waiting for locks, cant the check be make in a "non-blocking" way, so if it ends up waiting for a lock then it just assumes non-visible and moves onto the next non-blocking?
Not only, it's a reason but you can get the same slow down with only one client and a bigger insert.
The issue is that a btree search for one value degenerate to a linear search other 1000 or more tuples.
As a matter of fact you get the same slow down after a rollback until autovacuum, and if autovacuum can't keep up...