This is your problem. At scale factor 1 there is only one row in the pgbench_branches table, and every transaction has to update that one row. This is inherently a serialized event.
You are trying to overcome the latency by cramming more stuff through the pipe at a time, but what you are cramming through must still pass through in single file.
One solution is to use a larger scale factor on the benchmark, so that transactions update random pgbench_branches rows rather than all contending for the same one:
pgbench -i -s 50
Alternatively, you could write a custom transaction file so that all 7 commands are sent to the server in one packet. That way the COMMIT that releases the lock arrives at the same time as the UPDATE that takes it, rather than two more round trips later. But this would violate the spirit of the benchmark, as presumably you are expected to inspect the result of the SELECT before proceeding with the rest of the transaction.
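As a sketch of what that custom file might look like: recent versions of pgbench let you separate statements with \; instead of ;, which makes them a single compound command sent to the server in one message. The variable setup below mirrors the built-in TPC-B-like script; the file name and the exact pgbench version cutoff are assumptions, so check your pgbench documentation before relying on this.

```sql
-- compound_tpcb.sql (hypothetical file name)
-- Statements joined with \; are sent as one compound command,
-- so the whole transaction travels in a single round trip.
\set aid random(1, 100000 * :scale)
\set bid random(1, 1 * :scale)
\set tid random(1, 10 * :scale)
\set delta random(-5000, 5000)
BEGIN \;
UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid \;
SELECT abalance FROM pgbench_accounts WHERE aid = :aid \;
UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid \;
UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid \;
INSERT INTO pgbench_history (tid, bid, aid, delta, mtime)
    VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP) \;
END;
```

You would then run it with something like `pgbench -n -f compound_tpcb.sql -c 8 -T 60 mydb`. Note the obvious downside: the SELECT result comes back bundled with everything else, so the client cannot act on it mid-transaction, which is exactly the spirit-of-the-benchmark objection above.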
Or you could write a custom benchmark that more closely resembles whatever your true workload will be.