The start-up cost of bounded (top-N) sorts is very sensitive to N at the small end, and the comparison_cost * tuples * LOG2(2.0 * output_tuples) curve doesn't seem to correspond to reality. Here's a contrived example that comes from a benchmark:
create table foo as
  select generate_series(1, 1000000)::int a, (random() * 1000)::int b;
analyze foo;
select * from foo order by b limit N;
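To see how steep that curve is at the low end, here's a quick sketch of just the comparison term for 1,000,000 input tuples. I'm assuming the default comparison_cost of 2 * cpu_operator_cost = 0.005; these numbers are my own arithmetic, not planner output:

select n,
       -- comparison_cost * tuples * LOG2(2.0 * output_tuples)
       round(2 * 0.0025 * 1000000 * log(2, 2.0 * n), 0) as est_compare_cost
from unnest(array[1, 61, 133, 1000, 1000000]) as n;

Between N = 1 and N = 61 the estimate already grows about sevenfold, while going from N = 1000 to N = 1000000 barely doubles it, which is what "sensitive at the small end" means here.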
This looks like a costing bug. The estimated cost of sorting the 416,667 tuples expected in one parallel worker is almost identical to the estimated cost of sorting all 1,000,000 tuples with parallelism turned off. After adding some logging to the cost_sort function, it looks like the LIMIT does not get passed down for the parallel estimate:
So the LIMIT gets passed down for the execution step, but not for the planning step.
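One way to put the two estimates side by side is to toggle parallelism and compare the Sort nodes (a sketch; exact costs and plan shapes will vary by configuration):

set max_parallel_workers_per_gather = 0;
explain select * from foo order by b limit 133;  -- serial Sort over 1,000,000 rows

set max_parallel_workers_per_gather = 2;
explain select * from foo order by b limit 133;  -- per-worker Sort over ~416,667 rows

If the bound were reaching the parallel estimate, the per-worker Sort should be costed as a cheap top-N over its share of the rows; instead its cost comes out close to the serial full-table Sort, as described above.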
(On my system, LIMIT 61 is the point at which the plan switches to parallel.)
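The flip point is easy to find by bisecting on the LIMIT; 61 is just the local value here:

explain select * from foo order by b limit 60;  -- serial bounded sort here
explain select * from foo order by b limit 61;  -- Gather Merge shows up here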
In any case, if we "fixed" the top-N estimate to use the random case rather than the worst case, that would make LIMIT 133 look more like LIMIT 1, which is the opposite direction from what you want.
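To put rough numbers on that: a back-of-envelope random-case model (my own arithmetic, not the planner's code) charges every input tuple one comparison against the current heap minimum, and lets only the expected ~N * ln(tuples/N) tuples that actually displace a heap entry pay the LOG2(2N) sift cost:

select n,
       round(1000000 * log(2, 2.0 * n), 0) as worst_case_cmps,
       -- one compare per input tuple, plus expected heap maintenance
       round(1000000 + n * ln(1000000.0 / n) * log(2, 2.0 * n), 0) as random_case_cmps
from unnest(array[1, 133]) as n;

Under that model LIMIT 1 comes out around 1.00M comparisons and LIMIT 133 around 1.01M, i.e. nearly indistinguishable, whereas the current worst-case formula puts LIMIT 133 at roughly 8x LIMIT 1.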