From my perspective, this is much, much better. For sufficiently large tables, I get parallel behaviour without fiddling with the defaults for parallel_setup_cost and parallel_tuple_cost. *And* the parallel behaviour *is* sensitive to the costs of functions in target lists, so reasonably chosen costs will flip us into parallel mode for expensive functions against smaller tables too.
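
A minimal sketch of how one might check the function-cost sensitivity described above. The function name, table name, row count, and COST value are all hypothetical and chosen only for illustration. Whether a parallel plan appears depends on the patch under discussion and on the server's other planner settings, so no expected plan is shown.

```sql
-- Hypothetical expensive function; the COST value is an illustrative guess,
-- not a recommendation. plpgsql is used so the body is not inlined.
CREATE FUNCTION my_expensive_fn(x float8) RETURNS float8
AS $$ BEGIN RETURN sqrt(x); END $$
LANGUAGE plpgsql IMMUTABLE PARALLEL SAFE COST 10000;

-- Hypothetical smallish test table.
CREATE TABLE small_tbl AS
  SELECT g::float8 AS x FROM generate_series(1, 10000) AS g;
ANALYZE small_tbl;

-- Leave the parallel cost settings at their defaults.
SHOW parallel_setup_cost;   -- stock default is 1000
SHOW parallel_tuple_cost;   -- stock default is 0.1

-- Look at the plan with the expensive function in the target list.
EXPLAIN SELECT my_expensive_fn(x) FROM small_tbl;

-- Lower the declared cost and compare the plan.
ALTER FUNCTION my_expensive_fn(float8) COST 1;
EXPLAIN SELECT my_expensive_fn(x) FROM small_tbl;
```

Comparing the two EXPLAIN outputs shows whether the declared function cost alone changes the planner's choice.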
Hopefully some variant of this finds its way into core! Is there any way I can productively help?
P.