IMHO writing all the data into the WAL would be the cleanest solution.
Also, what is a small database? I don't think a static value will work, because the sweet spot between the current approach (forcing two checkpoints) and writing everything into WAL depends on the amount of dirty buffers that need to be checkpointed. That in turn is mostly driven by the size of shared buffers and the write activity: with small shared buffers and/or a mostly-read workload, checkpoints are cheap, so the "small database" threshold (below which we'd choose the WAL approach) is much higher than with large shared buffers or a write-heavy workload.
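To make that concrete, here is the sort of dynamic threshold I have in mind (a rough sketch only, with made-up names and a made-up cost model; nothing like this exists in the tree):

    #include <stdbool.h>
    #include <stdint.h>

    /*
     * Hypothetical decision for CREATE DATABASE: WAL-log the template
     * copy instead of forcing two checkpoints when the template is small
     * relative to the dirty buffers those checkpoints would have to flush.
     */
    static bool
    wal_log_create_database(uint64_t template_bytes,
                            uint64_t dirty_buffer_bytes)
    {
        /*
         * No static cut-off: a 1GB template may still be cheaper to
         * WAL-log than flushing 8GB of dirty buffers twice, while the
         * same template loses against a nearly-clean buffer pool.
         */
        return template_bytes < 2 * dirty_buffer_bytes;
    }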
So are you proposing a heuristic based on the amount of data in shared buffers and on write activity? Do you have something in mind that would work for general workloads as well?
So maybe if we could estimate the amount of data that would have to be checkpointed, and base the decision on that, it would work better? The estimate would also have to take into account that writing into WAL is sequential, while checkpoints usually cause random writes all over the data files (which is considerably more expensive).
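For illustration (numbers invented): with 4GB of dirty buffers and random writes costing, say, 10x a sequential write of the same volume, the two forced checkpoints cost roughly 2 x 4GB x 10 = 80GB sequential-equivalent, so even WAL-logging a 10GB template wins. With only 100MB dirty, the checkpoints cost about 2GB-equivalent, and the same 10GB template clearly loses.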
Another option might be forcing just a "spread" checkpoint, not an immediate one (which is what we do now). That would not fix the CREATE DATABASE duration (actually, it would make it longer), but it would lower the impact on other activity on the machine.
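If I'm reading dbcommands.c right, the minimal version of that would be dropping CHECKPOINT_IMMEDIATE from the RequestCheckpoint() call in createdb(), roughly:

    /* current behaviour: an immediate, unthrottled checkpoint */
    RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_FORCE | CHECKPOINT_WAIT);

    /*
     * spread variant: still forced and waited for, but throttled
     * according to checkpoint_completion_target
     */
    RequestCheckpoint(CHECKPOINT_FORCE | CHECKPOINT_WAIT);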
I believe this to be the cleanest way to reduce the amount of I/O generated. If I understand correctly, the original problem you mentioned was not the time CREATE DATABASE takes but rather the amount of I/O each one generates.
This also leads me to wonder whether it makes sense to explore group commits around the creation of files for new databases (within the same backend, of course). This could be on demand, if the user knows they are going to create a lot of databases in the near future and are fine with a large spike in I/O in one go. Again, it might be even more broken than the current scenario, but it depends on what the user wants...