OTOH, if we use max_wal_size as a hard limit, we can avoid such a PANIC error and the long downtime that follows. Of course, in this case, once max_wal_size is reached, no query that writes WAL can complete until the checkpoint has finished and removed old WAL files. During that time the database service looks down from a client's point of view, but the downtime is shorter than in the PANIC case. So I think some users might want a hard limit on pg_xlog size.
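To make the hard-limit behaviour concrete, here is a rough toy model of it. This is only a sketch and not PostgreSQL code: the `WalModel` class, its method names, and counting in whole segments are all assumptions made up for illustration. A write that would exceed the limit is refused (the writer would have to wait) until a checkpoint removes old segments.

```python
from dataclasses import dataclass


@dataclass
class WalModel:
    """Toy model of pg_xlog usage under a hard max_wal_size limit (hypothetical)."""
    max_wal_size: int                 # hard limit, counted in WAL segments
    used: int = 0                     # segments currently held in pg_xlog
    checkpoint_running: bool = False  # True while writers wait on a checkpoint

    def try_write(self, segments: int) -> bool:
        """Return True if the write fits; False means the writer must wait."""
        if self.used + segments > self.max_wal_size:
            # Hard limit reached: block the writer instead of filling the disk.
            self.checkpoint_running = True
            return False
        self.used += segments
        return True

    def finish_checkpoint(self, recycled: int) -> None:
        """The checkpoint completes and removes `recycled` old segments."""
        self.used = max(0, self.used - recycled)
        self.checkpoint_running = False


wal = WalModel(max_wal_size=4)
# Two writes fit, the third would exceed the limit and is refused.
results = [wal.try_write(2), wal.try_write(2), wal.try_write(1)]
# Once the checkpoint has removed old segments, the retried write succeeds.
wal.finish_checkpoint(recycled=3)
results.append(wal.try_write(1))
print(results, wal.used)
```

The refused write stands in for the window during which the service looks down to clients; it ends as soon as the checkpoint frees space, rather than requiring a restart as in the PANIC case.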
I wonder if we could tie this in with the recent proposal from the Heroku guys to have a way to slow down WAL writing. Maybe we have several limits:
I didn't see that proposal. Link? I ask because the idea of slowing down WAL writing sounds insane.