On Wed, Nov 26, 2014 at 7:13 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> If I do a pg_ctl stop -mf, then both files go away. If I do a pg_ctl stop
> -mi, then neither goes away. It is only with the /sbin/reboot that I get
> the fatal combination of _init being gone but the other still present.
Eh? That sounds wonky.
I mean, reboot normally kills processes with SIGTERM or SIGKILL, in which case I'd expect the outcome to match what you get with pg_ctl stop -mf or pg_ctl stop -mi. The only way I can see that you'd get a different behavior is if you did a hard reboot (like echo b > /proc/sysrq-trigger); if that changes things, then we might have a missing-fsync bug. How is that reboot managing to leave the main fork behind while losing the init fork?
During abort processing after getting a SIGTERM, the backend truncates 59288 to zero size and unlinks all the other files (including 59288_init). The actual removal of 59288 is left until the next checkpoint. So if you SIGTERM the backend, then take down the server uncleanly before that checkpoint completes, you are left with just 59288.
Here is the strace:
open("base/16416/59288", O_RDWR) = 8
ftruncate(8, 0) = 0
close(8) = 0
unlink("base/16416/59288.1") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_fsm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_vm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_init") = 0
unlink("base/16416/59288_init.1") = -1 ENOENT (No such file or directory)
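To make the sequence concrete, here is a rough Python sketch that replays the same file operations in a scratch directory. It is only a simulation of what the strace shows, not PostgreSQL's actual code: the function names (abort_cleanup, checkpoint_unlink, simulate) are made up for illustration, and the list of files is copied from the strace above.

```python
import os
import tempfile


def abort_cleanup(dbdir, relfilenode):
    """Replay the syscalls from the strace (hypothetical helper)."""
    main = os.path.join(dbdir, str(relfilenode))
    # Truncate the main fork to zero length, but leave the file in place.
    with open(main, "r+b") as f:
        f.truncate(0)
    # Unlink the other files; a missing file (ENOENT) is not an error.
    for suffix in (".1", "_fsm", "_vm", "_init", "_init.1"):
        try:
            os.unlink(os.path.join(dbdir, f"{relfilenode}{suffix}"))
        except FileNotFoundError:
            pass


def checkpoint_unlink(dbdir, relfilenode):
    """Stand-in for the deferred removal of the main fork at checkpoint."""
    os.unlink(os.path.join(dbdir, str(relfilenode)))


def simulate(crash_before_checkpoint):
    """Return the (name, size) pairs left on disk after the sequence."""
    with tempfile.TemporaryDirectory() as dbdir:
        # Start with a main fork and an init fork, as in the report.
        for name in ("59288", "59288_init"):
            with open(os.path.join(dbdir, name), "wb") as f:
                f.write(b"x" * 8192)
        abort_cleanup(dbdir, 59288)
        if not crash_before_checkpoint:
            checkpoint_unlink(dbdir, 59288)
        return sorted(
            (name, os.path.getsize(os.path.join(dbdir, name)))
            for name in os.listdir(dbdir)
        )


if __name__ == "__main__":
    print("unclean shutdown before checkpoint:", simulate(True))
    print("checkpoint completed:", simulate(False))
```

With the unclean shutdown, the only survivor is a zero-length 59288 with no 59288_init next to it, which matches the combination described above. If the checkpoint gets to run, nothing is left.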