Thread: Is it OK to ignore directory open failure in ResetUnloggedRelations?
While working through Michael Paquier's patch to clean up inconsistent usage of AllocateDir(), I noticed that ResetUnloggedRelations and its subroutines are not consistent about whether a directory open failure results in erroring out or just emitting a LOG message and continuing. ResetUnloggedRelations itself throws a hard error if it fails to open pg_tblspc, but all the rest of reinit.c thinks a LOG message is sufficient.

My first thought was to change ResetUnloggedRelations to match the rest, but on reflection I'm less sure about that. What we've got at the moment is that a possibly-transient directory open failure can result in failure to reset an unlogged relation to empty, which to me amounts to data corruption. If the contents of the unlogged relation are inconsistent, which is plenty likely after a crash, we could end up crashing later because of that; and in any case the user would not see what they expect in the tables.

So now I'm thinking we should do the reverse and change these functions to give a hard error on AllocateDir failure. That would result in startup-process failure if we are unable to scan the database, which is not great, but there's certainly something badly wrong if we can't.

Thoughts?

regards, tom lane
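For readers following along, the two behaviors under discussion can be sketched outside PostgreSQL with plain POSIX directory calls. This is only an illustration, not the code in reinit.c: the names open_fail_policy, scan_result and scan_directory are hypothetical, and opendir() stands in for AllocateDir(). In PostgreSQL an ERROR-level report does not return to the caller, so the SCAN_FATAL return code here is merely a stand-in for that.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical policy switch: what to do when a directory cannot be opened. */
typedef enum { ON_FAIL_LOG, ON_FAIL_ERROR } open_fail_policy;

/* Hypothetical outcome codes for one directory scan. */
typedef enum { SCAN_OK, SCAN_SKIPPED, SCAN_FATAL } scan_result;

/*
 * Count the entries of "path" (excluding "." and ".."), as a stand-in for
 * the per-directory work of resetting unlogged relations.
 *
 * On open failure:
 *   ON_FAIL_LOG   -> report and return SCAN_SKIPPED; the caller carries on,
 *                    so whatever lives in this directory is silently left
 *                    unprocessed.
 *   ON_FAIL_ERROR -> report and return SCAN_FATAL; the caller is expected
 *                    to abort startup.
 */
scan_result
scan_directory(const char *path, open_fail_policy policy, int *nentries)
{
    DIR *dir;
    struct dirent *de;

    assert(path != NULL && nentries != NULL);
    *nentries = 0;

    dir = opendir(path);
    if (dir == NULL)
    {
        fprintf(stderr, "%s: could not open directory \"%s\": %s\n",
                policy == ON_FAIL_ERROR ? "ERROR" : "LOG",
                path, strerror(errno));
        return policy == ON_FAIL_ERROR ? SCAN_FATAL : SCAN_SKIPPED;
    }

    while ((de = readdir(dir)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        (*nentries)++;
    }

    closedir(dir);
    return SCAN_OK;
}
```

With ON_FAIL_LOG a caller looping over many directories sees SCAN_SKIPPED and moves on, which is the case where a stale unlogged relation could survive a crash. With ON_FAIL_ERROR the first failure stops the whole scan.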
Hi Tom,

On 12/4/17 3:15 PM, Tom Lane wrote:
> While working through Michael Paquier's patch to clean up inconsistent
> usage of AllocateDir(), I noticed that ResetUnloggedRelations and its
> subroutines are not consistent about whether a directory open failure
> results in erroring out or just emitting a LOG message and continuing.
> ResetUnloggedRelations itself throws a hard error if it fails to open
> pg_tblspc, but all the rest of reinit.c thinks a LOG message is
> sufficient.

By a strange coincidence I spent a while today reading through this code...

> My first thought was to change ResetUnloggedRelations to match the
> rest, but on reflection I'm less sure about that. What we've got
> at the moment is that a possibly-transient directory open failure
> can result in failure to reset an unlogged relation to empty,
> which to me amounts to data corruption.

I'm wondering how this transient directory open failure is going to happen without a bunch of other things going wrong, but I agree that if it happens then corruption would be the likely result.

> If the contents of the
> unlogged relation are inconsistent, which is plenty likely after
> a crash, we could end up crashing later because of that; and in
> any case the user would not see what they expect in the tables.

Agreed.

> So now I'm thinking we should do the reverse and change these functions
> to give a hard error on AllocateDir failure. That would result in
> startup-process failure if we are unable to scan the database, which is
> not great, but there's certainly something badly wrong if we can't.

+1. If a tablespace or database directory cannot be opened then I don't think it makes any sense to continue.

Regards,
--
-David
david@pgmasters.net
On Mon, Dec 04, 2017 at 03:15:08PM -0500, Tom Lane wrote:
> While working through Michael Paquier's patch to clean up inconsistent
> usage of AllocateDir(), I noticed that ResetUnloggedRelations and its
> subroutines are not consistent about whether a directory open failure
> results in erroring out or just emitting a LOG message and continuing.
> ResetUnloggedRelations itself throws a hard error if it fails to open
> pg_tblspc, but all the rest of reinit.c thinks a LOG message is
> sufficient.
...
> So now I'm thinking we should do the reverse and change these functions
> to give a hard error on AllocateDir failure. That would result in
> startup-process failure if we are unable to scan the database, which is
> not great, but there's certainly something badly wrong if we can't.

I can offer a data point unrelated to unlogged relations. Sometimes, following a reboot, if there's a tablespace on ZFS, and if a ZPOOL backing device is missing/renamed (especially under qemu), postgres (if it was shut down cleanly) will happily start even though a tablespace is missing (because the backing device cannot be found: ZFS wants the pool to be exported and imported before it scans all devices for a matching UUID).

That has been surprising to me in the past and led me to believe that "services are up" following a reboot, only to notice a bunch of ERRORs in the logs a handful of minutes later.

Maybe that counts for a tangential +1.

Justin