Tom Lane wrote:
> "Patrick Earl" <patearl@patearl.net> writes:
>> In any case, the unit tests remove all contents and schema within the
>> database before starting, and they remove the tables they create as
>> they proceed. Certainly there are many things that have been recently
>> deleted.
>
> Yeah, I think then there's no question that the bgwriter is trying to
> fsync something that's been deleted but isn't yet closed by every
> process. We have things set up so that that's not a really serious
> problem anymore --- eventually it will be closed and then the next
> checkpoint will succeed. But CREATE DATABASE insists on checkpointing
> and so it's vulnerable to even a transient failure.
>
> I've been resisting changing the checkpoint code to treat EACCES as a
> non-error situation on Windows, but maybe we have no choice. How do
> people feel about this idea: if #ifdef WIN32 and the open or fsync fails
> with EACCES, then
>
> 1. Emit a LOG (or maybe DEBUG) message noting the problem.
> 2. Leave the fsync request entry in the hashtable for next time.
> 3. Allow the current checkpoint to complete normally anyway.
>
> If the file has actually been deleted, then eventually it will be closed
> and the next checkpoint will be able to remove the hash entry. If
> there's something else wrong, we'll keep bleating and maybe the DBA will
> notice eventually.
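
To check that I read the three steps correctly, here is a rough sketch in
C. It is not the actual checkpoint code: SyncOutcome, classify_sync_result
and the on_windows flag (standing in for #ifdef WIN32) are names made up
for illustration, and the LOG line just goes to stderr.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcome of one fsync attempt during a checkpoint. */
typedef enum
{
    SYNC_DONE,      /* fsync succeeded: the request entry can be removed */
    SYNC_RETRY,     /* tolerated failure: keep the entry for next time */
    SYNC_FAILED     /* real error: the checkpoint must fail */
} SyncOutcome;

/*
 * Classify the result of open()/fsync() for one pending request.
 * rc is the call's return value and err the errno it left behind.
 * on_windows stands in for "#ifdef WIN32" so the logic can be tried anywhere.
 */
SyncOutcome
classify_sync_result(int rc, int err, bool on_windows, const char *path)
{
    if (rc == 0)
        return SYNC_DONE;

    if (on_windows && err == EACCES)
    {
        /* Step 1: note the problem at a low message level. */
        fprintf(stderr,
                "LOG: could not fsync \"%s\": permission denied, will retry\n",
                path);
        /* Steps 2 and 3: keep the entry and let the checkpoint complete. */
        return SYNC_RETRY;
    }

    /* Anything else is still treated as a hard failure. */
    return SYNC_FAILED;
}
```

The caller would remove the hash entry only on SYNC_DONE, so a file that is
deleted but still held open keeps being retried until it is finally closed.
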
>
> The downside of this is that a real EACCES problem wouldn't get noted at
> any level higher than LOG, and so you could theoretically lose data
> without much warning. But I'm not seeing anything else we could do
> about it --- AFAIK we have not heard of a way we can distinguish this
> case from a real permissions problem. And anyway there should never
> *be* a real permissions problem; if there is then the user's been poking
> under the hood sufficient to void the warranty anyway ;-)
>
> Comments?

I find it very unlikely that, during normal operations, you would end up
in a situation where you first have permission to create files in a
directory and then lose it.

What could happen is that you have a directory where you never had
permission to create the file in the first place.

Any chance to differentiate between these? In the first case, someone
did something to change the permissions, and can be expected to actually
check that things continued to work after that. In the second case, it
would be nice if it was possible to catch it faster.
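
One way the second case might be caught faster is to probe the directory
once, up front, by creating and removing a scratch file. The sketch below
is only an assumption about how that could look: can_create_file_in and
the probe_file.tmp name are hypothetical, and nothing like this is claimed
to exist in the backend.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Hypothetical up-front probe: try to create and remove a scratch file
 * in dir. Returns true on success. On failure returns false and stores
 * the error code in *saved_errno (EACCES would point at the "never had
 * permission" case).
 */
bool
can_create_file_in(const char *dir, int *saved_errno)
{
    char    path[1024];
    FILE   *fp;
    int     n;

    /* Build "<dir>/probe_file.tmp"; the file name is made up for the sketch. */
    n = snprintf(path, sizeof(path), "%s/probe_file.tmp", dir);
    if (n < 0 || (size_t) n >= sizeof(path))
    {
        *saved_errno = ENAMETOOLONG;
        return false;
    }

    fp = fopen(path, "wb");
    if (fp == NULL)
    {
        *saved_errno = errno;
        return false;
    }

    /* Creation worked, so clean up the scratch file again. */
    fclose(fp);
    remove(path);
    *saved_errno = 0;
    return true;
}
```

If the probe succeeded earlier and EACCES shows up only later, we are in
the first case (permissions changed, or the file is deleted but still
open), and retrying at the next checkpoint seems reasonable.
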
//Magnus