Re: Deleting a table file does not raise an error when the table is touched afterwards, why? - Mailing list pgsql-general

From David G. Johnston
Subject Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
Date
Msg-id CAKFQuwY20YCud+-m2QXEn-1uqTwuYMiW8NtpezZQPA2x-9O_rQ@mail.gmail.com
In response to Re: Deleting a table file does not raise an error when the table is touched afterwards, why?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
List pgsql-general
On Mon, May 30, 2016 at 2:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Daniel Westermann <daniel.westermann@dbi-services.com> writes:
> - if the above is correct why does PostgreSQL only write a partial file back to disk/wal? For me this still seems dangerous as potentially nobody will notice it

In quiescent circumstances, Postgres wouldn't have written anything at
all, and the file would have disappeared completely at server shutdown,
and you would have gotten some sort of file-not-found error when you tried
the "count(*)" after restarting.  I hypothesize that you did an unclean
shutdown leading to replaying some amount of WAL at restart, and that WAL
included writing at least one block of the file (perhaps as a result of a
hint-bit update, or some other not-user-visible maintenance operation,
rather than anything you did explicitly).  The WAL replay code will
recreate the file if it doesn't exist on-disk --- this is important for
robustness.  Then you'd have a file that exists on-disk but is partially
filled with empty pages, which matches the observed behavior.  Depending
on various details you haven't provided, this might be indistinguishable
from a valid database state.
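
A minimal sketch of the effect described above, not PostgreSQL's actual replay code: if a missing file is recreated and a single block is written at its proper offset, every earlier block reads back as zeros. The helper names (`replay_block_write`, `count_zero_pages`) are hypothetical and the 8192-byte block size is assumed to be the default.

```python
import os

BLOCK_SIZE = 8192  # assumed: PostgreSQL's default page size


def replay_block_write(path, block_no, page):
    """Write one page at its block offset, creating the file if it is missing."""
    mode = "r+b" if os.path.exists(path) else "w+b"
    with open(path, mode) as f:
        # Seeking past end-of-file and writing leaves a zero-filled gap.
        f.seek(block_no * BLOCK_SIZE)
        f.write(page)


def count_zero_pages(path):
    """Count the pages in the file that consist entirely of zero bytes."""
    zero_page = bytes(BLOCK_SIZE)
    zero_pages = 0
    with open(path, "rb") as f:
        while True:
            page = f.read(BLOCK_SIZE)
            if not page:
                break
            if page == zero_page:
                zero_pages += 1
    return zero_pages
```

Writing only block 3 into a deleted file this way yields a four-block file whose first three blocks are empty, which is the "exists on-disk but is partially filled with empty pages" state.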


I suspect that page checksums might have detected the broken state, but only if any of the written pages were partial, since the non-overwritten zeros on a partially written page would have resulted in a different hash.
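
To make that reasoning concrete, here is a rough sketch. It assumes a verifier that accepts all-zero pages as valid and otherwise compares a stored checksum; `zlib.crc32` is only a stand-in for the real page checksum algorithm, and `page_looks_valid` is a hypothetical name.

```python
import zlib

BLOCK_SIZE = 8192  # assumed: PostgreSQL's default page size


def page_looks_valid(page, stored_checksum):
    """Accept an all-zero page outright; otherwise compare checksums."""
    if page == bytes(BLOCK_SIZE):
        # An all-zero page is a valid state, so there is nothing to verify.
        return True
    # Stand-in checksum (CRC-32), not PostgreSQL's actual algorithm.
    return zlib.crc32(page) == stored_checksum


full_page = b"\xab" * BLOCK_SIZE
checksum = zlib.crc32(full_page)

# Only the first half of the page was written; the rest is still zeros.
partial_page = full_page[: BLOCK_SIZE // 2] + bytes(BLOCK_SIZE // 2)
empty_page = bytes(BLOCK_SIZE)
```

Under these assumptions the half-written page fails verification, while the completely empty page passes and would go unnoticed.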

> - PostgreSQL assumes that someone with write access to the files knows what she/he is doing. ok, but still, in the real world cases like this happen (for whatever reason)

[ shrug... ] There's also an implied contract that you don't do "rm -rf /",
or shoot the disk drive full of holes with a .45, or various other
unrecoverable actions.  We're not really prepared to expend large amounts
of developer effort, or large amounts of runtime overhead, to detect such
cases.  (In particular, the fact that all-zero pages are a valid state is
unfortunate from this perspective, but it's more or less forced by
robustness concerns associated with table-extension behavior.  Most users
would not thank us for making table extension slower in order to issue a
more intelligible error for examples like this one.)

rant

I have to think that we can reasonably ascribe unexpected system state to causes other than human behavior.  In both of the other examples PostgreSQL would fail to start, so I'd say we have expected behavior in the face of those particular unexpected system states.

IMO too much attention is being paid to the act of recreation.  But even if we presume that the only viable way to recreate this circumstance is to do so intentionally, we've documented a clever way for someone to mess with the system in a subtle manner.

Up until Tom's last email I got very little out of the discussion.  It doesn't fill me with confidence when such an important topic is treated so glibly.  I suspect a large number of uses of PostgreSQL are in situations where, if the application works, everything is assumed to be fine.  People know that random things happen to hardware and that software can have bugs.  That is what this thread describes: a potential situation that could arise from non-human causes and that results in a somewhat silently mis-operating system.

There is still quite a bit of hand-waving here, though, and I don't know whether being more precise would really do an end user enough good to be worth writing up in the user-facing docs.  Like all areas, I'm sure this is open to improvement, but I'm sufficiently satisfied that an event of this precision is unlikely enough to warrant the present behavior.

/rant

David J.
