Thread: Deleting a table file does not raise an error when the table is touched afterwards, why?
Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Daniel Westermann
Date:
Hi,
I need to understand something: Lets assume I have a table t5 with 1'000'000 rows:
(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)
Time: 2363.834 ms
(postgres@[local]:5432) [sample] >
count
---------
1000000
(1 row)
Time: 2363.834 ms
(postgres@[local]:5432) [sample] >
I get the file for that table:
postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] oid2name -d sample -t t5
From database "sample":
Filenode Table Name
----------------------
32809 t5
From database "sample":
Filenode Table Name
----------------------
32809 t5
Then I delete the file:
postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm 32809
When doing the count(*) on the table again:
(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)
count
---------
1000000
(1 row)
No issue in the log. This is probably coming from the cache, isn't it? Is this intended and safe?
Then I restart the instance and do the select again:
2016-05-30 19:25:20.633 CEST - 9 - 2777 - - @ FATAL: could not open file "base/16422/32809": No such file or directory
2016-05-30 19:25:20.633 CEST - 10 - 2777 - - @ CONTEXT: writing block 8192 of relation base/16422/32809
2016-05-30 19:25:20.633 CEST - 10 - 2777 - - @ CONTEXT: writing block 8192 of relation base/16422/32809
(postgres@[local]:5432) [sample] > select count(*) from t5;
count
--------
437920
(1 row)
count
--------
437920
(1 row)
Can someone please tell me the intention behind that? From my point of view this is dangerous. If nobody is monitoring the log (which sadly is the case in reality) nobody will notice that only parts of the table are there. Wouldn't it be much more safe to raise an error as soon as the table is touched?
PostgreSQL version:
(postgres@[local]:5432) [sample] > select version();
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------
version | PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------
version | PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit
Thanks in advance
Daniel
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Melvin Davidson
Date:
On Mon, May 30, 2016 at 11:35 AM, Daniel Westermann <daniel.westermann@dbi-services.com> wrote:
Hi,I need to understand something: Lets assume I have a table t5 with 1'000'000 rows:(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)
Time: 2363.834 ms
(postgres@[local]:5432) [sample] >I get the file for that table:postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] oid2name -d sample -t t5
From database "sample":
Filenode Table Name
----------------------
32809 t5Then I delete the file:postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm 32809When doing the count(*) on the table again:(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)No issue in the log. This is probably coming from the cache, isn't it? Is this intended and safe?Then I restart the instance and do the select again:2016-05-30 19:25:20.633 CEST - 9 - 2777 - - @ FATAL: could not open file "base/16422/32809": No such file or directory
2016-05-30 19:25:20.633 CEST - 10 - 2777 - - @ CONTEXT: writing block 8192 of relation base/16422/32809(postgres@[local]:5432) [sample] > select count(*) from t5;
count
--------
437920
(1 row)Can someone please tell me the intention behind that? From my point of view this is dangerous. If nobody is monitoring the log (which sadly is the case in reality) nobody will notice that only parts of the table are there. Wouldn't it be much more safe to raise an error as soon as the table is touched?PostgreSQL version:(postgres@[local]:5432) [sample] > select version();
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------
version | PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bitThanks in advanceDaniel
I have not heard of pg_essentials, but obviously it is external to the PostgreSQL server.
PostgreSQL cannot tell is someone is intentionally messing with the file system. You have removed only the first file node with rm 32809.
First off, you should never do that. If you want to drop the table, then do DROP TABLE t5;
That will drop all the file nodes for that table.
You may as well ask "If I shoot myself in the head, why don't I feel any pain?".
You could also do rm -r *.* if you really want to screw the pooch. The O/S won't complain, but you will be very sorry!
You could also do rm -r *.* if you really want to screw the pooch. The O/S won't complain, but you will be very sorry!
--
Melvin Davidson
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Francisco Olarte
Date:
Hi Daniel: On Mon, May 30, 2016 at 5:35 PM, Daniel Westermann <daniel.westermann@dbi-services.com> wrote: > I get the file for that table: ... > Then I delete the file: Well, you corrupted the database and invoked undefined behaviour ( not exactly, but postgres is not designed for this ). > No issue in the log. This is probably coming from the cache, isn't it? Is > this intended and safe? It's probably not intended. It can come from the cache or it can arrive from the fact that you are running a unix flavour. In unix ( at the OS level, in the clasical filesystems ) you do not delete a file, you unlink it ( remove the pointer to it in the directory ), the file is removed by the OS when nobody can reach it, which means nobody has it open an no directory points to it ( so no one else can open it, is like reference counting ) ( In fact this behaviour is used on purpose for temporary files, you open it, unlink it and know when you exit, either normaly or crashing, the OS deletes it ). Postgres has the file open, and probably does not bother checking wether somebody removed it under from the directory, as there is no correct behaviour in this case, so no point in checking it. > Then I restart the instance and do the select again: > 2016-05-30 19:25:20.633 CEST - 9 - 2777 - - @ FATAL: could not open file > "base/16422/32809": No such file or directory As expected. > Can someone please tell me the intention behind that? From my point of view > this is dangerous. If nobody is monitoring the log (which sadly is the case > in reality) nobody will notice that only parts of the table are there. > Wouldn't it be much more safe to raise an error as soon as the table is > touched? If you are going to implement idealised behaviour, prohibiting people from deleting it would be better. Any user with minimu knwledge and enouugh privileges can put programs in states from which they cannot recover, there is not point in checking every corner case. In fact, if you can remove the file under the servers feet you can probably alter the running server memory, which would you think the correct behaviour would be for a 'poke rand(),rand()' in the server process? It could have triple redundancy copy of every page and try to vote and detect in each instruction, but is pointless. Francisco Olarte.
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Simon Riggs
Date:
On 30 May 2016 at 16:35, Daniel Westermann <daniel.westermann@dbi-services.com> wrote:
It's a very good thing that we remain flying even with multiple bullet holes in the wings.
--
...
Then I delete the file:
...
No issue in the log. This is probably coming from the cache, isn't it? Is this intended and safe?
Postgres manages your data for you. What you're doing is not a supported use case and I recommend not to do that in the future.
Can someone please tell me the intention behind that? From my point of view this is dangerous. If nobody is monitoring the log (which sadly is the case in reality) nobody will notice that only parts of the table are there. Wouldn't it be much more safe to raise an error as soon as the table is touched?
How would we know that an external agent had deleted the file? What action should we take if we did notice?
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Tom Lane
Date:
Daniel Westermann <daniel.westermann@dbi-services.com> writes: > Then I delete the file: > postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm 32809 There's a reason why the database directory is not readable/writable by unprivileged users: it's to prevent them from doing dumb things like that. People who do have write access on the database are assumed to know better. > Wouldn't it be much more safe to raise an error as soon as the table is touched? Well, yes, but it would impose huge amounts of overhead in order to raise an error a bit sooner for a stupid user action. The ideal thing would be to prevent users from breaking their database in the first place --- but there's not much we can do in that direction beyond setting the directory permissions. regards, tom lane
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Alex Ignatov
Date:
On 30.05.2016 18:35, Daniel Westermann wrote:
Hi,I need to understand something: Lets assume I have a table t5 with 1'000'000 rows:(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)
Time: 2363.834 ms
(postgres@[local]:5432) [sample] >I get the file for that table:postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] oid2name -d sample -t t5
From database "sample":
Filenode Table Name
----------------------
32809 t5Then I delete the file:postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm 32809When doing the count(*) on the table again:(postgres@[local]:5432) [sample] > select count(*) from t5;
count
---------
1000000
(1 row)No issue in the log. This is probably coming from the cache, isn't it? Is this intended and safe?Then I restart the instance and do the select again:2016-05-30 19:25:20.633 CEST - 9 - 2777 - - @ FATAL: could not open file "base/16422/32809": No such file or directory
2016-05-30 19:25:20.633 CEST - 10 - 2777 - - @ CONTEXT: writing block 8192 of relation base/16422/32809(postgres@[local]:5432) [sample] > select count(*) from t5;
count
--------
437920
(1 row)Can someone please tell me the intention behind that? From my point of view this is dangerous. If nobody is monitoring the log (which sadly is the case in reality) nobody will notice that only parts of the table are there. Wouldn't it be much more safe to raise an error as soon as the table is touched?PostgreSQL version:(postgres@[local]:5432) [sample] > select version();
-[ RECORD 1 ]----------------------------------------------------------------------------------------------------------------------------
version | PostgreSQL 9.4.4 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bitThanks in advanceDaniel
Hi if you delete file from external process that open this file this external process never ever notice it. Only after it close this file handler you fall it some issues with "file not exist" and other.
Alex Ignatov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
David W Noon
Date:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 30 May 2016 17:35:34 +0200 (CEST), Daniel Westermann (daniel.westermann@dbi-services.com) wrote about "[GENERAL] Deleting a table file does not raise an error when the table is touched afterwards, why?" (in <184509399.5590018.1464622534207.JavaMail.zimbra@dbi-services.com>): [snip] > Then I delete the file: > > postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm > 32809 Actually, you are not deleting the file. You are asking the filesystem driver to delete it when it has stopped being used. The directory entry is removed immediately though, so that no other process can open i t. > When doing the count(*) on the table again: > > (postgres@[local]:5432) [sample] > select count(*) from t5; count > --------- 1000000 (1 row) > > No issue in the log. This is probably coming from the cache, isn't > it? No, the file still exists because a PG back-end still has it open. > Is this intended and safe? It is standard UNIX behaviour. It is not safe because you are not supposed to do things that way. - -- Regards, Dave [RLU #314465] *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* david.w.noon@googlemail.com (David W Noon) *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEARECAAYFAldMcjcACgkQogYgcI4W/5T/xgCfaQBh6g0WCBRkeNOlRK4Kbc43 Gs4An0UXb+piw+BQUGJupPtN+oHJZjVH =td+i -----END PGP SIGNATURE-----
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Daniel Westermann
Date:
>>>
>>>On Mon, 30 May 2016 17:35:34 +0200 (CEST), Daniel Westermann
>>>(daniel.westermann@dbi-services.com) wrote about "[GENERAL] Deleting a
>>>table file does not raise an error when the table is touched
>>>afterwards, why?" (in
>>><184509399.5590018.1464622534207.JavaMail.zimbra@dbi-services.com>):
>>>
>>>[snip]
>>>> Then I delete the file:
>>>>
>>>> postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm
>>>> 32809
>>>
>>>Actually, you are not deleting the file. You are asking the filesystem
>>>driver to delete it when it has stopped being used. The directory
>>>entry is removed immediately though, so that no other process can open i
>>>t
>>>
>>>> When doing the count(*) on the table again:
>>>>
>>>> (postgres@[local]:5432) [sample] > select count(*) from t5; count
>>>> --------- 1000000 (1 row)
>>>>
>>>> No issue in the log. This is probably coming from the cache, isn't
>>>> it?
>>>
>>>No, the file still exists because a PG back-end still has it open.
>>>
>>>> Is this intended and safe?
>>>
>>>It is standard UNIX behaviour. It is not safe because you are not
>>>supposed to do things that way.
>>>- --
>>>Regards,
>>>
>>>Dave [RLU #314465]
>>>table file does not raise an error when the table is touched
>>>afterwards, why?" (in
>>><184509399.5590018.1464622534207.JavaMail.zimbra@dbi-services.com>):
>>>
>>>[snip]
>>>> Then I delete the file:
>>>>
>>>> postgres@pg_essentials_p1:/u02/pgdata/PG1/base/16422/ [PG1] rm
>>>> 32809
>>>
>>>Actually, you are not deleting the file. You are asking the filesystem
>>>driver to delete it when it has stopped being used. The directory
>>>entry is removed immediately though, so that no other process can open i
>>>t
>>>
>>>> When doing the count(*) on the table again:
>>>>
>>>> (postgres@[local]:5432) [sample] > select count(*) from t5; count
>>>> --------- 1000000 (1 row)
>>>>
>>>> No issue in the log. This is probably coming from the cache, isn't
>>>> it?
>>>
>>>No, the file still exists because a PG back-end still has it open.
>>>
>>>> Is this intended and safe?
>>>
>>>It is standard UNIX behaviour. It is not safe because you are not
>>>supposed to do things that way.
>>>- --
>>>Regards,
>>>
>>>Dave [RLU #314465]
Thanks all for your answers. Maybe I should have provided more background information: We had an internal workshop today and one of the topics was backup/restore. One of the questions was what will happen if (for any reason) a file gets deleted so we tested it. I am aware that this is not the common use case. But still I want to understand why PostgreSQL works the way described. From the answers I understand this:
- the file is not really deleted because PostgreSQL is still using it => correct?
- if the above is correct why does PostgreSQL only write a partial file back to disk/wal? For me this still seems dangerous as potentially nobody will notice it
- PostgreSQL assumes that someone with write access to the files knows what she/he is doing. ok, but still, in the real world cases like this happen (for whatever reason)
Simon's answer:
- It's a very good thing that we remain flying even with multiple bullet holes in the wings.
Really? It depends on how you look at that, doesn't it? I'd prefer to get an error in this case, maybe I am wrong but I prefer to be noticed if a file is missing instead of getting results
Tom's answer:
- Well, yes, but it would impose huge amounts of overhead in order to raise an error a bit sooner for a stupid user action. The ideal thing would be to prevent users from breaking their database in the first place --- but there's not much we can do in that direction beyond setting the directory permissions.
Ok, makes sense. But "a bit sooner" to what? The count(*) just returns a result. From a user perspective I have no idea that the result is wrong
Thanks again
Daniel
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
David W Noon
Date:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Mon, 30 May 2016 19:49:36 +0200 (CEST), Daniel Westermann (daniel.westermann@dbi-services.com) wrote about "Re: [GENERAL] Deleting a table file does not raise an error when the table is touched afterwards, why?" (in <1247360337.5599235.1464630576430.JavaMail.zimbra@dbi-services.com>): [snip] > Thanks all for your answers. Maybe I should have provided more > background information: We had an internal workshop today and one > of the topics was backup/restore. One of the questions was what > will happen if (for any reason) a file gets deleted so we tested > it. I am aware that this is not the common use case. But still I > want to understand why PostgreSQL works the way described. From the > answers I understand this: > > - the file is not really deleted because PostgreSQL is still using > it => correct? It is correct. > - if the above is correct why does PostgreSQL only write a partial > file back to disk/wal? PG only writes modified pages to WAL, even if another process has requested the deletion of the tablespace file. In fact, PG is not even aware of the deletion request. > For me this still seems dangerous as potentially nobody will notice > it It is intrinsically dangerous, which is why only root and postgres userids have write permissions on physical filesystems to be used by PG. Indeed, if you use an alternative security model even root will not have write permissions, only postgres. > - PostgreSQL assumes that someone with write access to the files > knows what she/he is doing. ok, but still, in the real world cases > like this happen (for whatever reason) It is very unlikely to happen once; it should never happen twice. No competent DBA would do that once, let alone twice; if a DBA does that once, he/she ceases to be a DBA. The issue you are raising here is the misuse of filesystem permissions. There is nothing PG can do here beyond what it already does: check that the permissions mask and ownership are as tight as possible during start-up of each tablespace. This reminds me of the old music hall joke: a patient goes to the doctor, raises his arm and says "Doc, it hurts when I do this!", to which the doctor replies "Well don't do that." Basically, it behoves your support staff to manage the physical filesystems correctly and not damage them. - -- Regards, Dave [RLU #314465] *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* david.w.noon@googlemail.com (David W Noon) *-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-* -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iEYEARECAAYFAldMiSMACgkQogYgcI4W/5T4UgCgzQQGvhdo+yxr7VSUPyzTJMa8 xAwAn3vPHQ4UOOhSL4kjCtl6Cq5sVeb0 =/lld -----END PGP SIGNATURE-----
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Tom Lane
Date:
Daniel Westermann <daniel.westermann@dbi-services.com> writes: > - if the above is correct why does PostgreSQL only write a partial file back to disk/wal? For me this still seems dangerousas potentially nobody will notice it In quiescent circumstances, Postgres wouldn't have written anything at all, and the file would have disappeared completely at server shutdown, and you would have gotten some sort of file-not-found error when you tried the "count(*)" after restarting. I hypothesize that you did an unclean shutdown leading to replaying some amount of WAL at restart, and that WAL included writing at least one block of the file (perhaps as a result of a hint-bit update, or some other not-user-visible maintenance operation, rather than anything you did explicitly). The WAL replay code will recreate the file if it doesn't exist on-disk --- this is important for robustness. Then you'd have a file that exists on-disk but is partially filled with empty pages, which matches the observed behavior. Depending on various details you haven't provided, this might be indistinguishable from a valid database state. > - PostgreSQL assumes that someone with write access to the files knows what she/he is doing. ok, but still, in the realworld cases like this happen (for whatever reason) [ shrug... ] There's also an implied contract that you don't do "rm -rf /", or shoot the disk drive full of holes with a .45, or various other unrecoverable actions. We're not really prepared to expend large amounts of developer effort, or large amounts of runtime overhead, to detect such cases. (In particular, the fact that all-zero pages are a valid state is unfortunate from this perspective, but it's more or less forced by robustness concerns associated with table-extension behavior. Most users would not thank us for making table extension slower in order to issue a more intelligible error for examples like this one.) regards, tom lane
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
"David G. Johnston"
Date:
Daniel Westermann <daniel.westermann@dbi-services.com> writes:
> - if the above is correct why does PostgreSQL only write a partial file back to disk/wal? For me this still seems dangerous as potentially nobody will notice it
In quiescent circumstances, Postgres wouldn't have written anything at
all, and the file would have disappeared completely at server shutdown,
and you would have gotten some sort of file-not-found error when you tried
the "count(*)" after restarting. I hypothesize that you did an unclean
shutdown leading to replaying some amount of WAL at restart, and that WAL
included writing at least one block of the file (perhaps as a result of a
hint-bit update, or some other not-user-visible maintenance operation,
rather than anything you did explicitly). The WAL replay code will
recreate the file if it doesn't exist on-disk --- this is important for
robustness. Then you'd have a file that exists on-disk but is partially
filled with empty pages, which matches the observed behavior. Depending
on various details you haven't provided, this might be indistinguishable
from a valid database state.
I suspect that page checksums might have detected the broken state, but if any of the written pages were partials since the non-overwritten-zeros on the partially written pages would have resulted in a different hash.
> - PostgreSQL assumes that someone with write access to the files knows what she/he is doing. ok, but still, in the real world cases like this happen (for whatever reason)
[ shrug... ] There's also an implied contract that you don't do "rm -rf /",
or shoot the disk drive full of holes with a .45, or various other
unrecoverable actions. We're not really prepared to expend large amounts
of developer effort, or large amounts of runtime overhead, to detect such
cases. (In particular, the fact that all-zero pages are a valid state is
unfortunate from this perspective, but it's more or less forced by
robustness concerns associated with table-extension behavior. Most users
would not thank us for making table extension slower in order to issue a
more intelligible error for examples like this one.)
rant
I have to think that we can reasonably ascribe unexpected system state to causes other than human behavior. In both of the other examples PostgreSQL would fail to start so I'd say we have expected behavior in the face of those particular unexpected system states.
IMO too much attention is being paid to the act of recreation. But even if we presume that the only viable way to recreate this circumstance is to do so intentionally we've documented a clever way for someone to mess with the system in a subtle manner.
Up until Tom's last email I got very little out of the discussion. It doesn't fill me with confidence when such an important topic is taken too glibly. I suspect a large number of uses of PostgreSQL are in situations where if the application works everything is assumed to be fine. People know that random things happen to hardware and that software can have bugs. That is what this thread describes - a potential situation that could happen due to non-human causes that results in a somewhat silently mis-operating system.
There is still quite a bit of hand-waving here though - and I don't know whether being more precise really doesn't an end-user enough good that it would be worth writing up in the user-facing docs. Like all areas I'm sure this is open to improvement but I'm sufficiently happy that the probability of an event of this precision is sufficiently unlikely to thus warrant the present behavior.
/rant
David J.
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
"David G. Johnston"
Date:
I have to think that we can reasonably ascribe unexpected system state to causes other than human behavior. In both of the other examples PostgreSQL would fail to start so I'd say we have expected behavior in the face of those particular unexpected system states.
Alex Ignatov started a new thread was started on this topic as well...
I posted a link to this thread on his new one as well.
David J.
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Daniel Westermann
Date:
>> Alex Ignatov started a new thread was started on this topic as well...
>>
>>
>>I posted a link to this thread on his new one as well.
>>
>>David J.
for completeness: same issue with data checksums enabled:
(postgres@[local]:5555) [postgres] > show data_checksums ;
data_checksums
----------------
on
(1 row)
data_checksums
----------------
on
(1 row)
Steps to reproduce:
create table t1 ( a int );
insert into t1 values ( generate_series(1,1000000));
select count(*) from t1;
=> rm the table files
=> select count(*) still works
(postgres@[local]:5555) [postgres] > select count(*) from t1;
count
---------
1000000
(1 row)
count
---------
1000000
(1 row)
=> then shutdown fast:
LOG: received fast shutdown request
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: autovacuum launcher shutting down
LOG: shutting down
FATAL: could not open file "base/13056/16384": No such file or directory
CONTEXT: writing block 2946 of relation base/13056/16384
LOG: checkpointer process (PID 3004) exited with exit code 1
LOG: terminating any other active server processes
LOG: abnormal database system shutdown
LOG: aborting any active transactions
FATAL: terminating connection due to administrator command
LOG: autovacuum launcher shutting down
LOG: shutting down
FATAL: could not open file "base/13056/16384": No such file or directory
CONTEXT: writing block 2946 of relation base/13056/16384
LOG: checkpointer process (PID 3004) exited with exit code 1
LOG: terminating any other active server processes
LOG: abnormal database system shutdown
=> startup
LOG: database system shutdown was interrupted; last known up at 2016-05-31 10:52:17 CEST
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/7070808
LOG: record with zero length at 0/76EECD0
LOG: redo done at 0/76EEC68
LOG: last completed transaction was at log time 2016-05-31 10:51:27.689776+02
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/7070808
LOG: record with zero length at 0/76EECD0
LOG: redo done at 0/76EEC68
LOG: last completed transaction was at log time 2016-05-31 10:51:27.689776+02
LOG: MultiXact member wraparound protections are now enabled
LOG: database system is ready to accept connections
LOG: autovacuum launcher started
(postgres@[local]:5555) [postgres] > select count(*) from t1;
count
--------
186450
(1 row)
count
--------
186450
(1 row)
Re: Deleting a table file does not raise an error when the table is touched afterwards, why?
From
Francisco Olarte
Date:
Hi: On Tue, May 31, 2016 at 8:58 AM, Daniel Westermann <daniel.westermann@dbi-services.com> wrote: > for completeness: same issue with data checksums enabled: ... > => rm the table files > => select count(*) still works And ? I would vote for not doing anything on this, after all you are working outside the 'envelope' on this. Is like saying 'I commit with fsync enabled, but then I zero the disk and the data is lost'. As I said is like undefined behaviour in C. Currently *0 tends to SIGSEGV on both reads and writes, but I've worked in OSs where it did only on writes, and where it worked fine ( like CP/M ). For you the correct behaviour maybe to fail fast and loose a little speed, for others the current one may be OK. Normally for this things you go for the path with less code, as deleted code is bug free. Francisco Olarte.