Thread: Unused files in the database directory after crashed VACUUM FULL
Hi,

I've just had a "VACUUM FULL <table>" crash due to 100% disk usage. Clearly my fault; I was expecting the new table to be small enough.

After freeing up space, restarting the cluster and issuing another VACUUM FULL, I noticed that the cluster was way bigger than it should be. In the base/<db>/ folder, there was a large number of files sharing one particular number that pg_filenode_relation() could not turn into a relation. As that number was just a bit smaller than the pg_relation_filepath() of the table I was working on, I guess these were the remains of the failed VACUUM FULL operation?

I removed those files, and a VACUUM ANALYZE of the whole database went fine.

So... is this the expected behaviour? Is there a safe procedure to check for and clean up "unnecessary" files in the cluster directory?

Version:
PostgreSQL 10.3 (Debian 10.3-1.pgdg90+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, 64-bit

Table definition:
(
  id        serial PRIMARY KEY,
  data      bytea,
  parent_id integer NOT NULL
)

The "data" column values are large enough to be TOASTed.

Thanks & best regards,

	-hannes
Hannes Erven <hannes@erven.at> writes:
> I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
> Clearly my fault; I was expecting the new table to be small enough.

What do you mean by "crash" exactly?  A normal transactional failure
should've cleaned up orphaned files.  I suppose if the kernel decided to
kill -9 the vacuum process, that wouldn't happen --- but that's not
the usual response to out-of-disk-space.

> After freeing up space, restarting the cluster and issuing another
> VACUUM FULL, I noticed that the cluster was way bigger than it should be.
> In the base/<db>/ folder, there was a large number of files sharing one
> particular number that pg_filenode_relation() could not turn into a
> relation. As that number was just a bit smaller than the
> pg_relation_filepath() of the table I was working on, I guess these were
> the remains of the failed VACUUM FULL operation?

> I removed those files, and a VACUUM ANALYZE of the whole database went fine.

> So... is this the expected behaviour? Is there a safe procedure to
> check for and clean up "unnecessary" files in the cluster directory?

You just described it --- verify with pg_filenode_relation that the
file doesn't correspond to any pg_class entry, then manually delete.

(There is a crash-recovery phase that does something similar, but
I think it's only designed to clean up temporary files.)

			regards, tom lane
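[Editor's note] The manual check Tom describes has two halves: spot candidate relation files on disk, then confirm via pg_filenode_relation() that no pg_class entry maps to them. A minimal sketch of the filename-parsing half, in Python; this is illustrative helper code, not anything shipped with PostgreSQL, and the fork/segment naming it encodes follows the documented data-directory layout:

```python
import re

# Main-fork data files are named <filenode>, with an optional 1 GB segment
# suffix ".<n>"; the free-space map, visibility map and init forks append
# "_fsm", "_vm" or "_init"; temporary relations get a "t<backendid>_" prefix.
FILENODE_RE = re.compile(r'^(t\d+_)?(\d+)(_fsm|_vm|_init)?(\.\d+)?$')

def parse_filenode(name):
    """Return the filenode number for a relation data file name, else None."""
    m = FILENODE_RE.match(name)
    return int(m.group(2)) if m else None

def orphan_candidates(filenames, known_filenodes):
    """Yield files whose filenode is not in the set of known relfilenodes."""
    for name in filenames:
        node = parse_filenode(name)
        if node is not None and node not in known_filenodes:
            yield name
```

In a real check, known_filenodes would come from the catalogs (or from per-file calls to pg_filenode_relation(), as Tom suggests); here it is just a plain set so the sketch stays self-contained. Files like PG_VERSION or pg_internal.init simply fail the regex and are ignored.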
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 > Hannes Erven <hannes@erven.at> writes:
 >> I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
 >> Clearly my fault; I was expecting the new table to be small enough.

 Tom> What do you mean by "crash" exactly?  A normal transactional
 Tom> failure should've cleaned up orphaned files.  I suppose if the
 Tom> kernel decided to kill -9 the vacuum process, that wouldn't happen
 Tom> --- but that's not the usual response to out-of-disk-space.

The failure mode of this kind we see reported most often is getting a
PANIC failure from inability to extend the WAL.

-- 
Andrew (irc:RhodiumToad)
Hi again,

On 10.02.19 at 16:41, Tom Lane wrote:
> Hannes Erven <hannes@erven.at> writes:
>> I've just had a "VACUUM FULL <table>" crash due to 100% disk usage.
>> Clearly my fault; I was expecting the new table to be small enough.
>
> What do you mean by "crash" exactly?  A normal transactional failure
> should've cleaned up orphaned files.  I suppose if the kernel decided to
> kill -9 the vacuum process, that wouldn't happen --- but that's not
> the usual response to out-of-disk-space.

Here's the exact log (the pgadmin process was running the VACUUM FULL):

2019-02-09 23:44:53.516 CET [20341] @/ <> PANIC:  could not write to file "pg_wal/xlogtemp.20341": No space left on device
2019-02-09 23:44:53.516 CET [19181] @/ <> LOG:  WAL writer process (PID 20341) was terminated by signal 6: Aborted
2019-02-09 23:44:53.516 CET [19181] @/ <> LOG:  terminating any other active server processes
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db <pgAdmin 4 - DB:db> WARNING:  terminating connection because of crash of another server process
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db <pgAdmin 4 - DB:db> DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2019-02-09 23:44:53.516 CET [20816] postgres@IP/db <pgAdmin 4 - DB:db> HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2019-02-09 23:44:53.597 CET [19181] @/ <> LOG:  all server processes terminated; reinitializing
2019-02-09 23:44:53.679 CET [11686] @/ <> LOG:  database system was interrupted; last known up at 2019-02-09 23:44:40 CET
2019-02-09 23:44:58.761 CET [11686] @/ <> LOG:  database system was not properly shut down; automatic recovery in progress
2019-02-09 23:44:58.807 CET [11686] @/ <> LOG:  redo starts at 1DA/B3400798
2019-02-09 23:44:58.811 CET [11686] @/ <> FATAL:  could not extend file "base/16400/612716_fsm": No space left on device
2019-02-09 23:44:58.811 CET [11686] @/ <> HINT:  Check free disk space.
2019-02-09 23:44:58.811 CET [11686] @/ <> CONTEXT:  WAL redo at 1DA/B3406310 for Heap/INSERT: off 4

After freeing space and restarting:

2019-02-10 06:28:25.622 CET [14344] @/ <> LOG:  database system was interrupted while in recovery at 2019-02-09 23:44:58 CET
2019-02-10 06:28:25.622 CET [14344] @/ <> HINT:  This probably means that some data is corrupted and you will have to use the last backup for recovery.
2019-02-10 06:28:26.762 CET [14344] @/ <> LOG:  database system was not properly shut down; automatic recovery in progress
2019-02-10 06:28:26.831 CET [14344] @/ <> LOG:  redo starts at 1DA/B3400798
2019-02-10 06:28:27.624 CET [14344] @/ <> LOG:  redo done at 1DA/D5FFF9D8

>> Is there a safe procedure to
>> check for and clean up "unnecessary" files in the cluster directory?
>
> You just described it --- verify with pg_filenode_relation that the
> file doesn't correspond to any pg_class entry, then manually delete.

OK, I see.
I came up with this:

with d0 as (
  select oid from pg_database where datname = current_database()
), d1 as (
  select pg_ls_dir as fn,
         regexp_match(pg_ls_dir, '^([0-9]+)(.*)$') as match
    from d0, pg_ls_dir('base/' || d0.oid)
   order by 1
), d2 as (
  select fn, match[1] as base, match[2] as ext
    from d1
   where fn NOT ILIKE 'pg_%'
), d3 as (
  select d.*, pg_filenode_relation(0, base::oid) as relation
    from d2 d
)
select fn,
       pg_size_pretty((pg_stat_file('base/' || d0.oid || '/' || fn)).size)
  from d0, d3
 where relation is null;

     fn     | pg_size_pretty
------------+----------------
 612713     | 25 MB
 612718     | 1024 MB
 612718.1   | 1024 MB
 612718.2   | 180 MB
 t11_621351 | 96 kB
 t11_621354 | 0 bytes
 t11_621356 | 8192 bytes
 t11_621357 | 160 kB
 t11_621360 | 0 bytes
 t11_621362 | 8192 bytes

So there's still 612713 and 612718* left to clean up; judging from the numbers and timestamps, they probably belong to other VACUUM FULLs I manually canceled in the pgadmin UI during my experiments.

Is this interesting enough to investigate further? If yes, I'll be happy to work on a test case.

In any case, I've now learned how to check for orphaned files if a database cluster's on-disk size is way off from the sum of pg_total_relation_size() + WAL...

Thanks again & best regards,

	-hannes
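[Editor's note] Hannes's query can be cross-checked from the OS side. Below is a hypothetical Python analogue of the grouping step, not code from the thread: it takes (filename, size) pairs, as one might collect with os.scandir() over base/<db_oid>, and totals the space held by each orphaned base filenode across its 1 GB segments:

```python
import re
from collections import defaultdict

# A relation larger than 1 GB spans several segment files:
# "612718", "612718.1", "612718.2", ...  Group them by base filenode.
SEGMENT_RE = re.compile(r'^([0-9]+)(\.[0-9]+)?$')

def orphan_sizes(entries, valid_filenodes):
    """Total on-disk bytes per base filenode that no relation claims.

    entries: iterable of (filename, size_in_bytes) pairs;
    valid_filenodes: filenodes that do map to a pg_class entry
    (in the real check, determined via pg_filenode_relation()).
    """
    totals = defaultdict(int)
    for name, size in entries:
        m = SEGMENT_RE.match(name)
        if m and int(m.group(1)) not in valid_filenodes:
            totals[int(m.group(1))] += size
    return dict(totals)
```

Unlike the SQL above, this sketch silently skips names that don't look like main-fork segments (temp files, PG_VERSION, and so on); the SQL version surfaces them because regexp_match() returns NULL for them, which propagates into a NULL relation.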
Hannes Erven <hannes@erven.at> writes:
> On 10.02.19 at 16:41, Tom Lane wrote:
>> What do you mean by "crash" exactly?

> Here's the exact log (the pgadmin process was running the VACUUM FULL):
> 2019-02-09 23:44:53.516 CET [20341] @/ <> PANIC:  could not write to
> file "pg_wal/xlogtemp.20341": No space left on device

Ah, so Andrew was correct: we panicked due to lack of WAL space, and
that explains why the vacuuming process didn't have an opportunity
to delete the files belonging to the uncommitted new relation.

If you're concerned about this sort of thing, you might consider
putting the WAL directory on a separate filesystem from the main
database files.

>>> Is there a safe procedure to
>>> check for and clean up "unnecessary" files in the cluster directory?

>> You just described it --- verify with pg_filenode_relation that the
>> file doesn't correspond to any pg_class entry, then manually delete.

> OK, I see.  I came up with this:

I think this isn't accounting for files in non-default tablespaces,
but maybe that's not a problem for your usage.

> Is this interesting enough to investigate further?

It's a pretty well-understood dynamic, I believe.  Perhaps we should
try harder to recover cleanly, but I don't know of anyone putting
effort into the case.

			regards, tom lane
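[Editor's note] Tom's tablespace caveat can be folded into the same scan. Besides base/<db_oid>, each entry under pg_tblspc/ is a symlink to a tablespace directory containing a version subdirectory (e.g. PG_10_201707211) with one subdirectory per database. A sketch of the directory enumeration, assuming that standard layout; the helper itself is illustrative, not a PostgreSQL tool:

```python
import os

def relation_dirs(datadir, db_oid):
    """Yield every directory that can hold data files for database db_oid:
    base/<db_oid> plus the matching directory in each tablespace."""
    yield os.path.join(datadir, "base", str(db_oid))
    tblspc = os.path.join(datadir, "pg_tblspc")
    if not os.path.isdir(tblspc):
        return
    for ts in sorted(os.listdir(tblspc)):      # one symlink per tablespace OID
        ts_path = os.path.join(tblspc, ts)
        for ver in sorted(os.listdir(ts_path)):  # e.g. "PG_10_201707211"
            if ver.startswith("PG_"):
                candidate = os.path.join(ts_path, ver, str(db_oid))
                if os.path.isdir(candidate):
                    yield candidate
```

Each yielded directory could then be fed to the same orphan check used for base/<db_oid>, passing the matching tablespace OID (instead of 0) to pg_filenode_relation() when verifying.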
On Mon, Feb 11, 2019 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ah, so Andrew was correct: we panicked due to lack of WAL space, and
> that explains why the vacuuming process didn't have an opportunity
> to delete the files belonging to the uncommitted new relation.

> It's a pretty well-understood dynamic, I believe.  Perhaps we should
> try harder to recover cleanly, but I don't know of anyone putting
> effort into the case.

FTR I am working on a PG13 patch that records the relfilenodes of
uncommitted transactions in undo logs, so it can unlink them reliably,
even if you crash (at the cost of introducing a WAL flush before
creating files).  I haven't specifically studied the VACUUM FULL case
yet, but in principle this is exactly what my project aims to fix.

It's mostly intended as example code to demonstrate the undo log
machinery (defining undo record types, registering undo log action
functions that are invoked during rollback, rollback of aborted but
not yet rolled back transactions at startup, ...) without having to
understand the whole zheap sandwich at once, but it's also a solution
to an age-old problem.  More on that soon.

-- 
Thomas Munro
http://www.enterprisedb.com