Thread: How to deal with dangling files after aborted `pg_restore`?
The scenario:
1. There is a postgresql 17 server running
2. Restore dump with `--single-transaction` flag
3. For whatever reason the server goes away (eg: we kill the process)
4. Now `base` directory is filled with abandoned table files which postgresql know nothing about
Playground:
Terminal 1:
Start server: `docker run --rm -it -e POSTGRES_HOST_AUTH_METHOD=trust -v $PWD/postgres:/var/lib/postgresql/data postgres:17.2`
Terminal 2:
1. Start container with recent pg_restore: `docker run --rm -it -v $PWD:/app -w /app postgres:17.2 bash`
2. Start dump recovery: `pg_restore -h 172.17.0.2 -U postgres -d postgres --single-transaction -v -Fc --no-owner dump.sql`
Terminal 3:
1. Find what container is the server: `docker ps`
2. Kill it: `docker kill d7ecf6e66c1d`
Terminal 1:
Start the server again, with the same command
Terminal 3:
Check there are abandoned large files:
```
# ls -la /home/ivan/postgres/base/5
<truncated>
-rw------- 1 systemd-coredump systemd-coredump 342884352 Dec 17 18:58 16399
-rw------- 1 systemd-coredump systemd-coredump 11149312 Dec 17 18:58 16404
-rw------- 1 systemd-coredump systemd-coredump 188416 Dec 17 18:58 16403_fsm
-rw------- 1 systemd-coredump systemd-coredump 686145536 Dec 17 18:58 16403
-rw------- 1 systemd-coredump systemd-coredump 11149312 Dec 17 18:58 16404
-rw------- 1 systemd-coredump systemd-coredump 188416 Dec 17 18:58 16403_fsm
-rw------- 1 systemd-coredump systemd-coredump 686145536 Dec 17 18:58 16403
```
Terminal 2:
1. Confirm those OIDs are not accounted:
```
psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16404
psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16403
psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16399
```
Hence a question: am I doing something wrong? Is it expected behaviour? If so - how would one recover from this scenario now WITHOUT dropping entire database?
With best regards, Ivan Kurnosov
Genuine question: Why are you looking to recover from this half-cooked state instead of restarting the restore process from the beginning?
On Tue, Dec 17, 2024, 1:10 a.m. Ivan Kurnosov <zerkms@zerkms.com> wrote:
The scenario:1. There is a postgresql 17 server running2. Restore dump with `--single-transaction` flag3. For whatever reason the server goes away (eg: we kill the process)4. Now `base` directory is filled with abandoned table files which postgresql know nothing aboutPlayground:Terminal 1:Start server: `docker run --rm -it -e POSTGRES_HOST_AUTH_METHOD=trust -v $PWD/postgres:/var/lib/postgresql/data postgres:17.2`Terminal 2:1. Start container with recent pg_restore: `docker run --rm -it -v $PWD:/app -w /app postgres:17.2 bash`2. Start dump recovery: `pg_restore -h 172.17.0.2 -U postgres -d postgres --single-transaction -v -Fc --no-owner dump.sql`Terminal 3:1. Find what container is the server: `docker ps`2. Kill it: `docker kill d7ecf6e66c1d`Terminal 1:Start the server again, with the same commandTerminal 3:Check there are abandoned large files:```# ls -la /home/ivan/postgres/base/5<truncated>-rw------- 1 systemd-coredump systemd-coredump 342884352 Dec 17 18:58 16399
-rw------- 1 systemd-coredump systemd-coredump 11149312 Dec 17 18:58 16404
-rw------- 1 systemd-coredump systemd-coredump 188416 Dec 17 18:58 16403_fsm
-rw------- 1 systemd-coredump systemd-coredump 686145536 Dec 17 18:58 16403```Terminal 2:1. Confirm those OIDs are not accounted:```psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16404psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16403psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16399```Hence a question: am I doing something wrong? Is it expected behaviour? If so - how would one recover from this scenario now WITHOUT dropping entire database?--With best regards, Ivan Kurnosov
It could be that he only restored some tables in an existing database.
On Wed, Dec 18, 2024 at 7:02 PM Saul Perdomo <saul.perdomo@gmail.com> wrote:
Genuine question: Why are you looking to recover from this half-cooked state instead of restarting the restore process from the beginning?
On Tue, Dec 17, 2024, 1:10 a.m. Ivan Kurnosov <zerkms@zerkms.com> wrote:The scenario:1. There is a postgresql 17 server running2. Restore dump with `--single-transaction` flag3. For whatever reason the server goes away (eg: we kill the process)4. Now `base` directory is filled with abandoned table files which postgresql know nothing aboutPlayground:Terminal 1:Start server: `docker run --rm -it -e POSTGRES_HOST_AUTH_METHOD=trust -v $PWD/postgres:/var/lib/postgresql/data postgres:17.2`Terminal 2:1. Start container with recent pg_restore: `docker run --rm -it -v $PWD:/app -w /app postgres:17.2 bash`2. Start dump recovery: `pg_restore -h 172.17.0.2 -U postgres -d postgres --single-transaction -v -Fc --no-owner dump.sql`Terminal 3:1. Find what container is the server: `docker ps`2. Kill it: `docker kill d7ecf6e66c1d`Terminal 1:Start the server again, with the same commandTerminal 3:Check there are abandoned large files:```# ls -la /home/ivan/postgres/base/5<truncated>-rw------- 1 systemd-coredump systemd-coredump 342884352 Dec 17 18:58 16399
-rw------- 1 systemd-coredump systemd-coredump 11149312 Dec 17 18:58 16404
-rw------- 1 systemd-coredump systemd-coredump 188416 Dec 17 18:58 16403_fsm
-rw------- 1 systemd-coredump systemd-coredump 686145536 Dec 17 18:58 16403```Terminal 2:1. Confirm those OIDs are not accounted:```psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16404psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16403psql -h 172.17.0.2 -U postgres -c 'select * from pg_class'|grep 16399```Hence a question: am I doing something wrong? Is it expected behaviour? If so - how would one recover from this scenario now WITHOUT dropping entire database?--With best regards, Ivan Kurnosov
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!