Thread: Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
David Zhang
Date:
I verified the patch "v2-0001-Free-disk-space-for-dropped-relations-on-commit.patch" on master branch "0cc99327888840f2bf572303b68438e4caf62de9".It works for me. Below is my test procedure and results. === Before the patch === #1 from psql console 1, create table and index then insert enough data postgres=# CREATE TABLE test_tbl ( a int, b text); postgres=# CREATE INDEX idx_test_tbl on test_tbl (a); postgres=# INSERT INTO test_tbl SELECT generate_series(1,80000000),'Hello world!'; postgres=# INSERT INTO test_tbl SELECT generate_series(1,80000000),'Hello world!'; #2 check files size david:12867$ du -h 12G . #3 from psql console 2, drop the index postgres=# drop index idx_test_tbl; #4 check files size in different ways, david:12867$ du -h 7.8G . david:12867$ ls -l ... -rw------- 1 david david 0 Nov 23 20:07 16402 ... $ lsof -nP | grep '(deleted)' |grep pgdata ... postgres 25736 david 45u REG 259,2 0 12592758 /home/david/sandbox/postgres/pgdata/base/12867/16402(deleted) postgres 25736 david 49u REG 259,2 1073741824 12592798 /home/david/sandbox/postgres/pgdata/base/12867/16402.1(deleted) postgres 25736 david 53u REG 259,2 1073741824 12592739 /home/david/sandbox/postgres/pgdata/base/12867/16402.2(deleted) postgres 25736 david 59u REG 259,2 372604928 12592800 /home/david/sandbox/postgres/pgdata/base/12867/16402.3(deleted) ... The index relnode id "16402" displays size "0" from postgres database folder, but when using lsof to check, all 16402.x arestill in used by a psql connection except 16402 is set to 0. Check it again after an hour, lsof shows the same results. === After the patch === Repeat step 1 ~ 4, lsof shows all the index relnode files (in this case, the index relnode id 16389) are removed within about1 minute. $ lsof -nP | grep '(deleted)' |grep pgdata ... postgres 32707 david 66u REG 259,2 0 12592763 /home/david/sandbox/postgres/pgdata/base/12867/16389.1(deleted) postgres 32707 david 70u REG 259,2 0 12592823 /home/david/sandbox/postgres/pgdata/base/12867/16389.2(deleted) postgres 32707 david 74u REG 259,2 0 12592805 /home/david/sandbox/postgres/pgdata/base/12867/16389.3(deleted) ... One thing interesting for me is that, if the index is created after data records has been inserted, then lsof doesn't showthis issue.
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Pavel Borisov
Date:
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: not tested Given we got two other reviews from Neil and David, I think I can finalize my own review and mark the patch as ready forcommitter if nobody has objections. Thank you! Pavel Borisov The new status of this patch is: Ready for Committer
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Thomas Munro
Date:
On Wed, Nov 25, 2020 at 8:00 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote: > The new status of this patch is: Ready for Committer Thanks! One small thing bothered me about the last version of the patch. It tried to unlink when ENOENT had already been enountered by open(2), so COMMIT of a DROP looks like: openat(AT_FDCWD, "base/14208/16384", O_RDWR) = 54 ftruncate(54, 0) = 0 close(54) = 0 openat(AT_FDCWD, "base/14208/16384.1", O_RDWR) = -1 ENOENT unlink("base/14208/16384.1") = -1 ENOENT openat(AT_FDCWD, "base/14208/16384_fsm", O_RDWR) = -1 ENOENT unlink("base/14208/16384_fsm") = -1 ENOENT openat(AT_FDCWD, "base/14208/16384_vm", O_RDWR) = -1 ENOENT unlink("base/14208/16384_vm") = -1 ENOENT openat(AT_FDCWD, "base/14208/16384_init", O_RDWR) = -1 ENOENT unlink("base/14208/16384_init") = -1 ENOENT So I fixed that, by adding a return value to do_truncate() and checking it. That's the version I plan to commit tomorrow, unless there are further comments or objections. I've also attached a version suitable for REL_11_STABLE and earlier branches (with a name that cfbot should ignore), where things are slightly different. In those branches, the register_forget_request() logic is elsewhere. While looking at trace output, I figured we should just use truncate(2) on non-Windows, on the master branch only. It's not like it really makes much difference, but I don't see why we shouldn't allow ourselves to use ancient standardised Unix syscalls when we can. That'd get us down to just the following when committing a DROP: truncate("base/14208/16396", 0) = 0 truncate("base/14208/16396.1", 0) = -1 ENOENT truncate("base/14208/16396_fsm", 0) = -1 ENOENT truncate("base/14208/16396_vm", 0) = -1 ENOENT truncate("base/14208/16396_init", 0) = -1 ENOENT
Attachment
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Thomas Munro
Date:
On Mon, Nov 30, 2020 at 6:59 PM Thomas Munro <thomas.munro@gmail.com> wrote: > On Wed, Nov 25, 2020 at 8:00 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote: > > The new status of this patch is: Ready for Committer > ... That's the version I plan to commit tomorrow, unless > there are further comments or objections. ... Done, and back-patched. I thought a bit more about the fact that we fail to unlink higher-numbered segments in certain error cases, potentially leaving stray files behind. As far as I can see, nothing we do in this code-path is going to be a bullet-proof solution to that problem. One simple idea would be for the checkpointer to refuse to unlink segment 0 (thereby allowing the relfilenode to be recycled) until it has scanned the parent directory for any related files that shouldn't be there. > While looking at trace output, I figured we should just use > truncate(2) on non-Windows, on the master branch only. It's not like > it really makes much difference, but I don't see why we shouldn't > allow ourselves to use ancient standardised Unix syscalls when we can. Also pushed, but only to master.
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Michael Paquier
Date:
On Mon, Nov 30, 2020 at 06:59:40PM +1300, Thomas Munro wrote: > So I fixed that, by adding a return value to do_truncate() and > checking it. That's the version I plan to commit tomorrow, unless > there are further comments or objections. I've also attached a > version suitable for REL_11_STABLE and earlier branches (with a name > that cfbot should ignore), where things are slightly different. In > those branches, the register_forget_request() logic is elsewhere. Hmm. Sorry for arriving late at the party. But is that really something suitable for a backpatch? Sure, it is not optimal to not truncate all the segments when a transaction dropping a relation commits, but this was not completely broken either. -- Michael
Attachment
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Thomas Munro
Date:
On Tue, Dec 1, 2020 at 3:55 PM Michael Paquier <michael@paquier.xyz> wrote: > On Mon, Nov 30, 2020 at 06:59:40PM +1300, Thomas Munro wrote: > > So I fixed that, by adding a return value to do_truncate() and > > checking it. That's the version I plan to commit tomorrow, unless > > there are further comments or objections. I've also attached a > > version suitable for REL_11_STABLE and earlier branches (with a name > > that cfbot should ignore), where things are slightly different. In > > those branches, the register_forget_request() logic is elsewhere. > > Hmm. Sorry for arriving late at the party. But is that really > something suitable for a backpatch? Sure, it is not optimal to not > truncate all the segments when a transaction dropping a relation > commits, but this was not completely broken either. I felt on balance it was a "bug", since it causes operational difficulties for people and was clearly not our intended behaviour, and I announced this intention 6 weeks ago. Of course I'll be happy to revert it from the back-branches if that's the consensus. Any other opinions?
Re: BUG #16663: DROP INDEX did not free up disk space: idle connection hold file marked as deleted
From
Michael Paquier
Date:
On Tue, Dec 01, 2020 at 04:06:48PM +1300, Thomas Munro wrote: > I felt on balance it was a "bug", since it causes operational > difficulties for people and was clearly not our intended behaviour, > and I announced this intention 6 weeks ago. Oops, sorry for missing this discussion for such a long time :/ > Of course I'll be happy to revert it from the back-branches if > that's the consensus. Any > other opinions? If there are no other opinions, I am also fine to rely on your judgment. -- Michael