Home > mailing lists

Re: BUG #17449: Disk space not released - Mailing list pgsql-bugs

From	Giorgio Saviane
Subject	Re: BUG #17449: Disk space not released
Date	May 19, 2022 20:15:26
Msg-id	CAHs6c0c9j5P+kBTX67KbJ5_e4wtdrAumyRk-k3vOnJRYUmR0YA@mail.gmail.com Whole thread Raw
In response to	Re: BUG #17449: Disk space not released (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses	Re: BUG #17449: Disk space not released
List	pgsql-bugs

Tree view

On Sat, Mar 26, 2022 at 7:13 AM PG Bug reporting form
<noreply@postgresql.org> wrote:
>
> The following bug has been logged on the website:
>
> Bug reference: 17449
> Logged by: Giorgio Saviane
> Email address: gsaviane@gmail.com
> PostgreSQL version: 11.13
> Operating system: Linux 5.8
> Description:
>
> Hello, I noticed an uncontrolled disk occupation growth caused by a Postgres
> database in one of my deployments.
> The actual database size took more than 500Gb (checked with select
> pg_size_pretty(pg_database_size('dbname')) although tables accounted for a
> total of ~ 50Gb (checked with pg_total_relation_size()).

So, the output of pg_database_size is 500GB and pg_total_relation_size
for all the tables is 50GB right? pg_database_size includes table
space size as well, were there any table spaces and files under it?
BTW, pg_total_relation_size also includes all the files of a relation
irrespective of whether the vacuum happened or not.

The database was created in a single tablespace. Tables did not hold any other

additional files than those inherently used for storing their data and indexes.

> Despite any attempt
> of full vacuum the discrepancy remained the same. I suspect that Postgres
> started leaking disk space. I could see many 1Gb files with a timestamp of
> two months back in time in the postgres data folder.
> Restarting the server did not have any effect, so I decided to pg_dump the
> database and pg_restore the backup in a new instance. That worked, the new
> database is now ~ 50 Gb and dropping the old one released that 500Gb of disk
> space.
> The database was under streaming replication and I noticed the postgres log
> reporting many of these messages
>
> requested WAL segment 0000000100000000000000E3 has already been removed
>
> Could be that the leak started in correspondence of that error?
> If so, is there anything we can do to prevent it? I already set
> wal_keep_segments = 100, but I'm not sure it is enough and how to tune it.
> Is there any possible workaround to release the leaked space without going
> through a backup? (It took two hours)

Which all other directories in the data directory took more space? Is
it pg_wal or base or some other? Were there any inactive replication
slots? Was the archiving happening properly?

The content of /var/lib/postoresql/11/main/base dir was as following:

$ du -hs
7.6M 1
78M 1194171
7.6M 1236634587
7.6M 13052
7.6M 13053
561G 16385
20M pgsql_tmp

And that 16385 dir was full of 1Gb files whose name had a <number>.<digit>

pattern. I cannot say for sure if there were other subdirectories in it.

I didn't check the replication status unfortunately. I just removed it and recreated

it over the new instance.

Kind regards

pgsql-bugs by date:

From: PG Bug reporting form
Date: 19 May 2022, 18:46:41
Subject: BUG #17487: Parallel execution fails when original user is removed

From: Tom Lane
Date: 19 May 2022, 20:47:56
Subject: Re: BUG #17487: Parallel execution fails when original user is removed

Re: BUG #17449: Disk space not released - Mailing list pgsql-bugs

Previous

Next