Hello Hackers,
There is some strange behavior we're experiencing with one of the customer's DBs (8.4)
We've noticed that free disk space went down heavily on a system, and after a short analysis determined that the reason
wasthat postmaster was holding lots of unlinked files open. A sample of lsof output was something like this:
postmaste 15484 postgres 57u REG 253,0 1073741824 41125093 /srv/pgsql/data/base/352483309/2613.2
(deleted)
postmaste 15484 postgres 58u REG 253,0 1073741824 41125094 /srv/pgsql/data/base/352483309/2613.3
(deleted)
postmaste 15484 postgres 59u REG 253,0 1073741824 41125095 /srv/pgsql/data/base/352483309/2613.4
(deleted)
There were about 450 such (or similar) files, all of them having /2613 in the filename. Since 2613 is a regclass of
pg_largeobjectand we are indeed working with quite a few large objects in that DB so this is where our problem lies we
suspect.
Restarting PostgreSQL obviously helps the issue and the disk space occupied by those unlinked files (about 63GB
actually)is reclaimed.
So what happens on that host is that we drop/restore a fresh version of the DB from the production host, followed by a
migrationscript which among other things loads around 16GB of data files as large objects. This happens nightly.
But if we go and run the whole drop/restore and migration manually, the problem doesn't manifest itself right after
migrationis successfully finished.
Our next thought was that it must be dropdb part of the nightly script that removes the pg_largeobject's files (still
wedon't know what makes it keep them opened,) but dropping the DB doesn't manifest the problem either.
I'm currently running a VACUUM pg_largeobject on the problematic DB, to see if it triggers the problem, but this didn't
happenso far.
At this point it would be nice to hear what are your thoughts. What could cause such behavior?
--
Regards,
Alex