Re: fd.c doesn't remove files on a crash-restart - Mailing list pgsql-hackers

From Andres Freund
Subject Re: fd.c doesn't remove files on a crash-restart
Date
Msg-id 20160316175629.7i5uf7elql4aynie@alap3.anarazel.de
In response to fd.c doesn't remove files on a crash-restart  ("Joshua D. Drake" <jd@commandprompt.com>)
Responses Re: fd.c doesn't remove files on a crash-restart
List pgsql-hackers
On 2016-03-16 10:53:42 -0700, Joshua D. Drake wrote:
> Hello,
> 
> fd.c[1] will remove files from pgsql_tmp on a restart but not a
> crash-restart per this comment:
> 
> /*
> * NOTE: we could, but don't, call this during a post-backend-crash restart
> * cycle.  The argument for not doing it is that someone might want to examine
> * the temp files for debugging purposes.  This does however mean that
> * OpenTemporaryFile had better allow for collision with an existing temp
> * file name.
> */
> 
> I understand that this is designed this way. I think it is a bad idea
> because:
> 
> 1. The majority of crash-restarts in the wild are going to be diagnosed
> rather easily within the OS itself. They fall into things like OOM killer
> and out of disk space.

I don't buy 1) at all. I've seen numerous production instances with
crashes that had nothing to do with OS-triggered things.


> 2. It can cause significant issues, we ran into this yesterday:
> 
> -bash-4.1$ ls pgsql_tmp31227*|du -sh
> 250G    
> 
> There is no active process/backend with PID 31227. The database itself is
> only 55G, but we are taking up 5x that with dead files.
> 
> 3. The problem can get worse over time. If you have a very long running
> instance, any time the backend crash-restarts you have the potential to
> increase disk space used for no purpose.

But I think 2) and 3) outweigh the debugging benefit.
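
To make concrete what cleaning up on a crash-restart would amount to, here is a rough standalone sketch. It is not the fd.c code: the function name remove_stale_temp_files and the TEMP_FILE_PREFIX macro are made up for illustration, and the prefix is simply assumed from the file names in the listing quoted above. It only handles a single directory and does not recurse.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Assumed prefix, taken from the file names shown in the quoted listing. */
#define TEMP_FILE_PREFIX "pgsql_tmp"

/*
 * Hypothetical helper: unlink every entry in tmpdir whose name starts with
 * TEMP_FILE_PREFIX.  Entries with other names are left alone.
 *
 * Returns the number of files removed, or -1 if tmpdir cannot be opened.
 */
int
remove_stale_temp_files(const char *tmpdir)
{
    DIR        *dir = opendir(tmpdir);
    struct dirent *de;
    int         removed = 0;

    if (dir == NULL)
        return -1;

    while ((de = readdir(dir)) != NULL)
    {
        char        path[4096];
        int         len;

        /* Skip anything that does not look like a temp file. */
        if (strncmp(de->d_name, TEMP_FILE_PREFIX,
                    strlen(TEMP_FILE_PREFIX)) != 0)
            continue;

        len = snprintf(path, sizeof(path), "%s/%s", tmpdir, de->d_name);
        if (len < 0 || (size_t) len >= sizeof(path))
            continue;           /* path would be truncated, skip it */

        if (unlink(path) == 0)
            removed++;
        else
            perror(path);       /* report the failure and keep going */
    }

    closedir(dir);
    return removed;
}
```

Calling something like this on the temp directory during the crash-restart cycle would reclaim the space from dead backends, at the cost of losing the files someone might have wanted to look at for debugging.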


Andres


