Plugging fd leaks (was Re: Switching timeline over streaming replication) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Plugging fd leaks (was Re: Switching timeline over streaming replication)
Date
Msg-id 50AF7B13.9050404@vmware.com
Whole thread Raw
In response to Re: Switching timeline over streaming replication  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Plugging fd leaks (was Re: Switching timeline over streaming replication)  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
On 15.11.2012 17:16, Heikki Linnakangas wrote:
> On 15.11.2012 16:55, Tom Lane wrote:
>> Heikki Linnakangas<hlinnakangas@vmware.com> writes:
>>> This is a fairly general issue, actually. Looking around, I can see at
>>> least two similar cases in existing code, with BasicOpenFile, where we
>>> will leak file descriptors on error:
>>
>> Um, don't we automatically clean those up during transaction abort?
>
> Not the ones allocated with PathNameOpenFile or BasicOpenFile. Files
> allocated with AllocateFile() and OpenTemporaryFile() are cleaned up at
> abort.
>
>> If we don't, we ought to think about that, not about cluttering calling
>> code with certain-to-be-inadequate cleanup in error cases.
>
> Agreed. Cleaning up at end-of-xact won't help walsender or other
> non-backend processes, though, because they don't do transactions. But a
> top-level ResourceOwner that's reset in the sigsetjmp() cleanup routine
> would work.

This is what I came up with. It adds a new function, OpenFile, that
returns a raw file descriptor like BasicOpenFile, but the file
descriptor is associated with the current subtransaction and
automatically closed at end-of-xact, in the same way that AllocateFile
and AllocateDir work. In other words, OpenFile is to open() what
AllocateFile is to fopen(). BasicOpenFile is unchanged, it returns a raw
fd and it's solely the caller's responsibility to close it, but many of
the places that used to call BasicOpenFile now use the safer OpenFile
function instead.

This patch plugs three existing fd (or virtual fd) leaks:

1. copy_file() - fixed by by using OpenFile instead of BasicOpenFile
2. XLogFileLinit() - fixed by adding close() calls to the error cases.
Can't use OpenFile here because the fd is supposed to persist over
transaction boundaries.
3. lo_import/lo_export - fixed by using OpenFile instead of
PathNameOpenFile.

In addition, this replaces many BasicOpenFile() calls with OpenFile()
that were not leaking, because the code meticulously closed the file on
error. That wasn't strictly necessary, but IMHO it's good for robustness.

One thing I'm not too fond of is the naming. I'm calling the new
functions OpenFile and CloseFile. There's some danger of confusion
there, as the function to close a virtual file opened with
PathNameOpenFile is called FileClose. OpenFile is really the same kind
of operation as AllocateFile and AllocateDir, but returns an unbuffered
fd. So it would be nice if it was called AllocateSomething, too. But
AllocateFile is already taken. And I don't much like the Allocate*
naming for these anyway, you really would expect the name to contain "open".

Do we want to backpatch this? We've had zero complaints, but this seems
fairly safe to backpatch, and at least the leak in copy_file() can be
quite annoying. If you run out of disk space in CREATE DATABASE, the
target file is kept open even though it's deleted, so the space isn't
reclaimed until you disconnect.

- Heikki

Attachment

pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: [WIP PATCH] for Performance Improvement in Buffer Management
Next
From: Michael Meskes
Date:
Subject: Re: fix ecpg core dump when there's a very long struct variable name in .pgc file