Re: PostgreSQL under BSD/OS - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: PostgreSQL under BSD/OS
Date
Msg-id 199808290401.AAA28495@candle.pha.pa.us
In response to Re: PostgreSQL under BSD/OS  (Greg Black <gjb@acm.org>)
Responses Re: PostgreSQL under BSD/OS
List pgsql-hackers
> > > (relatively minor) bug in psql -- it fails to close files it reads for a
> > > COPY command, meaning it can keep a multi-megabyte file open for days.
> > > The workaround is to do a new connect to the same database after the
> > > COPY, at which point the data file gets closed.  Maybe you can get that
> > > fixed in a future release.
> >
> > This is the first I have heard of this.  The file commands/copy.c does
> > use a file descriptor cache, but that is really just used for allowing
> > more file opens than the OS permits.  Actual opens and closes are
> > happening.
> >
> > I assume the files you are talking about are the database table files.
> > Yes, they stay open because the backend may want to use them someday.
>
> No, that's not what I meant.  Perhaps my attempt to be concise made my
> explanation unclear.  Here's a more complete explanation of the problem.
>
> I am planning to use PostgreSQL to manage some databases that have been
> handled by completely different software up until now.  Therefore, there
> is a lot of data to be extracted from the old databases and loaded into
> PostgreSQL databases.
>
> I do this with C programs which output two principal files: the first is
> an input file for psql which contains various commands to create tables
> and indexes, etc.; the second, much larger, file contains the actual
> data in a suitable format.
>
> The first file is input to psql by the \i command.  The last thing in
> that first file is a SQL COPY command which copies the data from the big
> data file into the appropriate table.  After an hour or so, the data
> input completes and I can issue psql commands to play with the data to
> see if it's the way I expect.  At this point, psql ought to close the
> data file that it copied the data from so that I can delete it.  As
> things stand, I have three copies of all the data -- the original
> database (which I can't remove until this process is completed in a few
> weeks); the temporary data file, used as input to psql (which I want to
> delete since it can be recreated if needed); and the PostgreSQL database
> which I have just created.
>
> But if I remove the temporary file, I don't get any disk space back
> because psql still has it open.  If I \connect to the same database,
> then psql closes the input file, but my contention is that I should not
> have to do that.  I hope this explanation is clear.
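
The symptom described above (deleting the file frees no disk space while
some process still has it open) is ordinary POSIX behaviour and can be
reproduced outside PostgreSQL.  The following is a minimal sketch, assuming
a POSIX system; read_after_unlink() and its arguments are hypothetical
names made up for this illustration and are not part of psql or the
backend.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create a small file, remove its name while the
 * descriptor is still open, then read the data back through that
 * descriptor.  Returns the number of bytes read, or -1 on any error.
 * The payload is assumed to be shorter than 256 bytes.
 */
static long read_after_unlink(const char *path, const char *payload)
{
    char buf[256];
    struct stat st;
    ssize_t n;
    size_t len = strlen(payload);
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, payload, len) != (ssize_t) len) {
        close(fd);
        unlink(path);
        return -1;
    }

    /* Remove the directory entry; the open descriptor keeps the data alive. */
    if (unlink(path) != 0) {
        close(fd);
        return -1;
    }

    /* The name is gone, so a lookup by path must now fail. */
    if (stat(path, &st) == 0) {
        close(fd);
        return -1;
    }

    /* The contents are still reachable through the descriptor. */
    if (lseek(fd, 0, SEEK_SET) < 0) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, sizeof buf);

    /* Only after this close can the filesystem release the blocks. */
    close(fd);
    return (long) n;
}
```

Calling read_after_unlink("demo.tmp", "hello") should return 5 even though
the name has already been removed, which is why the space only comes back
once whichever process holds the descriptor closes it.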

OK, in pgsql/src/backend/commands/copy.c, you should see a function call
to FreeFile().  That is what is supposed to be called to free the open
file.  AllocateFile opens the file a few lines above it, in either read
or write mode.

If you can, please put a little printf statement just before the
FreeFile() call and see if it is getting called.  You have to look in the
postmaster log file to see the output of the printf().  If it is getting
called, I have no idea why it would still be holding the file
descriptor.  If it is not calling that function, I am confused because I
can't see how it could get out of that function without calling it.
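
The kind of trace suggested above might look roughly like the sketch
below.  It is a standalone illustration, not the code in copy.c:
demo_allocate_file(), demo_free_file() and demo_copy_from() are
hypothetical stand-ins built on fopen()/fclose() so the example compiles
on its own, and the real AllocateFile()/FreeFile() in the backend may
behave differently.

```c
#include <stdio.h>

/* Hypothetical stand-in for the backend's AllocateFile(). */
static FILE *demo_allocate_file(const char *name, const char *mode)
{
    return fopen(name, mode);
}

/* Hypothetical stand-in for the backend's FreeFile(). */
static int demo_free_file(FILE *fp)
{
    return fclose(fp);
}

/*
 * Read the whole input file, print a trace line just before the file is
 * released, then release it.  Returns the number of bytes read, or -1 if
 * the file could not be opened.
 */
static long demo_copy_from(const char *filename)
{
    char buf[4096];
    size_t n;
    long total = 0;
    FILE *fp = demo_allocate_file(filename, "r");

    if (fp == NULL)
        return -1;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long) n;

    /* The debugging trace: if this line never shows up in the log,
     * the close was never reached. */
    fprintf(stderr, "about to free file %s\n", filename);
    fflush(stderr);

    demo_free_file(fp);
    return total;
}
```

In the backend the output of such a printf would land in the postmaster
log; in this sketch it goes to stderr.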

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
