Re: PostgreSQL under BSD/OS - Mailing list pgsql-hackers

From Greg Black
Subject Re: PostgreSQL under BSD/OS
Date
Msg-id 19980826075802.13774.qmail@alice.gba.oz.au
In response to Re: PostgreSQL under BSD/OS  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: PostgreSQL under BSD/OS  (Bruce Momjian <maillist@candle.pha.pa.us>)
Re: PostgreSQL under BSD/OS  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
> > (relatively minor) bug in psql -- it fails to close files it reads for a
> > COPY command, meaning it can keep a multi-megabyte file open for days.
> > The workaround is to do a new connect to the same database after the
> > COPY, at which point the data file gets closed.  Maybe you can get that
> > fixed in a future release.
>
> This is the first I have heard of this.  The file commands/copy.c does
> use a file descriptor cache, but that is really just used for allowing
> more file opens than the OS permits.  Actual opens and closes are
> happening.
>
> I assume the files you are talking about are the database table files.
> Yes, they stay open because the backend may want to use them someday.

No, that's not what I meant.  Perhaps my attempt to be concise made my
explanation unclear.  Here's a more complete explanation of the problem.

I am planning to use PostgreSQL to manage some databases that have been
handled by completely different software up until now.  Therefore, there
is a lot of data to be extracted from the old databases and loaded into
PostgreSQL databases.

I do this with C programs which output two principal files: the first is
an input file for psql which contains various commands to create tables
and indexes, etc.; the second, much larger, file contains the actual
data in a suitable format.
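
To illustrate what "a suitable format" could look like, here is a minimal
sketch of a data-file writer.  It is in Python rather than C for brevity,
and it assumes the default COPY text format (one row per line, tab between
columns, \N for NULL, backslash escapes for special characters).  The
function names and the sample row are hypothetical, not taken from the
actual conversion programs.

```python
def copy_escape(value):
    # Render one field for the (assumed) default COPY text format.
    if value is None:
        return "\\N"                      # NULL marker
    text = str(value)
    # Escape the backslash first so the escapes added below are not doubled.
    text = text.replace("\\", "\\\\")
    text = text.replace("\t", "\\t")      # tab would be read as a delimiter
    text = text.replace("\n", "\\n")      # newline would end the row
    text = text.replace("\r", "\\r")
    return text


def copy_line(row):
    # Join the escaped fields of one row with tabs and end the line.
    return "\t".join(copy_escape(v) for v in row) + "\n"


def write_copy_file(path, rows):
    # Write all rows to the data file that COPY will later read.
    with open(path, "w", newline="") as out:
        for row in rows:
            out.write(copy_line(row))
```

A call such as write_copy_file("/tmp/big.dat", [(1, "a\tb", None)]) would
produce one line with three tab-separated fields; the path is only an
example.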

The first file is input to psql by the \i command.  The last thing in
that first file is a SQL COPY command which copies the data from the big
data file into the appropriate table.  After an hour or so, the data
input completes and I can issue psql commands to play with the data to
see if it's the way I expect.  At this point, psql ought to close the
data file that it copied the data from so that I can delete it.  As
things stand, I have three copies of all the data -- the original
database (which I can't remove until this process is completed in a few
weeks); the temporary data file, used as input to psql (which I want to
delete since it can be recreated if needed); and the PostgreSQL database
which I have just created.

But if I remove the temporary file, I don't get any disk space back
because psql still has it open.  If I \connect to the same database,
then psql closes the input file, but my contention is that I should not
have to do that.  I hope this explanation is clear.
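
The disk-space behaviour described above is ordinary Unix file semantics:
removing a file only deletes its directory entry, and the blocks are
reclaimed once the last open descriptor is closed.  The following is a
small sketch of that effect, assuming a POSIX system; the function name
and payload size are made up for the demonstration, and it says nothing
about which process in the PostgreSQL setup actually holds the file open.

```python
import os
import tempfile


def unlink_while_open(payload=b"x" * 4096):
    # Create a scratch file, delete it while it is still open, and report
    # what the open descriptor can still see.
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # removes the name only
        links = os.fstat(fd).st_nlink      # link count as seen via the descriptor
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))   # contents are still readable
    finally:
        os.close(fd)                       # only now can the space be freed
    return links, len(data), os.path.exists(path)


if __name__ == "__main__":
    links, nbytes, still_listed = unlink_while_open()
    print("links:", links, "readable bytes:", nbytes, "listed:", still_listed)
```

Until the close happens, df will keep counting the deleted file's blocks
as used, which matches the symptom of getting no space back after the rm.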

--
Greg Black <gjb@acm.org>


