Re: Large number of open(2) calls with bulk INSERT into empty table - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Large number of open(2) calls with bulk INSERT into empty table
Date
Msg-id CA+Tgmoac+6qTNp2U+wedY8-PU6kK_b6hbdhR5xYGBG3GtdFcww@mail.gmail.com
Whole thread Raw
In response to Re: Large number of open(2) calls with bulk INSERT into empty table  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Large number of open(2) calls with bulk INSERT into empty table
List pgsql-hackers
On Wed, Nov 30, 2011 at 12:29 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sun, Nov 27, 2011 at 10:24 AM, Florian Weimer <fweimer@bfk.de> wrote:
>>> I noticed that a bulk INSERT into an empty table (which has been
>>> TRUNCATEd in the same transaction, for good measure) results in a
>>> curious number of open(2) calls for the FSM resource fork:
>
>> That's kind of unfortunate.  It looks like every time we extend the
>> relation, we try to read the free space map to see whether there's a
>> block available with free space in it.  But since we never actually
>> make any entries in the free space map, the fork never gets created,
>> so every attempt to read it involves a system call to see whether it's
>> there.
>
> I wonder whether it'd help if we went ahead and created the FSM file,
> with length zero, as soon as the relation is made (or maybe when it
> first becomes of nonzero length).  That would at least save the failed
> open()s.  We'd still be doing lseeks on the FSM file, but those ought
> to be cheaper.
>
> A less shaky way to do it would be to just create the first page of the
> FSM file immediately, but that would represent an annoying percentage
> increase in the disk space needed for small tables.

Well, unfortunately, we're not really doing a good job dodging that
problem as it is.  For example:

rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');pg_relation_size | pg_table_size
------------------+---------------               0 |             0
(1 row)

rhaas=# insert into foo values (1);
INSERT 0 1
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');pg_relation_size | pg_table_size
------------------+---------------            8192 |          8192
(1 row)

rhaas=# vacuum foo;
VACUUM
rhaas=# select pg_relation_size('foo'), pg_table_size('foo');pg_relation_size | pg_table_size
------------------+---------------            8192 |         40960
(1 row)

rhaas=#

Yikes!  A table with 4 bytes of useful data is consuming 40kB on disk
- 8kB in the main form, 8kB in the VM fork, and 24kB in the FSM fork.
Ouch!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: review: CHECK FUNCTION statement
Next
From: "Kevin Grittner"
Date:
Subject: Re: Reserved words and delimited identifiers