Performance lossage in checkpoint dumping - Mailing list pgsql-hackers

From	Tom Lane
Subject	Performance lossage in checkpoint dumping
Date	February 16, 2001 21:32:39
Msg-id	23621.982377108@sss.pgh.pa.us
Responses	Re: Performance lossage in checkpoint dumping
List	pgsql-hackers

While poking at Peter Schmidt's comments about pgbench showing worse
performance than for 7.0 (using -F in both cases), I noticed that given
enough buffer space, FileWrite never seemed to get called at all.  A
little bit of sleuthing revealed the following:

1. Under WAL, we don't write dirty buffers out of the shared memory at
every transaction commit.  Instead, as long as a dirty buffer's slot
isn't needed for something else, it just sits there until the next
checkpoint or shutdown.  CreateCheckpoint calls FlushBufferPool which
writes out all the dirty buffers in one go.  This is a Good Thing; it
lets us consolidate multiple updates of a single datafile page by
successive transactions into one disk write.  We need this to buy back
some of the extra I/O required to write the WAL logfile.

2. However, this means that a lot of the dirty-buffer writes get done by
the periodic checkpoint process, not by the backends that originally
dirtied the buffers.  And that means that every last one gets done by
blind write, because the checkpoint process isn't going to have opened
any relation cache entries --- maybe a couple of system catalog
relations, but for sure it won't have any for user relations.  If you
look at BufferSync, any page for which the current process has no
already-open relcache entry is sent to smgrblindwrt, not smgrwrite.

3. Blind write is gratuitously inefficient: it does separate open,
seek, write, close kernel calls for every request.  This was the right
thing in 7.0.*, because backends relatively seldom did blind writes and
even less often needed to blindwrite multiple pages of a single relation
in succession.  But the typical usage has changed a lot.


I am thinking it'd be a good idea if blind write went through fd.c and
thus was able to re-use open file descriptors, just like normal writes.
This should improve the efficiency of dumping dirty buffers during
checkpoint by a noticeable amount.

Comments?
        regards, tom lane