The basic idea is to replace register_dirty_segment() with
register_opened_segment().
That is, we no longer care whether a segment is dirty or not: if someone has
opened it, we will fsync it at checkpoint time. Currently,
register_dirty_segment() is called in mdextend(), mdwrite() and
mdtruncate(). This is costly, since ForwardFsyncRequest() has to acquire
BgWriterCommLock exclusively on every call, and mdwrite() is called quite
frequently.
Benefits:
+ reduces BgWriterCommLock contention;
+ simplifies the code - we only need to call register_opened_segment() when
we open a segment;
+ reduces the size needed for BgWriterShmem->requests[];
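
To make the first and third points concrete, here is a rough toy model (not
PostgreSQL source; every name in it is made up for illustration). It assumes
each call to a ForwardFsyncRequest()-like function takes the lock once and
appends one entry without de-duplication, and compares registering on every
write with registering once at open:

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define TOY_MAX_REQUESTS 1024

typedef struct
{
    int  requests[TOY_MAX_REQUESTS]; /* models BgWriterShmem->requests[] */
    int  num_requests;
    long lock_acquisitions;          /* models exclusive BgWriterCommLock grabs */
} ToyQueue;

/* Models ForwardFsyncRequest(): one exclusive lock acquisition per call. */
static bool
toy_forward_request(ToyQueue *q, int segno)
{
    q->lock_acquisitions++;
    if (q->num_requests >= TOY_MAX_REQUESTS)
        return false;           /* queue is full */
    q->requests[q->num_requests++] = segno;
    return true;
}

/* Current scheme: every write registers the segment as dirty. */
static void
toy_write_current(ToyQueue *q, int segno)
{
    (void) toy_forward_request(q, segno);
}

/* Proposed scheme: register once, when the segment is opened. */
static void
toy_open_proposed(ToyQueue *q, int segno)
{
    (void) toy_forward_request(q, segno);
}

/* Proposed scheme: a write no longer touches the queue or the lock. */
static void
toy_write_proposed(ToyQueue *q, int segno)
{
    (void) q;
    (void) segno;
}
```

In this model, 100 writes to one segment cost 100 lock acquisitions and 100
queue entries under the current scheme, versus one of each under the
proposed scheme.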
Costs:
+ we have to fsync() a file even if we made no modifications to it. The extra
cost is just an open/close of the file, so I think this is acceptable;
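
For reference, the extra work per unmodified segment would look roughly like
the following. This is only a sketch using plain POSIX calls;
toy_fsync_clean_segment() is a hypothetical helper, not something in md.c:

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a segment file, fsync it, close it.
 * Returns 0 on success, -1 if any step fails.
 */
static int
toy_fsync_clean_segment(const char *path)
{
    int fd = open(path, O_RDWR);
    int rc;

    if (fd < 0)
        return -1;              /* could not open the segment */

    rc = fsync(fd);             /* -1 here means the flush failed */

    if (close(fd) != 0)
        rc = -1;

    return rc;
}
```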
Corner case:
+ what if we run out of shared memory for ForwardFsyncRequest()? In the
original scheme, we just fsync() the file ourselves. Now we can't do this.
Instead, we will issue a checkpoint request to the bgwriter and wait for it
(letting it absorb the pending requests), then try ForwardFsyncRequest()
again.
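
The retry logic I have in mind is roughly the following. Again this is a toy
model with made-up names, and it assumes that waiting for the checkpoint
leaves the request queue empty:

```c
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
#define SKETCH_MAX_REQUESTS 4

typedef struct
{
    int requests[SKETCH_MAX_REQUESTS];
    int num_requests;
    int checkpoints_requested;
} SketchQueue;

/* Models ForwardFsyncRequest(): fails when the shared queue is full. */
static bool
sketch_forward_request(SketchQueue *q, int segno)
{
    if (q->num_requests >= SKETCH_MAX_REQUESTS)
        return false;
    q->requests[q->num_requests++] = segno;
    return true;
}

/*
 * Models "issue a checkpoint request and wait for it". Assumption: the
 * bgwriter absorbs all queued requests, so the queue is empty afterwards.
 */
static void
sketch_request_checkpoint_and_wait(SketchQueue *q)
{
    q->checkpoints_requested++;
    q->num_requests = 0;
}

/* Proposed registration: on a full queue, force a checkpoint and retry. */
static void
sketch_register_opened_segment(SketchQueue *q, int segno)
{
    while (!sketch_forward_request(q, segno))
        sketch_request_checkpoint_and_wait(q);
}
```

In the model, the loop only forces a checkpoint when the queue is actually
full, so the common path is still a single ForwardFsyncRequest()-style call.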
Comments?
Regards,
Qingqing