Re: Experimental patch for inter-page delay in VACUUM - Mailing list pgsql-hackers

From Greg Stark
Subject Re: Experimental patch for inter-page delay in VACUUM
Date
Msg-id 87znfc3s45.fsf@stark.dyndns.tv
Whole thread Raw
In response to Re: Experimental patch for inter-page delay in VACUUM  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Experimental patch for inter-page delay in VACUUM
List pgsql-hackers
Tom Lane <tgl@sss.pgh.pa.us> writes:

> I would like to see us go over to fsync, or some other technique that
> gives more certainty about when the write has occurred.  There might be
> some scope that way to allow stretching out the I/O, too.
> 
> The main problem with this is knowing which files need to be fsync'd.

Why could the postmaster not just fsync *every* file? Does any OS make it a
slow operation to fsync a file that has no pending writes? Would we even care,
it would mean the checkpoint would take longer but not actually issue any
extra i/o.

I'm assuming fsync syncs writes issued by other processes on the same file,
which isn't necessarily true though. Otherwise every process would have to
fsync every file descriptor it has open.

> The only idea I have come up with is to move all buffer write operations
> into a background writer process, which could easily keep track of
> every file it's written into since the last checkpoint.  

I fear this approach. It seems to limit a lot of design flexibility later. But
I can't come up with any concrete way it limits things so perhaps that
instinct is just fud.

It also can become a point of contention. At least on Oracle you often need
multiple such processes to keep up with the i/o bandwidth.

> Actually, once you build it this way, you could make all writes synchronous
> (open the files O_SYNC) so that there is never any need for explicit fsync
> at checkpoint time.

Or using aio write ahead as much as you want and then just make checkpoint
block until all the writes are completed. You don't actually need to rush them
at all, just know when they're done. That would completely eliminate the i/o
storm without changing the actual pattern of writes at all.

-- 
greg



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: RC1 on AIX - Some Anti-results
Next
From: Jan Wieck
Date:
Subject: Re: Experimental patch for inter-page delay in VACUUM