Re: [HACKERS] TODO item - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: [HACKERS] TODO item
Date
Msg-id 200002080002.TAA28209@candle.pha.pa.us
In response to Re: [HACKERS] TODO item  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses RE: [HACKERS] TODO item
List pgsql-hackers
> > So, I think we are safe if we can either keep that file descriptor open
> > until commit, or re-open it and fsync it on commit.  That assume a
> > re-open is hitting the same file.  My opinion is that we should just
> > fsync it on close and not worry about a reopen.
> 
> There's still the problem that your backend might never have opened the
> relation file at all, still less done a write through its fd or vfd.
> I think we would need to have a separate data structure saying "these
> relations were dirtied in the current xact" that is not tied to fd's or
> vfd's.  Maybe the relcache would be a good place to keep such a flag.
> 
> Transaction commit would look like:
> 
> * scan buffer cache for dirty buffers, fwrite each one that belongs
> to one of the relations I'm trying to commit;
> 
> * open and fsync each segment of each rel that I'm trying to commit
> (or maybe just the dirtied segments, if we want to do the bookkeeping
> at that level of detail);

If we fsync on close, we need not worry about file descriptors that were
forced out of the file descriptor cache during the transaction.

If we dirty a buffer, we have to mark the buffer as dirty, and mark the
file descriptor associated with that buffer as needing fsync.  If someone
else writes that buffer and removes it from the cache before we get to
commit, the file descriptor flag will still tell us that the file
descriptor needs fsync.

We have to:

* write our dirty buffers
* fsync all file descriptors marked as "written" during our transaction
* fsync all file descriptors on close when being cycled out of fd cache
  (fd close has to write dirty buffers before fsync)
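
The three steps above can be sketched roughly as follows. This is only a toy model in Python, not backend code: the class FdCache, its method names, and the per-descriptor "pending" list standing in for dirty buffers are all hypothetical. It assumes small writes to regular files, so it ignores short writes from os.write.

```python
import os


class FdCache:
    """Toy stand-in for a backend's file descriptor cache (hypothetical)."""

    def __init__(self):
        self.pending = {}      # fd -> dirty buffers not yet written
        self.needs_fsync = {}  # fd -> True if written since the last fsync

    def open(self, path):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
        self.pending[fd] = []
        self.needs_fsync[fd] = False
        return fd

    def dirty(self, fd, data):
        # The buffer is dirty but nothing has reached the file yet.
        self.pending[fd].append(data)

    def _flush(self, fd):
        # Write out dirty buffers and flag the descriptor as "written".
        for buf in self.pending[fd]:
            os.write(fd, buf)
            self.needs_fsync[fd] = True
        self.pending[fd].clear()

    def _sync(self, fd):
        if self.needs_fsync[fd]:
            os.fsync(fd)
            self.needs_fsync[fd] = False

    def evict(self, fd):
        # Cycled out of the cache: write dirty buffers, fsync, then close.
        self._flush(fd)
        self._sync(fd)
        del self.pending[fd]
        del self.needs_fsync[fd]
        os.close(fd)

    def commit(self):
        # Step 1: write our dirty buffers.
        for fd in list(self.pending):
            self._flush(fd)
        # Step 2: fsync every descriptor flagged during the transaction.
        for fd in list(self.needs_fsync):
            self._sync(fd)
```

A descriptor that was evicted mid-transaction is no longer in the cache at commit, which is fine here because evict() already forced its data to disk.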
 

So we have three states for a write:

* still in dirty buffer
* file descriptor marked as dirty/need fsync
* file descriptor removed from cache, fsync'ed on close

Seems this covers all the cases.
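
One way to check that claim is to list what commit still has to do for a write in each state. The sketch below is only an illustration: the enum WriteState and the function work_at_commit are made-up names, and the action strings are labels, not real calls.

```python
from enum import Enum


class WriteState(Enum):
    IN_DIRTY_BUFFER = 1    # data is still only in a dirty buffer
    FD_NEEDS_FSYNC = 2     # data was written, descriptor flagged for fsync
    FSYNCED_ON_CLOSE = 3   # descriptor left the cache and was fsync'ed then


def work_at_commit(state):
    """Return the remaining actions commit must take for a write."""
    if state is WriteState.IN_DIRTY_BUFFER:
        return ["write", "fsync"]
    if state is WriteState.FD_NEEDS_FSYNC:
        return ["fsync"]
    # Already on disk: fsync-on-close did the work earlier.
    return []
```

Every state ends with the data fsync'ed, either at commit or earlier at close, so no write is left behind as long as these really are the only three states.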

> 
> * make pg_log entry;
> 
> * write and fsync pg_log.

Yes.

> 
> fsync-on-close is probably a waste of cycles.  The only way that would
> matter is if someone else were doing a RENAME TABLE on the rel, thus
> preventing you from reopening it.  I think we could just put the
> responsibility on the renamer to fsync the file while he's doing it
> (in fact I think that's already in there, at least to the extent of
> flushing the buffer cache).

I hadn't thought of that case.  I was thinking of file descriptors being
removed from the cache.  Or are they not removed while they are in use?
If they are not, you can skip my close examples.

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

