Re: Index Scans become Seq Scans after VACUUM ANALYSE - Mailing list pgsql-hackers

From J. R. Nield
Subject Re: Index Scans become Seq Scans after VACUUM ANALYSE
Date
Msg-id 1024835880.1793.264.camel@localhost.localdomain
In response to Re: Index Scans become Seq Scans after VACUUM ANALYSE  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses Re: Index Scans become Seq Scans after VACUUM ANALYSE
List pgsql-hackers
On Sat, 2002-06-22 at 19:17, Bruce Momjian wrote:
> J. R. Nield wrote:
> > One other point:
> > 
> > Page pre-image logging is fundamentally the same as what Jim Gray's
> > book[1] would call "careful writes". I don't believe they should be in
> > the XLOG, because we never need to keep the pre-images after we're sure
> > the buffer has made it to the disk. Instead, we should have the buffer
> > IO routines implement ping-pong writes of some kind if we want
> > protection from partial writes.
> 
> Ping-pong writes to where?  We have to fsync, and rather than fsync that
> area and WAL, we just do WAL.  Not sure about a win there.
> 

The key question is: do we have some method to ensure that the OS
doesn't do the writes in parallel?

If the OS will ensure that one of the two block writes of a ping-pong
completes before the other starts, then we don't need to fsync() at 
all. 

The only thing we are protecting against is the possibility of both
writes being partial. If neither is done, that's fine because WAL will
protect us. If the first write is partial, we will detect that and use
the old data from the other, then recover from WAL. If the first is
complete but the second is partial, then we detect that and use the
newer block from the first write. If the second is complete but the
first is partial, we detect that and use the newer block from the second
write.
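
To make the case analysis above concrete, here is a minimal sketch of the recovery-time choice between the two copies. It assumes each copy carries a checksum (to detect a partial write) and a sequence number (to tell which copy is newer). Neither mechanism is specified in this message, and the names PingPongCopy, copy_seal and choose_copy are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_DATA_SIZE 8192  /* hypothetical payload size */

/* Hypothetical layout for one copy of a ping-pong page. */
typedef struct {
    uint64_t seq;       /* assumed: grows with every logical write of the page */
    uint32_t checksum;  /* assumed: covers seq and data; mismatch => partial write */
    uint8_t  data[PAGE_DATA_SIZE];
} PingPongCopy;

/* FNV-1a hash, standing in for whatever page checksum would really be used. */
static uint32_t copy_checksum(const PingPongCopy *c)
{
    uint32_t h = 2166136261u;
    const uint8_t *p = (const uint8_t *) &c->seq;

    for (size_t i = 0; i < sizeof c->seq; i++) { h ^= p[i]; h *= 16777619u; }
    for (size_t i = 0; i < PAGE_DATA_SIZE; i++) { h ^= c->data[i]; h *= 16777619u; }
    return h;
}

/* Fill in a copy and stamp it with its checksum before it is written out. */
static void copy_seal(PingPongCopy *c, uint64_t seq, const uint8_t *data)
{
    assert(c != NULL && data != NULL);
    c->seq = seq;
    memcpy(c->data, data, PAGE_DATA_SIZE);
    c->checksum = copy_checksum(c);
}

static int copy_is_intact(const PingPongCopy *c)
{
    return c->checksum == copy_checksum(c);
}

/*
 * Pick the copy to hand to WAL replay:
 *   both intact -> the newer one
 *   one intact  -> that one
 *   none intact -> NULL (the case ping-pong writes are meant to rule out)
 */
static const PingPongCopy *choose_copy(const PingPongCopy *a,
                                       const PingPongCopy *b)
{
    int a_ok = copy_is_intact(a);
    int b_ok = copy_is_intact(b);

    if (a_ok && b_ok)
        return (a->seq >= b->seq) ? a : b;
    if (a_ok)
        return a;
    if (b_ok)
        return b;
    return NULL;
}
```

Whichever copy is chosen, WAL replay would still be applied on top of it. The sketch only covers picking a non-torn starting point.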

So does anyone know a way to prevent parallel writes under one of the
common Unix standards? Do they say anything about this?

It would seem to me that if the same process does both ping-pong writes,
then there should be a cheap way to enforce a serial order. I could be
wrong though.

As to where the first block of the ping-pong should go, maybe we could
reserve a file with space for nBlocks of them, and record in the XLOG
which block was being written, for use in recovery. There are many other
ways to do it.
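
As a rough illustration of the reserved-file idea, the sketch below writes a block first to a slot in a scratch file and then to its home location. It uses the conservative fsync() between the two writes, which is exactly the cost a guaranteed serial write order would let us drop. The names open_scratch, write_fully and pingpong_write are hypothetical, BLCKSZ is an assumed block size, and the XLOG record of which slot holds which block is assumed to be written by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192  /* assumed block size */

/*
 * Open (creating if needed) the reserved scratch file and size it for
 * nblocks slots. ftruncate() only sets the length; a real implementation
 * would probably want to preallocate the space as well.
 */
static int open_scratch(const char *path, uint32_t nblocks)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return -1;
    if (ftruncate(fd, (off_t) nblocks * BLCKSZ) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Write exactly len bytes at offset off, retrying on short writes. */
static int write_fully(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, off);

        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0)
            return -1;
        p += n;
        off += n;
        len -= (size_t) n;
    }
    return 0;
}

/*
 * Ping-pong write of one block: scratch slot first, home location second.
 * Returns 0 on success, -1 on error.
 */
static int pingpong_write(int scratch_fd, uint32_t slot,
                          int data_fd, uint32_t blkno, const void *page)
{
    assert(page != NULL);

    if (write_fully(scratch_fd, page, BLCKSZ, (off_t) slot * BLCKSZ) != 0)
        return -1;

    /* Ordering barrier: the first copy is on disk before the second starts. */
    if (fsync(scratch_fd) != 0)
        return -1;

    return write_fully(data_fd, page, BLCKSZ, (off_t) blkno * BLCKSZ);
}
```

If the OS could promise that the second pwrite() never starts before the first has completed on disk, the fsync() call in the middle could simply be removed.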

;jrnield

-- 
J. R. Nield
jrnield@usol.com




