> > > > I think you have identified a problem that needs
> > > > a more general solution: we need to be robust in the case that
> > > > an index entry is on disk that points to a tuple that never made
> > > > it to disk.
> >
> > And this general solution is WAL.
> >
> Yes, exactly.
> But I'd thought it was mainly for aborts in the middle of btree page
> splitting, or for a system crash in which we couldn't expect synchronous
> flushing of dirty buffers.
The central idea of WAL: write (and flush) to the log all changes made in
data buffers _before_ the data files are changed. The buffer manager will
be responsible for this. Changes made in table buffers will be logged
before changes made in index buffers, so redo will re-insert any table
rows that never reached disk, and index rows will not point to nonexistent
tuples in the table. Undo will erase all uncommitted changes (but will not
shrink tables/indices).
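
The ordering above can be illustrated with a toy model. This is only a
sketch under stated assumptions, not the planned implementation: the
ToyWAL class, its methods, and the one-page-per-file layout are all
hypothetical, and undo here simply deletes rows, whereas a real undo
would not shrink the files.

```python
# Toy write-ahead log. Every name here is hypothetical and for illustration.
class ToyWAL:
    def __init__(self):
        self.log_buffer = []                        # log records still in memory
        self.log_disk = []                          # log records flushed to disk
        self.disk = {"table": {}, "index": {}}      # data files on disk
        self.buffers = {"table": {}, "index": {}}   # in-memory data buffers
        self.page_lsn = {"table": 0, "index": 0}    # last LSN that touched each buffer
        self.next_lsn = 1

    def log(self, xid, kind, target=None, key=None, value=None):
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_buffer.append((lsn, xid, kind, target, key, value))
        return lsn

    def flush_log(self, upto_lsn):
        # Move every buffered record with LSN <= upto_lsn to the durable log.
        keep = []
        for rec in self.log_buffer:
            (self.log_disk if rec[0] <= upto_lsn else keep).append(rec)
        self.log_buffer = keep

    def insert(self, xid, tid, row, index_key):
        # The table change is logged before the index change.
        self.page_lsn["table"] = self.log(xid, "insert", "table", tid, row)
        self.buffers["table"][tid] = row
        self.page_lsn["index"] = self.log(xid, "insert", "index", index_key, tid)
        self.buffers["index"][index_key] = tid

    def commit(self, xid):
        self.flush_log(self.log(xid, "commit"))

    def write_buffer(self, target):
        # Buffer manager rule: flush the log up to this buffer's LSN
        # _before_ the data file is changed.
        self.flush_log(self.page_lsn[target])
        self.disk[target] = dict(self.buffers[target])

    def crash_and_recover(self):
        self.log_buffer = []                        # unflushed log is lost
        committed = {rec[1] for rec in self.log_disk if rec[2] == "commit"}
        # Redo: replay every logged insert, in log order.
        for lsn, xid, kind, target, key, value in self.log_disk:
            if kind == "insert":
                self.disk[target][key] = value
        # Undo: erase the changes of transactions that never committed.
        for lsn, xid, kind, target, key, value in reversed(self.log_disk):
            if kind == "insert" and xid not in committed:
                self.disk[target].pop(key, None)
        self.buffers = {name: dict(rows) for name, rows in self.disk.items()}


def run_demo():
    db = ToyWAL()
    db.insert(1, 1, "alice", "alice")
    db.commit(1)
    db.insert(2, 2, "bob", "bob")
    db.write_buffer("index")    # index entry reaches disk, table row does not
    before = {name: dict(rows) for name, rows in db.disk.items()}
    db.crash_and_recover()
    after = {name: dict(rows) for name, rows in db.disk.items()}
    return before, after
```

In run_demo(), the index file on disk points at table rows that were
never written just before the crash. Because the table records precede
the index records in the log, and the log is flushed before the index
buffer is written, redo can restore the missing rows, and undo then
removes everything belonging to the uncommitted transaction 2.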
Vadim