On Sun, 2006-04-02 at 20:39 -0700, Martin Scholes wrote:
> The lesson here is that whatever WAL magic has been performed on the
> latest release gives over 100% speedup
That is good news.
> and the speedup is so good that skipping WAL for indexes does
> basically nothing.
I don't agree with this conclusion. Your original idea has possibilities
and these are not proved pointless by that test result. ISTM that any
reduction in WAL will give a performance increase on a correctly
configured system, and if it doesn't, something else is wrong as well.
The idea of WALBypass for indexes is a valid one and seems likely to
benefit small tables that are very frequently updated or inserted into,
and databases in a replication group where a longer recovery time
doesn't affect overall availability.
If we have this as a per-index option, that would allow some indexes to
be more important than others if multiple workloads were supported on
the same set of tables. Plus it's easier to add a CREATE INDEX option,
though it would never apply to catalog indexes. We would automatically
rebuild all marked indexes at the end of recovery - hence we'd need
catalog indexes to be functioning, even though other parts of the system
would not yet be available. This could perhaps be optimized with a hash
table of unique entries that is inserted into during recovery to
remember all indexes that have been modified.
I'd be interested in implementing this unless someone beats me to it.
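
A minimal sketch of the recovery-time bookkeeping described above, written
in Python rather than PostgreSQL's C purely for illustration. Every name
here (RecoveryState, bypass_indexes, replay_heap_insert,
indexes_to_rebuild) is hypothetical and does not exist in PostgreSQL, and
relation OIDs are modelled as plain integers. It also assumes that replay
can map a heap record to the indexes defined on that heap.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class RecoveryState:
    """Hypothetical recovery bookkeeping; not a PostgreSQL structure."""

    # OIDs of indexes created with the (hypothetical) WAL-bypass option.
    bypass_indexes: Set[int]
    # Heap relation OID -> OIDs of the indexes defined on that heap.
    indexes_on_heap: Dict[int, List[int]]
    # Bypassed indexes whose heap was modified during replay. Using a set
    # keeps each index at most once, however many records touch it.
    dirty: Set[int] = field(default_factory=set)

    def replay_heap_insert(self, heap_oid: int) -> None:
        """Remember every bypassed index on a heap that replay modified."""
        for index_oid in self.indexes_on_heap.get(heap_oid, []):
            if index_oid in self.bypass_indexes:
                self.dirty.add(index_oid)

    def indexes_to_rebuild(self) -> List[int]:
        """Indexes to rebuild at the end of recovery, in a stable order."""
        return sorted(self.dirty)


if __name__ == "__main__":
    state = RecoveryState(
        bypass_indexes={20, 21},
        indexes_on_heap={1: [10, 20], 2: [21], 3: [30]},
    )
    for heap_oid in (1, 1, 3):  # heap 2 is never touched during replay
        state.replay_heap_insert(heap_oid)
    print(state.indexes_to_rebuild())
```

In this sketch only index 20 would be rebuilt: index 21 is marked but its
heap was never modified, and indexes 10 and 30 are fully logged so they are
never tracked.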
Thinking about this some more, I ask myself: why is it we log index
inserts at all? We log heap inserts, which contain all the information
we need to replay all index inserts also, so why bother? We would
clearly need to still log full page writes of any changed index pages,
but we wouldn't need to log each individual change. My only answer is
that WAL records are required because index insertion is not completely
deterministic, owing to the use of random(). Perhaps we could make the
process completely deterministic but pseudo-random by using some aspect
of the data/structure to determine the "random" insertion point? (I
would still want to have full logging for catalog indexes).
[I'll leave UPDATEs out of the discussion for now]
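
One way to picture the "deterministic but pseudo-random" idea is to derive
the decision from a hash of the data being inserted plus some aspect of
the structure, so that replay from the heap record reaches the same
decision as the original insert. The sketch below is Python, not
PostgreSQL code: the function name deterministic_choice, the use of
SHA-256, the block number as the structural input, and the one_in
parameter are all assumptions made for illustration.

```python
import hashlib


def deterministic_choice(key: bytes, block_number: int, one_in: int = 100) -> bool:
    """Return True for roughly one in `one_in` inputs, repeatably.

    Hypothetical stand-in for a random() call: the result depends only on
    the key bytes and the (non-negative) block number, so the same inputs
    always give the same answer.
    """
    if one_in < 1:
        raise ValueError("one_in must be at least 1")
    # Mix the data and the structural input into a single digest.
    digest = hashlib.sha256(key + block_number.to_bytes(8, "big")).digest()
    # Treat the first 8 bytes of the digest as an unsigned integer.
    value = int.from_bytes(digest[:8], "big")
    return value % one_in == 0


if __name__ == "__main__":
    first = deterministic_choice(b"some key", block_number=42)
    again = deterministic_choice(b"some key", block_number=42)
    print(first == again)
```

The trade-off in such a scheme is that the choice is only as well spread
as the hash inputs: many identical keys landing on the same block would
all get the same answer, which is why some structural input is mixed in.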
Best Regards, Simon Riggs