Index Page Split logging - Mailing list pgsql-hackers

From Simon Riggs
Subject Index Page Split logging
Msg-id 1199192779.9558.256.camel@ebony.site
Responses Re: Index Page Split logging  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Implementing Sorting Refinements  (<mac_man2005@hotmail.it>)
List pgsql-hackers
When we split an index page we perform a multi-block operation that is
both fairly expensive and complex to reconstruct should we crash partway
through.

If we could log *only* the insert that caused the split, rather than the
split itself, we would avoid that situation entirely. This would then
mean that the recovery code would resolve the split by performing a full
logical split rather than replaying pieces of the original physical
split. Doing that would remove a ton of complexity, as well as reduce
WAL volume.
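To make the "log only the insert" idea concrete, here is a toy sketch in Python. It is not PostgreSQL code: `ToyIndex`, `PAGE_CAPACITY`, and the single-level chain of leaf pages are all made up for illustration. It assumes the split point is a deterministic function of the page contents, so recovery can redo the split logically from the insert record alone.

```python
from bisect import bisect_right, insort

PAGE_CAPACITY = 4  # hypothetical: maximum keys per leaf page


class ToyIndex:
    """A left-to-right chain of sorted leaf pages (one index level only)."""

    def __init__(self):
        self.pages = [[]]  # leaf pages, ordered by key range
        self.wal = []      # one record per insert, never one per split

    def _find_page(self, key):
        # Pick the rightmost page whose lowest key is <= key.
        lows = [page[0] for page in self.pages[1:]]
        return bisect_right(lows, key)

    def _apply_insert(self, key):
        i = self._find_page(key)
        page = self.pages[i]
        insort(page, key)
        if len(page) > PAGE_CAPACITY:
            # Logical split: derived from the page contents at this moment,
            # not replayed from a physical split record.
            mid = len(page) // 2
            self.pages[i:i + 1] = [page[:mid], page[mid:]]

    def insert(self, key):
        self.wal.append(("insert", key))  # log only the insert
        self._apply_insert(key)

    @classmethod
    def recover(cls, wal):
        # Redo from an empty index by re-running each logged insert;
        # any splits happen again as a side effect.
        idx = cls()
        for kind, key in wal:
            assert kind == "insert"
            idx._apply_insert(key)
        idx.wal = list(wal)
        return idx
```

Replaying `idx.wal` through `ToyIndex.recover` rebuilds the same page layout as the original run, even though no split was ever logged. The toy sidesteps the hard part, which is that real recovery starts from whatever mix of old and new pages happened to reach disk.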

We would need to ensure that the right-hand page of the split reached
disk before the left-hand page. If a crash occurs when only the right-hand
page has reached disk then there would be no link (on disk) to it
and so it would be ignored. We would need an orphaned page detection
mechanism to allow the page to be VACUUMed sometime in the future. There
would also be some sort of ordering required in the buffer manager, so
that pages which must be written last are kept pinned until the first
page is written. That sounds like it is fairly straightforward and it
would allow a generic mechanism that worked for all index splits, rather
than requiring individual code for each rmgr.
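The buffer manager ordering could look roughly like the following toy sketch. Everything here is hypothetical (`ToyBufferPool`, `add_ordering`, the string page ids); it only illustrates the rule that a page which must be written last is held back until the page it depends on has been written.

```python
class ToyBufferPool:
    """Dirty-page tracking with 'write X before Y' constraints (illustrative only)."""

    def __init__(self):
        self.dirty = {}        # page_id -> contents not yet on disk
        self.write_after = {}  # page_id -> page_ids that must reach disk first
        self.disk = {}         # stands in for the data file
        self.write_log = []    # order in which pages were actually written

    def mark_dirty(self, page_id, contents):
        self.dirty[page_id] = contents

    def add_ordering(self, first, then):
        # 'then' may not be written while 'first' is still dirty.
        self.write_after.setdefault(then, set()).add(first)

    def can_write(self, page_id):
        deps = self.write_after.get(page_id, ())
        return all(dep not in self.dirty for dep in deps)

    def try_write(self, page_id):
        if not self.can_write(page_id):
            return False  # held back, like a pinned buffer
        self.disk[page_id] = self.dirty.pop(page_id)
        self.write_log.append(page_id)
        self.write_after.pop(page_id, None)
        return True

    def flush_all(self):
        while self.dirty:
            progressed = False
            for page_id in list(self.dirty):
                if self.try_write(page_id):
                    progressed = True
            if not progressed:
                raise RuntimeError("cyclic write-ordering dependency")
```

For a split, the caller would register `add_ordering(first="right", then="left")`, so an attempt to write the left-hand page is refused until the right-hand page is out. The sketch ignores re-dirtied pages and says nothing about whether a "written" page is actually durable, which is the problem the next paragraph raises.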

ISTM that would require Direct I/O to perform physical writes in a
specific order, rather than just issuing the writes and then an fsync,
which probably kills it for now, even assuming you followed me on every
point up till now...
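For comparison, the only portable way I know to force an order with ordinary buffered writes is an fsync between them, sketched below in Python. This is not Direct I/O and not how PostgreSQL writes pages; `write_pages_in_order` is a made-up helper, and `os.pwrite` assumes a Unix-like platform. It shows why the cost is a concern: every ordered write needs its own fsync.

```python
import os

PAGE_SIZE = 8192  # PostgreSQL's default block size


def write_pages_in_order(path, pages):
    """Write (block_number, data) pairs so each is flushed before the next is issued."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for block_no, data in pages:
            if len(data) != PAGE_SIZE:
                raise ValueError("each page must be exactly PAGE_SIZE bytes")
            os.pwrite(fd, data, block_no * PAGE_SIZE)
            # Barrier: without this the kernel is free to write the
            # pages back in any order.
            os.fsync(fd)
    finally:
        os.close(fd)
```

A split would call this with the right-hand page listed first and the left-hand page second. Issuing both writes and a single fsync at the end gives durability but no ordering guarantee between the two pages.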

So I'm mentioning this really to get the idea out there and see if
anybody has any bright ideas, rather than as a well-formed proposal for
immediate implementation.

--  Simon Riggs 2ndQuadrant  http://www.2ndQuadrant.com


