From: Alvaro Herrera [mailto:alvherre@alvh.no-ip.org]
> Tsunakawa, Takayuki wrote:
>
> > (Although unrelated to this, I've also been wondering why PostgreSQL
> > flushes WAL to disk when writing a page in the shared buffer, because
> > PostgreSQL doesn't use WAL for undo.)
>
> The reason is that if the system crashes after writing the data page to
> disk, but before writing the WAL, the data page would be inconsistent with
> data in pages that weren't flushed, since there is no WAL to update those
> other pages. Also, if the system crashes after partially writing the page
> (say it writes the first 4kB) then the page is downright corrupted with
> no way to fix it.
>
> So there has to be a barrier that ensures that the WAL is flushed up to
> the last position that modified a page (i.e. that page's LSN) before actually
> writing that page to disk. And this is why we can't use mmap() for shared
> buffers -- there is no mechanism to force the WAL down if the operating
> system has the liberty to flush pages whenever it likes.

I see. The latter is the torn page problem, which is solved by full-page image WAL records. As I understand it, an
example of the former problem is an inconsistency between a table page and an index page: if an index page is flushed
to disk without first flushing the WAL, and the corresponding table page never reaches disk, an index entry would point
to a wrong table record after recovery.
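
To check my understanding, here is a toy sketch of the rule in Python. It is not PostgreSQL code: the names ToyStorage, log_change, flush_wal and write_page are made up for illustration, and "disk" is just a dict. The only point is that writing a page first forces the WAL out up to that page's LSN.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    data: str = ""
    lsn: int = 0  # LSN of the last WAL record that modified this page


@dataclass
class ToyStorage:
    # Hypothetical in-memory model, not PostgreSQL's actual structures.
    wal: list = field(default_factory=list)   # WAL records in insertion order
    flushed_lsn: int = 0                       # WAL is durable up to this LSN
    disk: dict = field(default_factory=dict)   # page id -> (data, lsn)

    def log_change(self, page: Page, new_data: str) -> None:
        """Append a WAL record, then modify the buffered page."""
        lsn = len(self.wal) + 1
        self.wal.append((lsn, new_data))
        page.data, page.lsn = new_data, lsn

    def flush_wal(self, upto: int) -> None:
        """Mark WAL as durable up to `upto`; never moves backwards."""
        self.flushed_lsn = max(self.flushed_lsn, upto)

    def write_page(self, page_id: str, page: Page) -> None:
        """Write a buffer to disk, forcing WAL up to the page's LSN first."""
        if self.flushed_lsn < page.lsn:
            self.flush_wal(page.lsn)
        self.disk[page_id] = (page.data, page.lsn)


store = ToyStorage()
table, index = Page(), Page()
store.log_change(table, "row v1")            # WAL record with LSN 1
store.log_change(index, "entry -> row v1")   # WAL record with LSN 2

# Only the index page is written out, but the WAL is forced up to LSN 2,
# so the record for the table page (LSN 1) is durable as well.
store.write_page("index", index)
assert store.flushed_lsn == 2
assert "table" not in store.disk
```

In this model a crash right after write_page leaves the table page missing from disk, but its WAL record is already durable, so replay can bring the table page back in line with the index entry.
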
Thanks, my long-standing question has been answered.
Regards
Takayuki Tsunakawa