Re: Disaster! - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: Disaster!
Date
Msg-id 20040123212249.GA31024@dcc.uchile.cl
In response to Re: Disaster! (Martín Marqués <martin@bugs.unl.edu.ar>)
Responses Re: Disaster!  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Fri, Jan 23, 2004 at 05:58:33PM -0300, Martín Marqués wrote:

> Tom, could you give a small insight on what occurred here, why those 8k of zeros
> fixed it, and what is a "WAL replay"?

If I may ...

- the disk filled up
- Postgres registered something in WAL that required some commit status (WAL log space is preallocated on disk, so this didn't fail)
- the clog code tried to store information about the commit bits, but noticed that it needed to extend the clog file.
- the extension failed because the disk was full
- the server went down and a WAL replay was in order, but ...
- the WAL replay could not be done because the code tried to read a commit status in pg_clog that wasn't there

(some time later)

- Chris freed up some space and extended the clog
- WAL replay completed, reading an "uncommitted" status from the clog.

Here, "clog" is the "commit log", an area which indicates for each
transaction whether it committed or aborted.  A WAL replay is the
operation of bringing the data files (tables, indexes, etc) up to date
by reading the "Write-ahead log" (WAL).
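
To make the sequence above a bit more concrete, here is a toy model in
Python. It is only a sketch and not the actual clog code: the class name
ToyClog and its methods are made up for illustration, and it assumes the
layout as I understand it (two status bits per transaction, 8k pages, and
an all-zero status meaning "not committed yet"). It shows why a read past
the end of the file fails, and why appending 8k of zeros makes the same
read succeed with an "uncommitted" answer.

```python
BLCKSZ = 8192            # page size in bytes: the "8k" from the thread
XACTS_PER_BYTE = 4       # assumption: 2 status bits per transaction

# Assumed status codes; zero doubles as "nothing recorded yet".
IN_PROGRESS, COMMITTED, ABORTED = 0, 1, 2


class ToyClog:
    """Hypothetical in-memory stand-in for the commit log file."""

    def __init__(self):
        self.data = bytearray()

    def extend(self):
        # Append one zero-filled page, as the manual fix did on disk.
        self.data.extend(bytes(BLCKSZ))

    def _locate(self, xid):
        # Map a transaction id to (byte offset, bit shift inside that byte).
        byte_no, slot = divmod(xid, XACTS_PER_BYTE)
        return byte_no, slot * 2

    def get_status(self, xid):
        byte_no, shift = self._locate(xid)
        if byte_no >= len(self.data):
            # The situation WAL replay ran into: the page is not there.
            raise LookupError(f"no clog page for transaction {xid}")
        return (self.data[byte_no] >> shift) & 0b11

    def set_status(self, xid, status):
        byte_no, shift = self._locate(xid)
        if byte_no >= len(self.data):
            raise LookupError(f"no clog page for transaction {xid}")
        cleared = self.data[byte_no] & ~(0b11 << shift)
        self.data[byte_no] = cleared | (status << shift)


def replay_lookup(clog, xid):
    # Return the status, or None if the lookup fails like the replay did.
    try:
        return clog.get_status(xid)
    except LookupError:
        return None
```

With one page allocated, the first transaction that falls on the second
page (number BLCKSZ * XACTS_PER_BYTE in this model) cannot be looked up,
so replay_lookup() returns None. After one more extend() the same lookup
returns IN_PROGRESS, which replay can treat as "did not commit".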

Tom's answer will undoubtedly be better ...

-- 
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"What difference does it make to the dead, the orphans, and those who have
lost their homes, whether the mad destruction was carried out in the name
of totalitarianism or in the holy name of liberty and democracy?" (Gandhi)

