Thread: Need help understanding WAL and checkpoints

Need help understanding WAL and checkpoints

From
drew_hunt1976@yahoo.co.jp
Date:
Hi there=0A=0AI'm trying to get my head around WAL and checkpoints and need=
 to ask a couple of questions before I get a headache.=0A=0AFirstly, I see =
the terms "WAL log", "WAL file" and "transaction log" all over the place - =
are these the same thing (i.e. files in the pg_xlog directory)?=0A=0AI'm a =
bit confused by this paragraph in the docs:=0A=0A"Checkpoints are points in=
 the sequence of transactions at which it is guaranteed that the heap and i=
ndex data files have been updated with all information written before the c=
heckpoint. At checkpoint time, all dirty data pages are flushed to disk and=
 a special checkpoint record is written to the log file. (The changes were =
previously flushed to the WAL files.)"=0A=0A( see: http://www.postgresql.or=
g/docs/current/static/wal-configuration.html )=0A=0A"a special checkpoint r=
ecord is written to the log file."=0A=A0=A0=A0 -> which log file is meant h=
ere?=0A=0A"The changes were previously flushed to the WAL files."=0A=A0=A0 =
-> does "previously" here mean "at a previous point in time" or "in previou=
s PostgreSQL versions"?=0A=0A=A0=A0 -> at what point are changes flushed to=
 WAL files?=0A=0ASo say I perform an operation like :=0A=0A=A0 UPDATE foo S=
ET bar=3D'baz'=0A=0Aare the following assumptions correct?=0A=0A- The first=
 time this changed data hits the disk, it is as an entry in the WAL log=0A=
=0A- At some point a checkpoint occurs, and the changed data is written to =
the actual data file from system memory (the "dirty data pages"?)=0A=0A- th=
e only time the actual data files will be updated from the WAL log (i.e. no=
t from system memory) will be after a crash, when the logs are replayed fro=
m the last checkpoint?=0A=0Athanks for any help on this!=0A=0A- drew

Re: Need help understanding WAL and checkpoints

From
Albe Laurenz
Date:
drew_hunt wrote:
> I'm trying to get my head around WAL and checkpoints and need to ask a co=
uple of questions before I
> get a headache.
>=20
> Firstly, I see the terms "WAL log", "WAL file" and "transaction log" all =
over the place - are these
> the same thing (i.e. files in the pg_xlog directory)?

Usually they mean the same thing.
For exact definitions, read
http://www.postgresql.org/docs/current/static/continuous-archiving.html#BAC=
KUP-ARCHIVING-WAL
"WAL" consists of a stream of "WAL records" and is physically
represented as "WAL segment files" (in pg_xlog).

> I'm a bit confused by this paragraph in the docs:
>=20
> "Checkpoints are points in the sequence of transactions at which it is gu=
aranteed that the heap and
> index data files have been updated with all information written before th=
e checkpoint. At checkpoint
> time, all dirty data pages are flushed to disk and a special checkpoint r=
ecord is written to the log
> file. (The changes were previously flushed to the WAL files.)"
>=20
> ( see: http://www.postgresql.org/docs/current/static/wal-configuration.ht=
ml )
>=20
> "a special checkpoint record is written to the log file."
> =A0=A0=A0 -> which log file is meant here?

The WAL.

> "The changes were previously flushed to the WAL files."
> =A0=A0 -> does "previously" here mean "at a previous point in time" or "i=
n previous PostgreSQL versions"?

The former.

> =A0=A0 -> at what point are changes flushed to WAL files?

The change must be on disk in a WAL segment before the
transaction can commit.

> So say I perform an operation like :
>=20
> =A0 UPDATE foo SET bar=3D'baz'
>=20
> are the following assumptions correct?
>=20
> - The first time this changed data hits the disk, it is as an entry in th=
e WAL log
>=20
> - At some point a checkpoint occurs, and the changed data is written to t=
he actual data file from
> system memory (the "dirty data pages"?)
>=20
> - the only time the actual data files will be updated from the WAL log (i=
.e. not from system memory)
> will be after a crash, when the logs are replayed from the last checkpoin=
t?

All three are correct.

Yours,
Laurenz Albe

Re: Need help understanding WAL and checkpoints

From
drew_hunt1976@yahoo.co.jp
Date:
Hi Albe=0A=0A--- On Wed, 2013/2/6, Albe Laurenz <laurenz.albe@wien.gv.at> w=
rote:=0A=0A> drew_hunt wrote:=0A> > I'm trying to get my head around WAL an=
d checkpoints and need to ask a couple of questions before I=0A> > get a he=
adache.=0A> > =0A> > Firstly, I see the terms "WAL log", "WAL file" and "tr=
ansaction log" all over the place - are these=0A> > the same thing (i.e. fi=
les in the pg_xlog directory)?=0A> =0A> Usually they mean the same thing.=
=0A> For exact definitions, read=0A> http://www.postgresql.org/docs/current=
/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL=0A> "WAL" consists o=
f a stream of "WAL records" and is physically=0A> represented as "WAL segme=
nt files" (in pg_xlog).=0A> =0A> > I'm a bit confused by this paragraph in =
the docs:=0A> > =0A> > "Checkpoints are points in the sequence of transacti=
ons at which it is guaranteed that the heap and=0A> > index data files have=
 been updated with all information written before the checkpoint. At checkp=
oint=0A> > time, all dirty data pages are flushed to disk and a special che=
ckpoint record is written to the log=0A> > file. (The changes were previous=
ly flushed to the WAL files.)"=0A> > =0A> > ( see: http://www.postgresql.or=
g/docs/current/static/wal-configuration.html )=0A> > =0A> > "a special chec=
kpoint record is written to the log file."=0A> > =A0=A0=A0 -> which log fil=
e is meant here?=0A> =0A> The WAL.=0A> =0A> > "The changes were previously =
flushed to the WAL files."=0A> > =A0=A0 -> does "previously" here mean "at =
a previous point in time" or "in previous PostgreSQL versions"?=0A> =0A> Th=
e former.=0A> =0A> > =A0=A0 -> at what point are changes flushed to WAL fil=
es?=0A> =0A> The change must be on disk in a WAL segment before the=0A> tra=
nsaction can commit.=0A> =0A> > So say I perform an operation like :=0A> > =
=0A> > =A0 UPDATE foo SET bar=3D'baz'=0A> > =0A> > are the following assump=
tions correct?=0A> > =0A> > - The first time this changed data hits the dis=
k, it is as an entry in the WAL log=0A> > =0A> > - At some point a checkpoi=
nt occurs, and the changed data is written to the actual data file from=0A>=
 > system memory (the "dirty data pages"?)=0A> > =0A> > - the only time the=
 actual data files will be updated from the WAL log (i.e. not from system m=
emory)=0A> > will be after a crash, when the logs are replayed from the las=
t checkpoint?=0A> =0A> All three are correct.=0A=0AMany thanks for the answ=
ers, its cleared things up for me :)=0A=0A- drew