AW: [HACKERS] RE: [GENERAL] Transaction logging - Mailing list pgsql-hackers

From Zeugswetter Andreas IZ5
Subject AW: [HACKERS] RE: [GENERAL] Transaction logging
Date
Msg-id 219F68D65015D011A8E000006F8590C60339E07A@sdexcsrv1.f000.d0188.sd.spardat.at
Whole thread Raw
List pgsql-hackers
> Vadim Mikheev wrote:
> > 
> > > The "restore of a server" is a main problem here, but I suggest the
> > > following
> > > additional backup tool, that could be used for a "restore of a server"
> > > which could then be used for a rollforward and would also be a lot
> faster
> > > than a pg_dump:
> > >
> > > 1. place a vacuum lock on db (we don't want vacuum during backup)
> > > 2. backup pg_log using direct file access (something like dd bs=32k)
> >      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > > 3. backup the rest in any order (same as pg_log)
> > > 4. release vacuum lock
> > 
> > It looks like log archiving, not backup.
> > I believe that _full_ backup will do near the same
> > things as pg_dump now, but _incremental_ backup will
> > fetch info about what changed after last _full_ backup
> > from log.
> 
> Sorry, I was wrong. pg_dump is what's known as Export utility
> in Oracle and backup is quite different thing. But I have
> corrections for full backup described above:
> 
> 1. no vacuum lock is needed: all vacuum ops will be logged
>    in normal way to rollback changes in failures;
Yes.
> 2. all datafiles have to be backed up _before_ log backup
>    due to WAL logic: changes must be written to log before
>    they'll be written to on-disk data pages.
> 
When I was talking about pg_log, I meant pg_log as it is now.
As I understand it, it only stores commit/rollback info for each used xtid
and no other info.

This would be all we need, for a rollback of all transactions that were not 
committed at the time the backup began, as long as no vacuum removes
the old rows (and these are not reused). The xtid's that are higher than the

largest xtid in pg_log need also be rolled back. I am not sure though
whether 
we have enough info after the commit is flushed to the new row. 
This flush would have to be undone at restore time.

I like this approach more than always needing a transaction log at restore
time.
It makes it possible to configure a db to not write a transaction log,
as postgresql behaves now. After all a lot of installations only need to be
able 
to restore the database to the state it was at the last full backup.

The main issue is IMHO a very fast consistent online backup,
and a fast foolproof restore of same. The transaction log, and
rollforward comes after that.

Andreas

PS: for rollback you need the before image of rows, I would keep this in a
separate place like Oracle (rollback segment) and Informix (physical log)
since this info does not need to go to the rollforward tape.

Although if we did keep this info in the WAL, then postgresql could also do
a "rollback in time" by walking this log in the opposite direction.
Might be worth discussion.


pgsql-hackers by date:

Previous
From: "Gene Sokolov"
Date:
Subject: Re: Security WAS RE: [HACKERS] Updated TODO list
Next
From: "Ansley, Michael"
Date:
Subject: RE: Security WAS RE: [HACKERS] Updated TODO list