Re: Plans for solving the VACUUM problem - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Plans for solving the VACUUM problem
Msg-id 27745.990236257@sss.pgh.pa.us
In response to RE: Plans for solving the VACUUM problem  ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>)
List pgsql-hackers

"Mikheev, Vadim" <vmikheev@SECTORBASE.COM> writes:
>> Vadim, can you remind me what UNDO is used for?
> Ok, last reminder -:))

> On transaction abort, read WAL records and undo (rollback)
> changes made in storage. Would allow:

> 1. Reclaim space allocated by aborted transactions.
> 2. Implement SAVEPOINTs.
>    As a reminder -:) - when the server detects an error
>    (duplicate key, deadlock, mistyped command, etc.), the transaction
>    will be rolled back to the nearest implicit savepoint, set
>    just before query execution; or the transaction can be rolled back by
>    a ROLLBACK TO <savepoint_name> command to some explicit savepoint
>    set by the user. A transaction rolled back to a savepoint may be continued.
> 3. Reuse transaction IDs on postmaster restart.
> 4. Split pg_log into small files with ability to remove old ones (which
>    do not hold statuses for any running transactions).

Hm.  On the other hand, relying on WAL for undo means you cannot drop
old WAL segments that contain records for any open transaction.  We've
already seen several complaints that the WAL logs grow unmanageably huge
when there is a long-running transaction, and I think we'll see a lot
more.

It would be nicer if we could drop WAL records after a checkpoint or two,
even in the presence of long-running transactions.  We could do that if
we were only relying on them for crash recovery and not for UNDO.
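
The retention difference can be sketched with a toy model. This is not PostgreSQL code: the segment numbering, the function name, and the "keep two checkpoints" default are all assumptions made for illustration.

```python
# Hypothetical model: WAL segments are numbered 0, 1, 2, ...
# Each open transaction remembers the segment that holds its first record.

def oldest_segment_needed(checkpoint_segments, open_xact_start_segments,
                          wal_used_for_undo, checkpoints_to_keep=2):
    """Return the lowest-numbered WAL segment that must still be kept.

    Assumes checkpoint_segments is non-empty.
    """
    # Crash recovery only needs WAL back to the oldest of the last
    # few checkpoints.
    recent = sorted(checkpoint_segments)[-checkpoints_to_keep:]
    keep_from = recent[0]
    if wal_used_for_undo and open_xact_start_segments:
        # UNDO has to walk every open transaction backwards, so the
        # oldest open transaction pins the log.
        keep_from = min(keep_from, min(open_xact_start_segments))
    return keep_from


checkpoints = [90, 95, 100]   # segments where checkpoints were written
open_xacts = [3, 97]          # one long-running xact began in segment 3

print(oldest_segment_needed(checkpoints, open_xacts, wal_used_for_undo=False))
print(oldest_segment_needed(checkpoints, open_xacts, wal_used_for_undo=True))
```

With these made-up numbers, recovery-only retention keeps segments from 95 onward, while UNDO-based retention has to keep everything from segment 3.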

Looking at the advantages:

1. Space reclamation via UNDO doesn't excite me a whole lot, if we can
make lightweight VACUUM work well.  (I definitely don't like the idea
that after a very long transaction fails and aborts, I'd have to wait
another very long time for UNDO to do its thing before I could get on
with my work.  Would much rather have the space reclamation happen in
background.)

2. SAVEPOINTs would be awfully nice to have, I agree.

3. Reusing xact IDs would be nice, but there's an answer with a lot less
impact on the system: go to 8-byte xact IDs.  Having to shut down the
postmaster when you approach the 4-billion-transaction mark (the limit of
4-byte xact IDs) isn't going to impress people who want a 24x7 commitment,
anyway.
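
For a sense of scale, here is a back-of-the-envelope calculation of how long each ID width lasts. The rate of 1000 transactions per second is an assumed figure, not a measurement, and the function name is made up for this sketch.

```python
SECONDS_PER_DAY = 24 * 3600


def days_until_exhausted(id_bits, xacts_per_second):
    """Days until a counter of id_bits bits runs out at a steady rate."""
    return (2 ** id_bits) / xacts_per_second / SECONDS_PER_DAY


rate = 1000  # assumed sustained transactions per second

days_32 = days_until_exhausted(32, rate)
days_64 = days_until_exhausted(64, rate)

print(f"4-byte xact IDs: {days_32:.1f} days")
print(f"8-byte xact IDs: {days_64 / 365:.3g} years")
```

At that assumed rate a 4-byte counter is used up in roughly 50 days, while an 8-byte counter lasts hundreds of millions of years.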

4. Recycling pg_log would be nice too, but we've already discussed other
hacks that might allow pg_log to be kept finite without depending on
UNDO (or requiring postmaster restarts, IIRC).

I'm sort of thinking that undoing back to a savepoint is the only real
usefulness of WAL-based UNDO.  Is it practical to preserve the WAL log
just back to the last savepoint in each xact, not the whole xact?

Another thought: do we need WAL UNDO at all to implement savepoints?
Is there some way we could do them like nested transactions, wherein
each savepoint-to-savepoint segment is given its own transaction number?
Committing multiple xact IDs at once might be a little tricky, but it
seems like a narrow, soluble problem.  Implementing UNDO without
creating lots of performance issues looks a lot harder.
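
The nested-transaction idea can be sketched as a toy model. Everything here is hypothetical: `XactLog` merely stands in for pg_log, and the class and method names are invented. In particular the loop in `commit()` is not atomic, which is exactly the tricky part mentioned above.

```python
import itertools

COMMITTED, ABORTED, IN_PROGRESS = "committed", "aborted", "in progress"


class XactLog:
    """Toy stand-in for pg_log: maps xact ID -> status."""

    def __init__(self):
        self.status = {}
        self._next = itertools.count(1)

    def new_xid(self):
        xid = next(self._next)
        self.status[xid] = IN_PROGRESS
        return xid


class Transaction:
    """One user-level transaction made of several per-segment xact IDs."""

    def __init__(self, log):
        self.log = log
        self.segments = [log.new_xid()]  # live xids, oldest first
        self.savepoints = {}             # name -> index of first xid after it

    @property
    def current_xid(self):
        # Tuples written now would be stamped with this xid.
        return self.segments[-1]

    def savepoint(self, name):
        # Work done after the savepoint gets its own xid.
        self.savepoints[name] = len(self.segments)
        self.segments.append(self.log.new_xid())

    def rollback_to(self, name):
        start = self.savepoints[name]
        # Abort every segment begun at or after the savepoint; no data
        # pages are touched, the tuples simply become invisible.
        for xid in self.segments[start:]:
            self.log.status[xid] = ABORTED
        del self.segments[start:]
        # Savepoints established later than this one no longer exist.
        self.savepoints = {n: i for n, i in self.savepoints.items()
                           if i <= start}
        # Continue the transaction under a fresh xid.
        self.segments.append(self.log.new_xid())

    def commit(self):
        # A real implementation would have to make this appear atomic.
        for xid in self.segments:
            self.log.status[xid] = COMMITTED


log = XactLog()
xact = Transaction(log)     # work here runs under xid 1
xact.savepoint("s1")        # work here runs under xid 2
xact.rollback_to("s1")      # xid 2 aborted, work continues under xid 3
xact.commit()
print(log.status)
```

In this run xids 1 and 3 end up committed and xid 2 aborted, so the work done between the savepoint and the rollback is discarded without reading any WAL records backwards.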
        regards, tom lane

