Re: Plans for solving the VACUUM problem - Mailing list pgsql-hackers

From: Vadim Mikheev
Subject: Re: Plans for solving the VACUUM problem
Date:
Msg-id: 002d01c0e0f7$376b59a0$4c79583f@sectorbase.com
In response to: RE: Plans for solving the VACUUM problem ("Mikheev, Vadim" <vmikheev@SECTORBASE.COM>)
Responses: Re: Plans for solving the VACUUM problem (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers
> Hm.  On the other hand, relying on WAL for undo means you cannot drop
> old WAL segments that contain records for any open transaction.  We've
> already seen several complaints that the WAL logs grow unmanageably huge
> when there is a long-running transaction, and I think we'll see a lot
> more.
> 
> It would be nicer if we could drop WAL records after a checkpoint or two,
> even in the presence of long-running transactions.  We could do that if
> we were only relying on them for crash recovery and not for UNDO.

As you understand, this is an old, well-known problem in database practice,
described in the books. There are two ways: either abort transactions that
run too long, or (and) compact old log segments - fetch and save (for use in
undo) the records of long-running transactions and remove the rest. Neither
way is perfect, but nothing is perfect at all -:)
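Roughly, the compaction step looks like this (a toy sketch only; the record
layout, the open-xid list and compact_segment() are made-up stand-ins, nothing
like the real WAL format - it just shows the keep-or-drop decision):

#include <stdio.h>

typedef struct { int xid; const char *what; } LogRecord;

static int
xid_is_open(int xid, const int *open, int nopen)
{
    for (int i = 0; i < nopen; i++)
        if (open[i] == xid)
            return 1;
    return 0;
}

/* keep only the records of still-open transactions (they may be
 * needed for undo); the rest of the old segment can be thrown away */
static int
compact_segment(LogRecord *seg, int n, const int *open, int nopen)
{
    int kept = 0;

    for (int i = 0; i < n; i++)
        if (xid_is_open(seg[i].xid, open, nopen))
            seg[kept++] = seg[i];       /* saved for undo */
    return kept;
}

int
main(void)
{
    LogRecord seg[] = { {7, "insert"}, {8, "update"},
                        {7, "update"}, {9, "delete"} };
    int open[] = {7};                   /* xact 7 is still running */
    int n = compact_segment(seg, 4, open, 1);

    for (int i = 0; i < n; i++)
        printf("kept for undo: xid=%d %s\n", seg[i].xid, seg[i].what);
    return 0;
}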

> 1. Space reclamation via UNDO doesn't excite me a whole lot, if we can
> make lightweight VACUUM work well.  (I definitely don't like the idea

Sorry, but I'm going to consider background vacuum as a temporary solution
only. As I've already pointed out, the original PG authors eventually became
disillusioned with the same approach. What is good about using UNDO for 1.
is that WAL records give you *direct* physical access to the changes that
have to be rolled back.
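To illustrate what "direct" means: if each record carried a back-link to the
same transaction's previous record, undo would be a plain backward walk - no
scanning of heap pages to find what to roll back. A toy sketch (the record
layout and back-chain are invented for illustration, not the actual WAL
format):

#include <stdio.h>

typedef struct
{
    int         xid;        /* owning transaction */
    int         undo_prev;  /* same xact's previous record; -1 = first */
    const char *change;     /* stand-in for the old tuple image */
} UndoRecord;

/* a toy in-memory "log" with two interleaved transactions */
static UndoRecord wal[] = {
    {1, -1, "xact 1: insert tuple A"},  /* 0 */
    {2, -1, "xact 2: insert tuple B"},  /* 1 */
    {1,  0, "xact 1: update tuple A"},  /* 2 */
    {1,  2, "xact 1: delete tuple C"},  /* 3 */
};

/* roll back one transaction by walking its private back-chain */
static void
undo_transaction(int last)
{
    for (int i = last; i >= 0; i = wal[i].undo_prev)
        printf("undo: %s\n", wal[i].change);
}

int
main(void)
{
    undo_transaction(3);    /* aborts xact 1; xact 2 is untouched */
    return 0;
}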

> that after a very long transaction fails and aborts, I'd have to wait
> another very long time for UNDO to do its thing before I could get on
> with my work.  Would much rather have the space reclamation happen in
> background.)

Understandable, but why should other transactions have to read the dirty
data again and again while waiting for background vacuum? I think aborted
transactions should take some responsibility for the mess they made -:)
And keeping 2. in mind, a very long transaction could be continued instead
of aborted -:)

> 2. SAVEPOINTs would be awfully nice to have, I agree.
> 
> 3. Reusing xact IDs would be nice, but there's an answer with a lot less
> impact on the system: go to 8-byte xact IDs.  Having to shut down the
> postmaster when you approach the 4Gb transaction mark isn't going to
> impress people who want a 24x7 commitment, anyway.

+8 bytes in the tuple header is not such a tiny thing.
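Where the +8 comes from: xmin and xmax would each grow from 4 to 8 bytes.
A quick check, with simplified stand-ins for the real HeapTupleHeader
fields:

#include <stdio.h>
#include <stdint.h>

/* the real header has more fields; this only shows the growth
 * of the two visibility xids */
typedef struct { uint32_t xmin, xmax; } Xids32;
typedef struct { uint64_t xmin, xmax; } Xids64;

int
main(void)
{
    printf("4-byte xids: %zu bytes\n", sizeof(Xids32));   /* 8  */
    printf("8-byte xids: %zu bytes\n", sizeof(Xids64));   /* 16 */
    /* +8 bytes on every tuple; with narrow rows that is real bloat */
    return 0;
}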

> 4. Recycling pg_log would be nice too, but we've already discussed other
> hacks that might allow pg_log to be kept finite without depending on
> UNDO (or requiring postmaster restarts, IIRC).

We did... and didn't reach agreement.

> I'm sort of thinking that undoing back to a savepoint is the only real
> usefulness of WAL-based UNDO. Is it practical to preserve the WAL log
> just back to the last savepoint in each xact, not the whole xact?

No, it's not. It's not possible in overwriting systems at all - all of a
transaction's records are required: if the transaction later aborts
entirely, every change back to its very first record must still be undone.

> Another thought: do we need WAL UNDO at all to implement savepoints?
> Is there some way we could do them like nested transactions, wherein
> each savepoint-to-savepoint segment is given its own transaction number?
> Committing multiple xact IDs at once might be a little tricky, but it
> seems like a narrow, soluble problem.

Implicit savepoints wouldn't be possible - and that is a very convenient
feature I've found in Oracle: a savepoint is set before each statement, so
a failed statement rolls back by itself without aborting the transaction.
And the additional code in tqual.c wouldn't be a good addition.
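A sketch of that extra tqual.c burden: with one xid per savepoint-to-savepoint
segment, every visibility check would have to search the current transaction's
whole set of xids instead of comparing against a single one (names and layout
are invented for illustration):

#include <stdio.h>

#define MAX_SEGMENT_XIDS 64

typedef struct
{
    unsigned int xids[MAX_SEGMENT_XIDS];  /* one xid per segment */
    int          nxids;
} MyTransaction;

/* every MVCC visibility test would need this loop instead of a
 * single comparison against the current xid */
static int
xid_is_mine(const MyTransaction *xact, unsigned int xmin)
{
    for (int i = 0; i < xact->nxids; i++)
        if (xact->xids[i] == xmin)
            return 1;
    return 0;
}

int
main(void)
{
    MyTransaction xact = {{100, 101, 102}, 3};   /* three segments */

    printf("xmin=101 mine? %d\n", xid_is_mine(&xact, 101));  /* 1 */
    printf("xmin=200 mine? %d\n", xid_is_mine(&xact, 200));  /* 0 */
    return 0;
}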

> Implementing UNDO without creating lots of performance issues looks
> a lot harder.

What *performance* issues?!
The only issue is the additional disk space required.

Vadim
