Re: Mulit-Vesions and Vacuum - Mailing list pgsql-general

From nconway@klamath.dyndns.org (Neil Conway)
Subject Re: Mulit-Vesions and Vacuum
Date
Msg-id 20020719022855.GA17496@klamath.dyndns.org
Whole thread Raw
In response to Mulit-Vesions and Vacuum  (Anthony Berglas <anthony.berglas@lucida.com>)
List pgsql-general
On Thu, Jul 18, 2002 at 06:43:03PM -0700, Anthony Berglas wrote:
> Does anyone really know how this multi version concurrency really works?

Yes, lots of people -- namely, those that read the documentation.

> 1. How long do previous version hang arround?  As long as the oldest
> transaction, or until a Vacuum?

I may as well quote the docs:

 In normal PostgreSQL operation, an UPDATE or DELETE of a row does not
 immediately remove the old tuple (version of the row). This approach is
 necessary to gain the benefits of multiversion concurrency control (see
 the User's Guide): the tuple must not be deleted while it is still
 potentially visible to other transactions. But eventually, an outdated
 or deleted tuple is no longer of interest to any transaction. The space
 it occupies must be reclaimed for reuse by new tuples, to avoid
 infinite growth of disk space requirements. This is done by running
 VACUUM.

(from http://developer.postgresql.org/docs/postgres/routine-vacuuming.html)

> 2. How expensive are they?  Do the multi-version copies clutter the main
> database blocks, or are they stored elsewhere?  And if I update one field of
> a row, is the entire row copied or just the field?

Sequential scans will need to read through old tuples.

In 7.2 and earlier versions, dead tuples remain in indexes; to determine
whether a tuple found from an index scan is visible, the backend needs
to check the heap tuple (since the index entry doesn't store transaction
visibility info).

In 7.3, Tom Lane wrote the code to allow index scans to ignore dead
tuples dynamically (without the need for a VACUUM) -- the dead index
tuple still exists, but the index scan ignores it. Search for "tuple
killing" in the pgsql-hackers archives for more information.

> 3. What does Vacuum really do?  Will it interfere with open transactions, or
> will it know to leave multi version copies for existing transactions?

This is from the same page I quoted above:

 Beginning in PostgreSQL 7.2, the standard form of VACUUM can run in
 parallel with normal database operations (selects, inserts, updates,
 deletes, but not changes to table schemas).

> PS.  I have done a few experiements on how the locking really works, see
> http://www.SimpleORM.org/DBNotes.html.

On that page, you note "We do not see any of the changes made by Session
One, including the phantoms. This is in contradiction to the PostgreSQL
documentation, but good." Where is the contradiction between the
behavior you observed, and the documentation? For example:

 Read Committed  is the default isolation level in PostgreSQL. When a
 transaction runs on this isolation level, a SELECT query sees only data
 committed before the query began; it never sees either uncommitted data
 or changes committed during query execution by concurrent transactions.
 (However, the SELECT does see the effects of previous updates executed
 within its own transaction, even though they are not yet committed.) In
 effect, a SELECT query sees a snapshot of the database as of the
 instant that that query begins to run. Notice that two successive
 SELECTs can see different data, even though they are within a single
 transaction, if other transactions commit changes during execution of
 the first SELECT.

(from
http://developer.postgresql.org/docs/postgres/xact-read-committed.html)

Cheers,

Neil

--
Neil Conway <neilconway@rogers.com>
PGP Key ID: DB3C29FC

pgsql-general by date:

Previous
From: Anthony Berglas
Date:
Subject: Mulit-Vesions and Vacuum
Next
From: Martijn van Oosterhout
Date:
Subject: Re: Mulit-Vesions and Vacuum