Re: How many Mandatory Process are there in Postgres - Mailing list pgsql-general

From Scott Marlowe
Subject Re: How many Mandatory Process are there in Postgres
Date
Msg-id dcc563d11003252052p6f03e2ebi9bc4679e192b4f3f@mail.gmail.com
Whole thread Raw
In response to Re: How many Mandatory Process are there in Postgres  (Tadipathri Raghu <traghu.dba@gmail.com>)
Responses Re: How many Mandatory Process are there in Postgres  (Tadipathri Raghu <traghu.dba@gmail.com>)
List pgsql-general
On Thu, Mar 25, 2010 at 9:03 PM, Tadipathri Raghu <traghu.dba@gmail.com> wrote:
> Hi Scott, Thomas,
>
> Thank you for the update.
>
>>
>> >> Oracle uses a completely different implementation of MVCC architecture.
>> >> It
>> >> overwrites the data and then uses rollback segments to provide
>> >> 'previous
>> >> versions' to running transactions etc.
>> >>
>> >> PostgreSQL does not overwrite the data - it just creates a copy of the
>> >> row
>> >> and then decides which version should each session see (depending on
>> >> the
>> >> transaction IDs etc.). So it does not need to do rollbacks the way
>> >> Oracle
>> >> does, but it has to remove stale copies of the rows (such that no
>> >> running
>> >> transaction can see) - that's why there is VACUUM.
>> >
>> >
>> > Here, if you have issued a command pg_start_backup() at that time the
>> > cluster is freezed, and if any transaction takes place before the
>> > pg_stop_backup() issued at that time where the transaction data will be
>> > kept
>> > if the undo's are not there.
>>
>>
>> What do you mean by 'freezed'? The cluster operates normally, the
>> pg_start_backup() just creates a backup label (and performs a checkpoint),
>> but that's not a problem. OK, there could be a performance decrease
>> because of full page writes, but the data will be processed as if there is
>> no backup running. PostgreSQL does not need the checkpoints to perform
>> backup.
>
>> The 'UNDO' data is just kept in the main data files.  Then, based on what
>> your xid is, and the xmin / xmax on each row, it's either visible or not.
>>  That's what vacuum does, clean up the rows that are for 'undo' and can
>> never been seen anymore.  The REDO itself is kept in the transaction logs.
>
>
>>
>> Like Thomas says, the pg_start_backup() just creates a label that tells
>> the restoring database what pg_xlog record to start with when you 'restore'
>> your data.  The 'UNDO' type of data (just the old rows actually) is just
>> kept in the main table until vacuum nukes 'em.
>
> --Scott
> Q1. Does vacuum process come into existence even you turn off the
> autovacuum.? What is the main work for vacuum process, to the clean the
> buffers or work only when autovacuum in on and to clean up the rows that are
> 'undo'

1a: Yes, if you are approaching transaction id wraparound (txid wrap)
then autovacuum will crank up a vacuum to fix that situation whether
you like it or not.  The alternative is db shutdown.
1b: The main job of vacuum is to free dead tuples, which postgresql
accumulates as it updates or deletes tuples. When you have several old
versions of a tuple to collect, vacuum recycles the space and makes it
available to other updates to now use instead of allocating on the end
of the relation.

> Q2. Do you mean to say there is a vacuum process which will clean up the
> buffers that are for 'undo'?

No buffers for undo.  It's all stored on disc, ready for instant
access.  Try it on a test db.  begin; load a table with 1M rows;
rollback; sub second response.  A million changes just disappeared
instantly.  And now you've got 1M dead tuples in that table.  Vacuum's
job is to make them available for re-use.  But they're all in one big
consecutive file so the performance is actually pretty good as the
table repopulates, especially if it'll get back up to 1M records again
in the near future.  As long as it got vacuumed.

> Q3. Fine, if the vacuum is off then your data files will hold the undo data
> also in it?

Yep.  Until a forced vacuum due to txid wrap occurs.

pgsql-general by date:

Previous
From: Tadipathri Raghu
Date:
Subject: Re: How many Mandatory Process are there in Postgres
Next
From: Pete Kay
Date:
Subject: Re: Problem with Memory Leak