Re: Incomplete description of pg_start_backup? - Mailing list pgsql-hackers

From	Dmitry Koterov
Subject	Re: Incomplete description of pg_start_backup?
Date	May 24, 2013 18:34:23
Msg-id	CA+CZih6L2w+BcLH4_EmhthdJDiiygGH5oApLuB2UvzSb6bCeag@mail.gmail.com
In response to	Re: Incomplete description of pg_start_backup? (Jeff Janes <jeff.janes@gmail.com>)
Responses	Re: Incomplete description of pg_start_backup?
List	pgsql-hackers

I still don't get it.

Suppose we have a data file whose blocks hold important (non-empty) data:

A B C D

1. I call pg_start_backup().

2. Tar starts copying block A to the destination archive...

3. While that copy is in progress, somebody deletes rows from a table that is stored in block B. The deleted rows become candidates for vacuuming, and the block is marked as free space.

4. Somebody writes rows to a table, and they are placed in that free space, i.e. in block B. The change is also added to the WAL (so the data is now stored in two places: in block B and in the WAL).

5. Tar (at last!) finishes copying block A and begins to copy block B.

6. It finishes that, then copies C and D to the archive too.

7. Then we call pg_stop_backup() and also archive the collected WAL (which contains the new data of block B, as we saw above).
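
The steps above can be sketched as a toy simulation. This is only a rough model of the scenario as I describe it, not of PostgreSQL internals: the names `run_scenario`, `data_file`, `archive` and `wal` are made up, and a "block" is just a string.

```python
# Toy model of steps 1-7. All names are hypothetical; this is not PostgreSQL code.

def run_scenario():
    # Live data file with four non-empty blocks.
    data_file = {"A": "a1", "B": "b_old", "C": "c1", "D": "d1"}
    archive = {}  # what tar writes to the destination archive
    wal = []      # records logged between pg_start_backup() and pg_stop_backup()

    # Steps 1-2: the backup starts and tar copies block A.
    archive["A"] = data_file["A"]

    # Step 3: rows in block B are deleted and vacuumed; B becomes free space.
    # The steps above do not say whether this is logged, so nothing is logged here.
    data_file["B"] = None

    # Step 4: new rows are placed in block B, and the change is logged to the WAL.
    data_file["B"] = "b_new"
    wal.append(("B", "b_new"))

    # Steps 5-6: tar copies B, C and D as they look at that moment.
    for block in ("B", "C", "D"):
        archive[block] = data_file[block]

    # Step 7: pg_stop_backup(); the collected WAL is archived with the copy.
    return archive, wal


archive, wal = run_scenario()
print("archive:", archive)
print("wal:", wal)
print("old B in archive?", "b_old" in archive.values())
```

In this model neither the archive nor the WAL holds the old contents of block B.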

The question is: where is the OLD data of block B in this scheme? It seems it is NOT in the backup, so it cannot be restored. (If we never overwrote blocks between pg_start_backup() and pg_stop_backup(), and only ever appended new data, this would not be a problem.) It seems to me this is not documented at all! That is what my initial e-mail was about.

(I have one hypothesis about this, but I am not sure. Here it is: does vacuum save ALL the deleted data of block B to the WAL in step 3, before deleting it? If so, that data is, of course, part of the backup. But that wastes a lot of space...)

On Tue, May 14, 2013 at 6:05 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Mon, May 13, 2013 at 4:31 PM, Dmitry Koterov <dmitry@koterov.ru> wrote:
Could you please provide a bit more detailed explanation on how it works?

And how could postgres write in the middle of archiving files during an active pg_start_backup? If it could, there might be a case when a part of an archived data file contains overwritten information "from the future",

The data files cannot contain information from the future. If the backup is restored, it must be restored to the time of pg_stop_backup (at least), which means the data would at that point be from the past/present, not the future.
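
As a toy illustration of that point, here is a minimal sketch. It assumes a simplified WAL whose records carry whole block images, which is an assumption made only for this example and glosses over how PostgreSQL really works; `restore`, `archive`, `wal` and `live_at_stop` are hypothetical names.

```python
def restore(archive, wal, stop_index):
    """Replay WAL records [0, stop_index) over a copy of the archived blocks."""
    blocks = dict(archive)
    for block, image in wal[:stop_index]:
        blocks[block] = image  # replaying an already-applied image changes nothing
    return blocks


# Hypothetical backup: block A was copied before it changed, block B after.
archive = {"A": "a_old", "B": "b_new"}
# Everything logged between pg_start_backup() and pg_stop_backup().
wal = [("A", "a_new"), ("B", "b_new")]
# State of the live data file at the moment of pg_stop_backup().
live_at_stop = {"A": "a_new", "B": "b_new"}

# The raw copy mixes two points in time and matches no real state of the file.
print(archive == live_at_stop)

# Replaying the WAL up to pg_stop_backup() brings every block to one moment.
print(restore(archive, wal, len(wal)) == live_at_stop)
```

In this model the raw copy is never used on its own. It only becomes a valid state once the WAL has been replayed at least up to the stop point, and by then the "newer" block is no longer from the future.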

Cheers,

Jeff
