Re: PITR, checkpoint, and local relations - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: PITR, checkpoint, and local relations
Date
Msg-id 200208012114.g71LE4l00726@candle.pha.pa.us
In response to PITR, checkpoint, and local relations  ("J. R. Nield" <jrnield@usol.com>)
Responses Re: PITR, checkpoint, and local relations  ("J. R. Nield" <jrnield@usol.com>)
List pgsql-hackers
J.R. needs comments on this.  PITR has problems because local relations
aren't included in checkpoints.  Suggestions?

---------------------------------------------------------------------------

J. R. Nield wrote:
> As per earlier discussion, I'm working on the hot backup issues as part
> of the PITR support. While I was looking at the buffer manager and the
> relcache/MyDb issues to figure out the best way to work this, it
> occurred to me that PITR will introduce a big problem with the way we
> handle local relations.
> 
> The basic problem is that local relations (rd_myxactonly == true) are
> not part of a checkpoint, so there is no way to get a lower bound on the
> starting LSN needed to recover a local relation. In the past this did
> not matter, because either the local file would be (effectively)
> discarded during recovery because it had not yet become visible, or the
> file would be flushed before the transaction creating it made it
> visible. Now this is a problem.
> 
> So I need a decision from the core team on what to do about the local
> buffer manager. My preference would be to forget about the local buffer
> manager entirely or, failing that, to allow it only for _true_
> temporary data. The only alternative I can devise is to create some way
> for all other backends to participate in a checkpoint, perhaps using a
> signal. I'm not sure this can be done safely. 
> 
> Anyway, I'm glad the tuplesort stuff doesn't try to use relation files
> :-)
> 
> Can the core team let me know if this is acceptable, and whether I
> should move ahead with changes to the buffer manager (and some other
> stuff) needed to avoid special treatment of rd_myxactonly relations?
> 
> Also to Richard: have you guys at Multera dealt with this issue already?
> Is there some way around this that I'm missing?
> 
> 
> Regards,
> 
>   John Nield
> 
> 
> 
> 
> Just as an example of this problem, imagine the following sequence:
> 
> 1) Transaction TX1 creates a local relation LR1 which will eventually
> become a globally visible table. Tuples are inserted into the local
> relation, and logged to the WAL file. Some tuples remain in the local
> buffer cache and are not yet written out, although they are logged. TX1
> is still in progress.
> 
> 2) Backup starts, and checkpoint is called to get a minimum starting LSN
> (MINLSN) for the backed-up files. Only the global buffers are flushed.
> 
> 3) Backup process copies LR1 into the backup directory. (postulate some
> way of coordinating with the local buffer manager, a problem I have not
> solved).
> 
> 4) TX1 commits and flushes its local buffers. A dirty buffer exists
> whose LSN is before MINLSN. LR1 becomes globally visible.
> 
> 5) Backup finishes copying all the files, including the local relations,
> and then flushes the log. The log files between MINLSN and the current
> LSN are copied to the backup directory, and backup is done.
> 
> 6) Sometime later, a system administrator restores the backup and plays
> the logs forward starting at MINLSN. LR1 will be corrupt, because some
> of the log entries required for its restoration will be before MINLSN.
> This corruption will not be detected until something goes wrong.
> 
> BTW: The problem doesn't only happen with backup! It occurs at every
> checkpoint as well; I just missed it until I started working on the hot
> backup issue.
> 
> -- 
> J. R. Nield
> jrnield@usol.com
> 
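
The six-step sequence quoted above can be sketched as a toy simulation.
This is a minimal illustration in Python, not PostgreSQL code: the names
run_scenario, replay, and checkpoint_flushes_local are hypothetical, the
WAL is modeled as a plain list, and the extra insert after the checkpoint
is an assumption added only to show that records at or after MINLSN do
replay correctly.

```python
def replay(backup_pages, wal, start_lsn):
    """Restore: start from the backed-up file and apply WAL from start_lsn on."""
    pages = dict(backup_pages)
    for lsn, page, value in wal:
        if lsn >= start_lsn:
            pages[page] = value
    return pages


def run_scenario(checkpoint_flushes_local):
    """Toy model of the quoted steps 1-6 for a single local relation LR1."""
    wal = []          # (lsn, page, value) records logged for LR1
    lr1_disk = {}     # pages of LR1 already written to its file
    local_dirty = {}  # TX1's private (local) dirty buffers

    def insert(page, value):
        # Every insert is logged to WAL but stays in the local buffer cache.
        wal.append((len(wal) + 1, page, value))
        local_dirty[page] = value

    def flush_local():
        lr1_disk.update(local_dirty)
        local_dirty.clear()

    # Step 1: TX1 inserts tuples; they are logged but not yet written out.
    insert(0, "a")
    insert(1, "b")

    # Step 2: checkpoint picks MINLSN. Only global buffers are flushed
    # unless the (hypothetical) flag makes local buffers participate.
    if checkpoint_flushes_local:
        flush_local()
    min_lsn = len(wal) + 1

    # Step 3: the backup copies LR1's file as it is on disk right now.
    backup = dict(lr1_disk)

    # Step 4: TX1 commits and flushes its local buffers.
    # (Assumption: one more insert first, to show post-MINLSN replay.)
    insert(2, "c")
    flush_local()

    # Steps 5-6: restore the backup and play the log forward from MINLSN.
    restored = replay(backup, wal, min_lsn)
    return restored, dict(lr1_disk)


if __name__ == "__main__":
    for flag in (False, True):
        restored, live = run_scenario(flag)
        print(flag, "restored:", restored, "live:", live)
```

With checkpoint_flushes_local=False the restored copy of LR1 holds only
the page logged after MINLSN, so it silently differs from the live file.
With True it matches, which corresponds to having local buffers take part
in the checkpoint.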

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 

