Re: Checkpoint cost, looks like it is WAL/CRC - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: Checkpoint cost, looks like it is WAL/CRC |
Date | |
Msg-id | 200507071622.j67GMmf19838@candle.pha.pa.us |
In response to | Re: Checkpoint cost, looks like it is WAL/CRC (Simon Riggs <simon@2ndquadrant.com>) |
List | pgsql-hackers |
Simon Riggs wrote:
> On Wed, 2005-07-06 at 18:22 -0400, Bruce Momjian wrote:
> > Well, I added #1 yesterday as 'full_page_writes', and it has the same
> > warnings as fsync (namely, on crash, be prepared to recovery or check
> > your system thoroughly.
>
> Yes, which is why I comment now that the GUC alone is not enough.
>
> There is no way to "check your system thoroughly". If there is a certain
> way of knowing torn pages had *not* occurred, then I would be happy.

Yep, it is a pain, and like fsync.

> > As far as #2, my posted proposal was to write the full pages to WAL when
> > they are written to the file system, and not when they are first
> > modified in the shared buffers --- the goal being that it will even out
> > the load, and it will happen in a non-critical path, hopefully by the
> > background writer or at checkpoint time.
>
> The page must be written before the changes to the page are written, so
> that they are available sequentially in the log for replay. The log and
> the database are not connected, so we cannot do it that way. If the page
> is written out of sequence from the changes to it, how would recovery
> know where to get the page from?

See my later email --- the full page will be restored later from WAL, so
our changes don't have to be made at that point.

> ISTM there is mileage in your idea of trying to shift the work to
> another time. My thought is "which blocks exactly are the ones being
> changed?". Maybe that would lead to a reduction.
>
> > > With wal_changed_pages= off *any* crash would possibly require an
> > > archive recovery, or a replication rebuild. It's good that we now have
> > > PITR, but we do also have other options for availability. Users of
> > > replication could well be amongst the first to try out this option.
> >
> > Seems it is similar to fsync in risk, which is not a new option.
>
> Risk is not acceptable. We must have certainty, either way.
>
> Why have two GUCs? Why not just have one GUC that does both at the same
> time? When would you want one but not the other?
> risk_data_loss_to_gain_performance = true

Yep, one new one might make sense.

> > I think if we document full_page_writes as similar to fsync in risk, we
> > are OK for 8.1, but if something can be done easily, it sounds good.
>
> Documenting something simply isn't enough. I simply cannot advise
> anybody ever to use the new GUC. If their data was low value, they
> wouldn't even be using PostgreSQL, they'd use a non-transactional DBMS.
>
> I agree we *must* have the GUC, but we also *must* have a way for crash
> recovery to tell us for certain that it has definitely worked, not just
> maybe worked.

Right. I am thinking your CRC write to WAL might do that.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
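
[Illustration, not part of the original message.] The thread turns on whether crash recovery can know for certain that no torn page occurred. One reading of the "CRC write to WAL" idea is sketched below in Python. It is only a toy model and does not reflect PostgreSQL's actual code: the 512-byte atomic sector size, the helper names (page_crc, torn_write, matches_logged_crc), and the choice of zlib's CRC-32 are all assumptions made for the example.

```python
import zlib

PAGE_SIZE = 8192     # PostgreSQL's default block size
SECTOR_SIZE = 512    # assumption: the disk writes 512-byte sectors atomically


def page_crc(page: bytes) -> int:
    """CRC-32 over the whole page image (zlib's CRC, chosen for illustration)."""
    return zlib.crc32(page) & 0xFFFFFFFF


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first `sectors_written` sectors hit disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


def matches_logged_crc(on_disk: bytes, logged_crc: int) -> bool:
    """Recovery-time check: does the page on disk match the CRC kept in the log?"""
    return page_crc(on_disk) == logged_crc


old_page = bytes([0x11]) * PAGE_SIZE
new_page = bytes([0x22]) * PAGE_SIZE

# Hypothetical: the CRC of the new page image is logged before the data write.
logged_crc = page_crc(new_page)

complete = torn_write(old_page, new_page, PAGE_SIZE // SECTOR_SIZE)
partial = torn_write(old_page, new_page, 5)   # crash after 5 of 16 sectors

print("complete write verified:", matches_logged_crc(complete, logged_crc))
print("partial write verified: ", matches_logged_crc(partial, logged_crc))
```

In this model a mismatch tells recovery that the page is not the image the log expected, which covers both a torn page and a write that never started. It does not repair the page; that would still need a full page image or an archive recovery.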