Re: Enabling Checksums - Mailing list pgsql-hackers
From | Jeff Davis |
---|---|
Subject | Re: Enabling Checksums |
Date | |
Msg-id | 1357798000.5162.38.camel@jdavis-laptop Whole thread Raw |
In response to | Re: Enabling Checksums (Jeff Davis <pgsql@j-davis.com>) |
Responses |
Re: Enabling Checksums
Re: Enabling Checksums |
List | pgsql-hackers |
On Tue, 2012-12-04 at 01:03 -0800, Jeff Davis wrote: > For now, I rebased the patches against master, and did some very minor > cleanup. I think there is a problem here when setting PD_ALL_VISIBLE. I thought I had analyzed that before, but upon review, it doesn't look right. Setting PD_ALL_VISIBLE needs to be associated with a WAL action somehow, and a bumping of the LSN, otherwise there is a torn page hazard. The solution doesn't seem particularly difficult, but there are a few strange aspects and I'm not sure exactly which path I should take. First of all, the relationship between MarkBufferDirty and SetBufferCommitInfoNeedsSave is a little confusing. The comment over MarkBufferDirty is confusing because it says that the caller must have an exclusive lock, or else bad data could be written. But that doesn't have to do with marking the buffer dirty, that has to do with the data page change you make while you are marking it dirty -- if it's a single bit change, then there is no risk that I can see. In the current code, the only real code difference between the two is that SetBufferCommitInfoNeedsSave might fail to mark the buffer dirty if there is a race. So, in the current code, we could actually combine the two by passing a "force" flag (if true, behaves like MarkBufferDirty, if false, behaves like SetBufferCommitInfoNeedsSave). The checksums patch also introduces another behavior into SetBufferCommitInfoNeedsSave, which is to write an XLOG_HINT WAL record if checksums are enabled (to avoid torn page hazards). That's only necessary for changes where the caller does not write WAL itself and doesn't bump the LSN of the data page. (There's a reason the caller can't easily write the XLOG_HINT WAL itself.) So, we could introduce another flag "needsWAL" that would control whether we write the XLOG_HINT WAL or not (only applies with checksums on, of course). The reason for all of this is because the setting of PD_ALL_VISIBLE does not fit MarkBufferDirty, because MarkBufferDirty does not write the XLOG_HINT WAL and neither does the caller. But it also doesn't fit SetBufferCommitInfoNeedsSave, because that is subject to a race. If MarkBufferDirty had the signature: MarkBufferDirty(Buffer buffer, bool force, bool needsWAL) then "normal" page changes would look like: MarkBufferDirty(buffer, true, false) setting PD_ALL_VISIBLE would look like: MarkBufferDirty(buffer, true, true) and setting a hint would look like: MarkBufferDirty(buffer, false, true) Another approach would be for the caller who sets PD_ALL_VISIBLE to write WAL. But that requires inventing a new WAL record or chaining the heap block onto the wal entry when doing visibilitymap_set (only necessary when checksums are on). That seems somewhat of a hack, but perhaps it's not too bad. Also, I have another patch posted that is removing PD_ALL_VISIBLE entirely, which is dampening my enthusiasm to do too much work that might be thrown away. So right now, I'm leaning toward just adding the heap buffer to the WAL chain during visibilitymap_set. Thoughts? Regards,Jeff Davis
pgsql-hackers by date: