Re: WAL format changes - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: WAL format changes |
Date | |
Msg-id | 201206142358.12431.andres@2ndquadrant.com |
In response to | WAL format changes (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>) |
Responses | Re: WAL format changes |
List | pgsql-hackers |
On Thursday, June 14, 2012 11:01:42 PM Heikki Linnakangas wrote:
> As I threatened earlier
> (http://archives.postgresql.org/message-id/4FD0B1AB.3090405@enterprisedb.com),
> here are three patches that change the WAL format. The goal is to
> change the format so that when you're inserting a WAL record of a given
> size, you know exactly how much space it requires in the WAL.
>
> 1. Use a 64-bit segment number, instead of the log/seg combination. And
> don't waste the last segment on each logical 4 GB log file. The concept
> of a "logical log file" is now completely gone. XLogRecPtr is unchanged,
> but it should now be understood as a plain 64-bit value, just split into
> two 32-bit integers for historical reasons. On disk, this means that
> there will be log files ending in FF, those were skipped before.

What's the reason for keeping that awkward split now? There aren't that many
users of xlogid/xrecoff, and many of those would be better served by using
helper macros.
API compatibility isn't a great argument either, as code that manually plays
around with those fields needs to be checked anyway. I think there might be
some code around that does XLogRecPtr addition manually and such.

> 2. Always include the xl_rem_len field, used for continuation records,
> in the xlog page header. A continuation log record only contained that
> one field, it's now included straight in the page header, so the concept
> of a continuation record doesn't exist anymore. Because of alignment,
> this wastes 4 bytes on every page that contains continued data from a
> previous record, and 8 bytes on pages that don't. That's not very much,
> and the next step will buy that back:
>
> 3. Allow WAL record header to be split across pages. Per Tom's
> suggestion, move xl_tot_len to be the first field in XLogRecord, so that
> even if the header is split, xl_tot_len is always on the first page.
> xl_crc is moved to be the last field, and xl_prev is the second to last.
> This has the advantage that you can calculate the CRC for all the other
> fields before acquiring WALInsertLock. For xl_prev, you need to know
> where exactly the record is inserted, so it's handy that it's the last
> field before CRC. This patch doesn't try to take advantage of that,
> however, and I'm not sure if that makes any difference once I finish the
> patch to make XLogInsert scale better, which is the ultimate goal of all
> this.
>
> Those are the three patches I'd like to get committed in this
> commitfest. To see where all this is leading to, I've included a rough
> WIP version of the XLogInsert scaling patch. This version is quite
> different from the one I posted in spring, it takes advantage of the WAL
> format changes, and I'm also experimenting with a different method of
> tracking how far each WAL insertion has progressed. But more on that later.
>
> (Note to self: remember to bump XLOG_PAGE_MAGIC)

Will review.

Andres

-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
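
To illustrate the helper-macro point above, here is a rough sketch of what such helpers might look like if XLogRecPtr stays a pair of 32-bit fields but is treated as one 64-bit value. The struct mirrors the xlogid/xrecoff layout described in the thread; the function names (recptr_to_u64, u64_to_recptr, recptr_advance, recptr_segno) are made up for this example and are not taken from the patches, and the segment size is passed in as a parameter rather than read from any real configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Two 32-bit halves, as described in the thread (layout assumed here). */
typedef struct XLogRecPtr
{
    uint32_t xlogid;    /* high 32 bits of the WAL position */
    uint32_t xrecoff;   /* low 32 bits of the WAL position */
} XLogRecPtr;

/* Hypothetical helper: view the pointer as a plain 64-bit byte position. */
static inline uint64_t
recptr_to_u64(XLogRecPtr p)
{
    return ((uint64_t) p.xlogid << 32) | p.xrecoff;
}

/* Hypothetical helper: split a 64-bit byte position back into the halves. */
static inline XLogRecPtr
u64_to_recptr(uint64_t v)
{
    XLogRecPtr p;

    p.xlogid = (uint32_t) (v >> 32);
    p.xrecoff = (uint32_t) v;
    return p;
}

/*
 * Hypothetical helper: advance a pointer by nbytes. Doing the addition in
 * 64 bits carries from xrecoff into xlogid without any special casing.
 */
static inline XLogRecPtr
recptr_advance(XLogRecPtr p, uint64_t nbytes)
{
    return u64_to_recptr(recptr_to_u64(p) + nbytes);
}

/*
 * Hypothetical helper: 64-bit segment number for a given segment size,
 * with no segment skipped at the end of each 4 GB range.
 */
static inline uint64_t
recptr_segno(XLogRecPtr p, uint64_t seg_size)
{
    assert(seg_size > 0);
    return recptr_to_u64(p) / seg_size;
}
```

With helpers of this kind, code that currently adds to xrecoff by hand and checks for wraparound would call something like recptr_advance() instead, which is where manual XLogRecPtr arithmetic tends to go wrong.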