Reducing size of WAL record headers - Mailing list pgsql-hackers
From | Simon Riggs |
---|---|
Subject | Reducing size of WAL record headers |
Date | |
Msg-id | CA+U5nMJKvGhBF0Zwvg0-fuLisXf+Okue7_9fxAShwmq2UBM0KA@mail.gmail.com |
Responses |
Re: Reducing size of WAL record headers
Re: Reducing size of WAL record headers |
List | pgsql-hackers |
Overall, the WAL record is MAXALIGN'd, so with 8-byte alignment we waste 4 bytes per record. Put another way, if we could reduce the record header by 4 bytes, we would actually reduce it by 8 bytes per record. So looking for ways to do that seems like a good idea.

The WAL record header starts with xl_tot_len, a 4-byte field. There is also another length field, xl_len. The difference is that xl_tot_len includes the header, the xl_len bytes of record data, and any backup blocks. Since the header size is fixed, the only time xl_tot_len != SizeOfXLogRecord + xl_len is when we have backup blocks.

We can re-arrange the record layout so that we remove xl_tot_len and add another 4-byte field after the record header, xl_bkpblock_len (maxaligned, so 8 bytes), that exists only if we have backup blocks. This would save 8 bytes from every record that doesn't have backup blocks, and be the same size as now for records with backup blocks.

The only problem is that we currently allow WAL records to be written so that the header wraps across pages. This allows us to save space in WAL when we have between 5 and 32 bytes spare at the end of a page. To reduce the header size by 8 bytes we would need to ensure that the whole header, which would now be 24 or 32 bytes, is all on one page.

My math tells me that would waste on average 12 bytes per page because of the end-of-page wastage, but would gain 8 bytes per record when we don't have backup blocks. My thinking is that the end-of-page loss would be much reduced on average when we have backup blocks, so we could ignore that case. Assuming typically 100 records per page when we have no backup blocks, this is a considerable upside. We would make gains on any page with 3 or more WAL records on it, so there is low downside even in the worst cases. That seems like a great break-even point for optimisation.

Since we've already changed the WAL format in this release, another change seems OK.
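
The alignment and per-page arithmetic above can be sketched in a few lines of C. This is only an illustration, not code from the PostgreSQL tree: the MAXALIGN macro here is a local stand-in that assumes 8-byte maximum alignment, net_saving_per_page is a hypothetical helper, and the 28-byte unaligned header size is inferred from the "waste 4 bytes per record" figure rather than taken from the source.

```c
#include <stddef.h>

/* Assumption: 8-byte maximum alignment, as in the message above. */
#define MAXIMUM_ALIGNOF 8

/* Round len up to the next multiple of MAXIMUM_ALIGNOF (local stand-in). */
#define MAXALIGN(len) \
    (((size_t) (len) + (MAXIMUM_ALIGNOF - 1)) & ~((size_t) (MAXIMUM_ALIGNOF - 1)))

/*
 * Hypothetical helper: net bytes saved on one WAL page, given the number of
 * records on the page, the bytes saved per record, and the bytes lost at the
 * end of the page because the header may no longer wrap. A negative result
 * means the page got worse.
 */
long
net_saving_per_page(long records_per_page,
                    long saved_per_record,
                    long wasted_per_page)
{
    return records_per_page * saved_per_record - wasted_per_page;
}
```

With these definitions, MAXALIGN(28) is 32 and MAXALIGN(24) is 24, which is why trimming 4 bytes from the header saves 8 bytes per record on disk. Using the averages quoted above, net_saving_per_page(100, 8, 12) comes to 788 bytes per page.
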
More to the point, we can remove backup blocks in the common case without changing the WAL format, so this might be the last time we have the chance to make this change.

Forcing the XLogRecord header to be all on one page makes the format more robust and simplifies the code that copes with header wrapping.

With the format changes it's still possible to work out the length of the WAL record precisely

    = SizeOfXLogRecord + (HasBkpBlocks ? SizeOf(uint32) : 0) + xl_len

and so it would then be protected by the WAL record CRC.

Thoughts?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services