Re: XLog size reductions: smaller XLRec block header for PG17 - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: XLog size reductions: smaller XLRec block header for PG17 |
Date | |
Msg-id | CALDaNm2Wg9OwUumwd9oPsFEGfF8j_LA3eLjJdUzDNuX9eTMLDA@mail.gmail.com |
In response to | Re: XLog size reductions: smaller XLRec block header for PG17 (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses | Re: XLog size reductions: smaller XLRec block header for PG17 |
List | pgsql-hackers |
On Tue, 26 Sept 2023 at 02:09, Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>
> On Tue, 19 Sept 2023 at 01:03, Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-05-18 19:22:26 +0300, Heikki Linnakangas wrote:
> > > On 18/05/2023 17:59, Matthias van de Meent wrote:
> > > > It changes the block IDs used to fit in 6 bits, using the upper 2 bits
> > > > of the block_id field to store how much data is contained in the
> > > > record (0, <=UINT8_MAX, or <=UINT16_MAX bytes).
> > >
> > > Perhaps we should introduce a few generic inline functions to do varint
> > > encoding. That could be useful in many places, while this scheme is very
> > > tailored for XLogRecordBlockHeader.
>
> This scheme is reused later for the XLogRecord xl_tot_len field over
> at [0], and FWIW is thus being reused. Sure, it's tailored to this WAL
> use case, but IMO we're getting good value from it. We don't use
> protobuf or JSON for WAL, we use our own serialization format. Having
> some specialized encoding/decoding in that format for certain fields
> is IMO quite acceptable.
>
> > Yes - I proposed that and wrote an implementation of reasonably efficient
> > varint encoding. Here's my prototype:
> > https://postgr.es/m/20221004234952.anrguppx5owewb6n%40awork3.anarazel.de
>
> As I mentioned on that thread, that prototype has a significant
> probability of doing nothing to improve WAL size, or even increasing
> the WAL size for installations which consume a lot of OIDs.
>
> > I think it's a bad tradeoff to write lots of custom varint encodings, just to
> > eek out a bit more space savings.
>
> This is only a single "custom" varint encoding though, if you can even
> call it that. It makes a field's size depend on flags set in another
> byte, which is not that much different from the existing use of
> XLR_BLOCK_ID_DATA_[LONG, SHORT].
>
> > The increase in code complexity IMO makes it a bad tradeoff.
>
> Pardon me for asking, but what would you consider to be a good
> tradeoff then? I think the code relating to the WAL storage format is
> about as simple as you can get it within the feature set it provides
> and the size of the resulting records. While I think there is still
> much to gain w.r.t. WAL record size, I don't think we can get much of
> those improvements without adding at least some amount of complexity,
> something I think to be true for most components in PostgreSQL.
>
> So, except for redesigning significant parts of the public WAL APIs,
> are we just going to ignore any potential improvements because they
> "increase code complexity"?

I'm seeing that there has been no activity in this thread for nearly 4 months, I'm planning to close this in the current commitfest unless someone is planning to take it forward.

Regards,
Vignesh
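
For readers following the thread, the scheme quoted above (a 6-bit block ID with the upper 2 bits of the same byte recording whether 0, at most UINT8_MAX, or at most UINT16_MAX bytes of data follow) can be sketched roughly as below. This is only an illustration and is not taken from the patch: every name here (`encode_block_prefix`, `decode_block_prefix`, `SIZE_NONE`, and so on) is hypothetical, the length bytes are written little-endian purely as an assumption for the sketch, and the fourth size-class value is left unused.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical; they do not come from PostgreSQL. */
#define BLKID_MASK      0x3F    /* low 6 bits hold the block ID */
#define SIZECLASS_SHIFT 6       /* high 2 bits hold the size class */

enum size_class
{
    SIZE_NONE = 0,      /* no data; no length bytes follow */
    SIZE_UINT8 = 1,     /* data length stored in 1 byte */
    SIZE_UINT16 = 2     /* data length stored in 2 bytes */
};

/*
 * Write the combined block-ID/size-class byte followed by 0, 1 or 2
 * length bytes. Returns the number of bytes written (1 to 3).
 */
static size_t
encode_block_prefix(uint8_t *out, uint8_t block_id, uint16_t data_len)
{
    size_t      n = 0;

    assert(block_id <= BLKID_MASK);

    if (data_len == 0)
        out[n++] = (uint8_t) (block_id | (SIZE_NONE << SIZECLASS_SHIFT));
    else if (data_len <= UINT8_MAX)
    {
        out[n++] = (uint8_t) (block_id | (SIZE_UINT8 << SIZECLASS_SHIFT));
        out[n++] = (uint8_t) data_len;
    }
    else
    {
        out[n++] = (uint8_t) (block_id | (SIZE_UINT16 << SIZECLASS_SHIFT));
        /* Assumption for this sketch: little-endian length bytes. */
        out[n++] = (uint8_t) (data_len & 0xFF);
        out[n++] = (uint8_t) (data_len >> 8);
    }
    return n;
}

/*
 * Read back what encode_block_prefix wrote. Returns the number of bytes
 * consumed, or 0 if the unused size class (3) is encountered.
 */
static size_t
decode_block_prefix(const uint8_t *in, uint8_t *block_id, uint16_t *data_len)
{
    size_t      n = 0;
    uint8_t     first = in[n++];

    *block_id = first & BLKID_MASK;

    switch (first >> SIZECLASS_SHIFT)
    {
        case SIZE_NONE:
            *data_len = 0;
            break;
        case SIZE_UINT8:
            *data_len = in[n++];
            break;
        case SIZE_UINT16:
            *data_len = (uint16_t) (in[n] | (in[n + 1] << 8));
            n += 2;
            break;
        default:
            return 0;
    }
    return n;
}
```

In this sketch a block with no data costs one byte and a block with up to 255 bytes of data costs two, compared with a fixed-width 16-bit length field that would always take three bytes together with the ID. The price is one extra branch on decode, which is the kind of added complexity the thread is arguing about.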