Re: [PATCH] Clarify the comments about varlena header encoding - Mailing list pgsql-hackers
From | John Naylor |
---|---|
Subject | Re: [PATCH] Clarify the comments about varlena header encoding |
Date | |
Msg-id | CAFBsxsEriQKtU9fEDA8cf=0FME62EZyUXZ+csgz73mTUoZcSwQ@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Clarify the comments about varlena header encoding (Aleksander Alekseev <aleksander@timescale.com>) |
Responses |
Re: [PATCH] Clarify the comments about varlena header encoding
|
List | pgsql-hackers |
On Tue, Sep 6, 2022 at 5:19 PM Aleksander Alekseev <aleksander@timescale.com> wrote: > > Hi John, > > Thanks for the feedback. > > > Maybe "00xxxxxx 4-byte length word (aligned)," is more clear about > > what is aligned. Also, adding all those xxx's obscures the point that > > we only need to examine one byte to figure out what to do next. > > IMO "00xxxxxx 4-byte length word" is still confusing. One can misread > this as a 00-xx-xx-xx hex value, where the first byte (not two bits) > is 00h. The top of the comment literally says * Bit layouts for varlena headers on big-endian machines: ...but maybe we can say at the top that we inspect the first byte to determine what kind of header it is. Or put the now-standard 0b in front. > > The patch has: > > > > + * In the third case the va_tag field (see varattrib_1b_e) is used to discern > > + * the specific type and length of the pointer datum. On disk the "xxx" bits > > + * currently always store sizeof(varatt_external) + 2. > > > > ...so not sure where 17 came from. > > Right, AFTER applying the patch it's clear that it's actually 18 > bytes. Okay, I see now that this quote from your first email: "This is misleading too. The comments above this line say that `struct varatt_external` is a TOAST pointer. sizeof(varatt_external) = 16, plus 1 byte equals 17, right? However the documentation [1] claims the result should be 18:" ...is not your thought, but one of a fictional misled reader. I actually found this phrase more misleading than the header comments. :-) I think the problem is ambiguity about what a "toast pointer" is. This comment: * struct varatt_external is a traditional "TOAST pointer", that is, the has caused people to think a toasted value in the main relation takes up 16 bytes on disk sizeof(varatt_external) = 16, when it's actually 18. Is the 16 the "toast pointer" or the 18? > > - * 10000000 1-byte length word, unaligned, TOAST pointer > > + * 10000000 xxxxxxxx, TOAST pointer (struct varatt_external) > > > > This implies that the header is two bytes, which is not accurate. That > > next byte is a type tag: > > [...] > > ...and does not always represent the on-disk length: > > Well, the comments don't say what is the header and what is the type > tag. Because the comments explain the following macros that read bits in the *first* byte of a 1- or 4-byte header to determine what kind it is. > They merely describe the bit layouts. The patch doesn't seem to > make things worse in this respect. Do you think we should address this > too? I suspect that describing the difference between the header and > the type tag here will create even more confusion. I said nothing about describing the difference between the header and type tag. The patch added xxx's for the type tag in a comment about the header. This is more misleading than what is there now. -- John Naylor EDB: http://www.enterprisedb.com
pgsql-hackers by date: