Thread: remove ATTRIBUTE_FIXED_PART_SIZE
Since the introduction of the CATALOG_VARLEN stuff, the fixed size of pg_attribute is exactly sizeof(FormData_pg_attribute), so the ancient mechanism to track the fixed size manually using ATTRIBUTE_FIXED_PART_SIZE can be removed. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > Since the introduction of the CATALOG_VARLEN stuff, the fixed size of > pg_attribute is exactly sizeof(FormData_pg_attribute), so the ancient > mechanism to track the fixed size manually using > ATTRIBUTE_FIXED_PART_SIZE can be removed. Meh, I'm not sure about this. What about the possibility of trailing padding after the last fixed column, as the comment you propose to remove is talking about? Sure, we don't have that today, but we might again the next time somebody adds a column to pg_attribute. regards, tom lane
On August 18, 2018 8:37:00 PM GMT+02:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> Since the introduction of the CATALOG_VARLEN stuff, the fixed size of >> pg_attribute is exactly sizeof(FormData_pg_attribute), so the ancient >> mechanism to track the fixed size manually using >> ATTRIBUTE_FIXED_PART_SIZE can be removed. > >Meh, I'm not sure about this. What about the possibility of trailing >padding after the last fixed column, as the comment you propose to >remove is talking about? Sure, we don't have that today, but we >might again the next time somebody adds a column to pg_attribute. Where would avoiding that padding be an important thing her? Strictly speaking it's not even safe to access the last memberswithout the padding present - the compiler is free to store with a wider write if that just affects padding. GCC doesthat in some cases IIRC. Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Andres Freund <andres@anarazel.de> writes: > On August 18, 2018 8:37:00 PM GMT+02:00, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Meh, I'm not sure about this. What about the possibility of trailing >> padding after the last fixed column, as the comment you propose to >> remove is talking about? Sure, we don't have that today, but we >> might again the next time somebody adds a column to pg_attribute. > Where would avoiding that padding be an important thing her? Strictly speaking it's not even safe to access the last memberswithout the padding present - the compiler is free to store with a wider write if that just affects padding. GCC doesthat in some cases IIRC. If that's true, it's causing problems already, no? If you've got something like int somefield; bool someflag; #ifdef CATALOG_VARLEN text somevarlena; #endif then our tuple packing rules will feel free to put the text value immediately adjacent to the bool, when it's short-header. According to what you're saying, code that tries to update "someflag" in a tuple risks corrupting the text value. Contrariwise, if the varlena field is NULL, there won't be any valid data at all after the bool. You would get away with copying the tuple using a word-aligned struct definition anyway, physically, but I bet valgrind would be unhappy with you. Possibly we need to be more careful than we are now about whether there's padding at the end of the fixed-size fields ... but just ripping out the code that attempts to deal with that is hardly an improvement. regards, tom lane
On 18/08/2018 23:05, Tom Lane wrote: > Possibly we need to be more careful than we are now about whether > there's padding at the end of the fixed-size fields ... but just > ripping out the code that attempts to deal with that is hardly > an improvement. I don't think the tuple packing issue has to do with the tuple descriptor code. The tuple descriptors already use allocations of size sizeof(FormData_pg_attribute) (CreateTemplateTupleDesc), just the memcpy and memset calls use (potentially) less. That might have saved a few bytes for omitting the varlena fields, but I don't think it affects alignment correctness. If we, say, added a trailing char field now, the only thing this code -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 20/08/2018 12:32, Peter Eisentraut wrote: > On 18/08/2018 23:05, Tom Lane wrote: >> Possibly we need to be more careful than we are now about whether >> there's padding at the end of the fixed-size fields ... but just >> ripping out the code that attempts to deal with that is hardly >> an improvement. > > I don't think the tuple packing issue has to do with the tuple > descriptor code. The tuple descriptors already use allocations of size > sizeof(FormData_pg_attribute) (CreateTemplateTupleDesc), just the memcpy > and memset calls use (potentially) less. That might have saved a few > bytes for omitting the varlena fields, but I don't think it affects > alignment correctness. If we, say, added a trailing char field now, the > only thing this code [oops] ... the only thing the current code would accomplish is not copying the last three padding bytes, which might even be slower than copying all four. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 2018-08-20 12:34:15 +0200, Peter Eisentraut wrote: > On 20/08/2018 12:32, Peter Eisentraut wrote: > > On 18/08/2018 23:05, Tom Lane wrote: > >> Possibly we need to be more careful than we are now about whether > >> there's padding at the end of the fixed-size fields ... but just > >> ripping out the code that attempts to deal with that is hardly > >> an improvement. > > > > I don't think the tuple packing issue has to do with the tuple > > descriptor code. The tuple descriptors already use allocations of size > > sizeof(FormData_pg_attribute) (CreateTemplateTupleDesc), just the memcpy > > and memset calls use (potentially) less. That might have saved a few > > bytes for omitting the varlena fields, but I don't think it affects > > alignment correctness. If we, say, added a trailing char field now, the > > only thing this code > > [oops] > > ... the only thing the current code would accomplish is not copying the > last three padding bytes, which might even be slower than copying all four. That's not generally the case though, there's code like: static VacAttrStats * examine_attribute(Relation onerel, int attnum, Node *index_expr) { ... /* * Create the VacAttrStats struct. Note that we only have a copy of the * fixed fields of the pg_attribute tuple. */ stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats)); stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE); memcpy(stats->attr, attr, ATTRIBUTE_FIXED_PART_SIZE); Accesses to stats->attr can legitimately assume that the padding at the tail end of attr is present (i.e. widening a load / store). Greetings, Andres Freund
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: >> I don't think the tuple packing issue has to do with the tuple >> descriptor code. The tuple descriptors already use allocations of size >> sizeof(FormData_pg_attribute) (CreateTemplateTupleDesc), just the memcpy >> and memset calls use (potentially) less. That might have saved a few >> bytes for omitting the varlena fields, but I don't think it affects >> alignment correctness. If we, say, added a trailing char field now, the >> only thing this code > ... the only thing the current code would accomplish is not copying the > last three padding bytes, which might even be slower than copying all four. But, if the varlena fields are all null, those bytes might not be there ... at least, not according to valgrind. I agree this is all moot as long as there's no pad bytes. What I'm trying to figure out is if we need to put in place some provisions to prevent there from being pad bytes at the end of any catalog struct. According to what Andres is saying, it seems like we do (at least for ones with varlena fields). regards, tom lane
On 20/08/2018 12:46, Andres Freund wrote: > static VacAttrStats * > examine_attribute(Relation onerel, int attnum, Node *index_expr) > { > ... > /* > * Create the VacAttrStats struct. Note that we only have a copy of the > * fixed fields of the pg_attribute tuple. > */ > stats = (VacAttrStats *) palloc0(sizeof(VacAttrStats)); > stats->attr = (Form_pg_attribute) palloc(ATTRIBUTE_FIXED_PART_SIZE); > memcpy(stats->attr, attr, ATTRIBUTE_FIXED_PART_SIZE); > > Accesses to stats->attr can legitimately assume that the padding at the > tail end of attr is present (i.e. widening a load / store). Yeah, that's probably just broken. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 20/08/2018 15:14, Tom Lane wrote: > I agree this is all moot as long as there's no pad bytes. What I'm > trying to figure out is if we need to put in place some provisions > to prevent there from being pad bytes at the end of any catalog struct. > According to what Andres is saying, it seems like we do (at least for > ones with varlena fields). Yes, I think there could be a problem. I took a brief look through the catalogs, and while there are plenty of catalogs with trailing padding, finding that in combination with trailing varlena fields that might legitimately be all null in practice might require a closer look. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 21/08/2018 17:38, Peter Eisentraut wrote: > On 20/08/2018 15:14, Tom Lane wrote: >> I agree this is all moot as long as there's no pad bytes. What I'm >> trying to figure out is if we need to put in place some provisions >> to prevent there from being pad bytes at the end of any catalog struct. >> According to what Andres is saying, it seems like we do (at least for >> ones with varlena fields). > > Yes, I think there could be a problem. I took a brief look through the > catalogs, and while there are plenty of catalogs with trailing padding, > finding that in combination with trailing varlena fields that might > legitimately be all null in practice might require a closer look. Looking into this a bit more, a few catalogs could use some BKI_FORCE_NOT_NULL settings, which then avoids the described situation. See attached patch. That leaves pg_constraint and pg_event_trigger where you can construct legitimate tuples where the fixed portion has trailing padding and the variable fields can all be null. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes: > On 21/08/2018 17:38, Peter Eisentraut wrote: >> On 20/08/2018 15:14, Tom Lane wrote: >>> I agree this is all moot as long as there's no pad bytes. What I'm >>> trying to figure out is if we need to put in place some provisions >>> to prevent there from being pad bytes at the end of any catalog struct. >>> According to what Andres is saying, it seems like we do (at least for >>> ones with varlena fields). >> Yes, I think there could be a problem. I took a brief look through the >> catalogs, and while there are plenty of catalogs with trailing padding, >> finding that in combination with trailing varlena fields that might >> legitimately be all null in practice might require a closer look. > Looking into this a bit more, a few catalogs could use some > BKI_FORCE_NOT_NULL settings, which then avoids the described situation. > See attached patch. Um ... this would be enough to document that we don't think there's a *read* hazard, but Andres was claiming that there's also a *write* hazard. That is, if you store into (say) a bool in the last declared column, the compiler will think it's free to trash what it believes are padding bytes at the end of the struct. This is problematic: if there's a short-header varlena there, the overwrite clobbers data directly; and even if the varlena is aligned, our tuple traversal code requires the alignment pad bytes to be zero. I think what we might need to do, in places where the end of struct is not already aligned, is to do something like bool is_instead; #ifdef CATALOG_VARLEN /* variable-length fields start here */ pg_node_tree ev_qual; pg_node_tree ev_action; #else uint8 dummy; #endif and then teach the Catalog.pm routines to ignore the "dummy" fields while emitting the .bki data. However, I'd want some automatic verification that we'd added dummy fields where and only where needed. BTW, I have no objection to documenting should-not-be-null varlena fields as you suggest. But it's cosmetic and doesn't really fix the problem at hand. regards, tom lane
Hi, On 2018-08-24 11:47:43 -0400, Tom Lane wrote: > Um ... this would be enough to document that we don't think there's a > *read* hazard, but Andres was claiming that there's also a *write* hazard. > That is, if you store into (say) a bool in the last declared column, > the compiler will think it's free to trash what it believes are padding > bytes at the end of the struct. This is problematic: if there's a > short-header varlena there, the overwrite clobbers data directly; and even > if the varlena is aligned, our tuple traversal code requires the alignment > pad bytes to be zero. Right. The relevant standardese, in C11 (C99 very similar), is: 6.2.6.1 General, 6): "When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values." I don't have the references at hand, but I'm fairly sure that at least gcc and clang can be made to exploit that. > I think what we might need to do, in places where the end of struct is > not already aligned, is to do something like > > bool is_instead; > > #ifdef CATALOG_VARLEN /* variable-length fields start here */ > pg_node_tree ev_qual; > pg_node_tree ev_action; > #else > uint8 dummy; > #endif > > and then teach the Catalog.pm routines to ignore the "dummy" fields while > emitting the .bki data. However, I'd want some automatic verification > that we'd added dummy fields where and only where needed. > > BTW, I have no objection to documenting should-not-be-null varlena > fields as you suggest. But it's cosmetic and doesn't really fix > the problem at hand. Hm, this is fairly ugly. I can't quite come up with something better right now tho :(. I occasionally wonder when it's time to give up 1:1 mapping between the structs and the catalog and force explicit conversion inbetween. But obviously that's a large change.. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2018-08-24 11:47:43 -0400, Tom Lane wrote: >> Um ... this would be enough to document that we don't think there's a >> *read* hazard, but Andres was claiming that there's also a *write* hazard. > Right. The relevant standardese, in C11 (C99 very similar), is: > 6.2.6.1 General, 6): > "When a value is stored in an object of structure or union type, including in a member > object, the bytes of the object representation that correspond to any padding bytes take > unspecified values." > I don't have the references at hand, but I'm fairly sure that at least > gcc and clang can be made to exploit that. Thing is, if that's true, why have we not seen field reports of catalog corruption problems? Maybe we're just fortunate that we don't try to update the last fixed field of any catalog that way? regards, tom lane
On August 24, 2018 9:16:27 AM PDT, Tom Lane <tgl@sss.pgh.pa.us> wrote: >Andres Freund <andres@anarazel.de> writes: >> On 2018-08-24 11:47:43 -0400, Tom Lane wrote: >>> Um ... this would be enough to document that we don't think there's >a >>> *read* hazard, but Andres was claiming that there's also a *write* >hazard. > >> Right. The relevant standardese, in C11 (C99 very similar), is: >> 6.2.6.1 General, 6): >> "When a value is stored in an object of structure or union type, >including in a member >> object, the bytes of the object representation that correspond to any >padding bytes take >> unspecified values." > >> I don't have the references at hand, but I'm fairly sure that at >least >> gcc and clang can be made to exploit that. > >Thing is, if that's true, why have we not seen field reports of catalog >corruption problems? Maybe we're just fortunate that we don't try to >update the last fixed field of any catalog that way? I suspect that the code doing so usually is "too boring" to present many chances for optimization. And that, as you say,we largely update earlier fields (without resorting to deforming and forming the tuple at least). Andres -- Sent from my Android device with K-9 Mail. Please excuse my brevity.