Thread: fix_PGSTAT_NUM_TABENTRIES_macro patch
In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro
attempts to subtract off the size of the PgStat_MsgTabstat
struct up to the m_entry[] field. This macro was correct up
until the fields m_block_read_time and m_block_write_time
were added to that struct, as the macro was not changed to
include their size.
This patch fixes the macro.
As an aside, the PGSTAT_MSG_PAYLOAD define from which
these field sizes are being subtracted is a bit of a WAG, if
I am reading the code correctly, in which case whether the
two additional fields are subtracted from that WAG is perhaps
not critical. But the code is certainly easier to understand if
the macro matches the definition of the struct.
Mark Dilger
attempts to subtract off the size of the PgStat_MsgTabstat
struct up to the m_entry[] field. This macro was correct up
until the fields m_block_read_time and m_block_write_time
were added to that struct, as the macro was not changed to
include their size.
This patch fixes the macro.
As an aside, the PGSTAT_MSG_PAYLOAD define from which
these field sizes are being subtracted is a bit of a WAG, if
I am reading the code correctly, in which case whether the
two additional fields are subtracted from that WAG is perhaps
not critical. But the code is certainly easier to understand if
the macro matches the definition of the struct.
Mark Dilger
Attachment
Mark Dilger <markdilger@yahoo.com> writes: > In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro > attempts to subtract off the size of the PgStat_MsgTabstat > struct up to the m_entry[] field.� This macro was correct up > until the fields m_block_read_time and m_block_write_time > were added to that struct, as the macro was not changed to > include their size. Yeah, that's a bug. > This patch fixes the macro. I'm inclined to think that we should look for a less breakable way to manage the message size limit; if Robert missed this issue in that patch, other people are going to miss it in future patches. The existing coding isn't really right anyway when you consider possible alignment padding. (I think the present struct definitions probably don't involve any padding, but that's not a very future-proof assumption either.) It'd be better to somehow drive the calculation from offsetof(m_entry). I'm not seeing any way to do that that doesn't require some notational changes in the code, though. One way would be to rejigger it like this: #define PGSTAT_MAX_MSG_SIZE 1000 typedef union PgStat_MsgTabstat { struct { PgStat_MsgHdr hdr; Oid databaseid; int nentries; int xact_commit; int xact_rollback; PgStat_Counter block_read_time; /* times in microseconds */ PgStat_Counter block_write_time; PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER]; } m; char padding[PGSTAT_MAX_MSG_SIZE]; /* pad sizeof this union to target */ } PgStat_MsgTabstat; #define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry)) but then we'd have to run around and replace "m_hdr" with "m.hdr" etc in the code referencing the message types that use this mechanism. Kind of annoying. > As an aside, the PGSTAT_MSG_PAYLOAD define from which > these field sizes are being subtracted is a bit of a WAG, if > I am reading the code correctly, in which case whether the > two additional fields are subtracted from that WAG is perhaps > not critical.� But the code is certainly easier to understand if > the macro matches the definition of the struct. Yeah, it seems unlikely that exceeding the 1000-byte target, even by quite a lot, would cause any performance problem on most platforms --- a quick check says that the MTU on the loopback interface is near 16K on the machines I have handy. So that dampens my enthusiasm for making invasive code changes to ensure we meet the target exactly. Still, the next oversight of this sort might introduce a larger error. Does anyone have any other ideas about making this coding more robust? regards, tom lane
I don't care for the places in the code that say things like
foo - sizeof(int)
where "int" refers implicitly to a specific variable or struct field, but
you have to remember that and change it by hand if you change the
type of the variable or struct.
But this sort of code is quite common in postgres, and not
at all unique to src/include/pgstat.h. I did not try to tackle
that project-wide, as it would turn a one-line patch (which has
a good chance of being accepted) into a huge patch that
would likely be rejected.
On the other hand, if the community feels this kind of code
cleanup is needed, I might jump in.....
Mark Dilger
foo - sizeof(int)
where "int" refers implicitly to a specific variable or struct field, but
you have to remember that and change it by hand if you change the
type of the variable or struct.
But this sort of code is quite common in postgres, and not
at all unique to src/include/pgstat.h. I did not try to tackle
that project-wide, as it would turn a one-line patch (which has
a good chance of being accepted) into a huge patch that
would likely be rejected.
On the other hand, if the community feels this kind of code
cleanup is needed, I might jump in.....
Mark Dilger
On Tuesday, December 31, 2013 9:21 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mark Dilger <markdilger@yahoo.com> writes:
> In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro
> attempts to subtract off the size of the PgStat_MsgTabstat
> struct up to the m_entry[] field. This macro was correct up
> until the fields m_block_read_time and m_block_write_time
> were added to that struct, as the macro was not changed to
> include their size.
Yeah, that's a bug.
> This patch fixes the macro.
I'm inclined to think that we should look for a less breakable way to
manage the message size limit; if Robert missed this issue in that patch,
other people are going to miss it in future patches. The existing coding
isn't really right anyway when you consider possible alignment padding.
(I think the present struct definitions probably don't involve any
padding, but that's not a very future-proof assumption either.) It'd be
better to somehow drive the calculation from offsetof(m_entry). I'm not
seeing any way to do that that doesn't require some notational changes
in the code, though. One way would be to rejigger it like this:
#define PGSTAT_MAX_MSG_SIZE 1000
typedef union PgStat_MsgTabstat
{
struct {
PgStat_MsgHdr hdr;
Oid databaseid;
int nentries;
int xact_commit;
int xact_rollback;
PgStat_Counter block_read_time; /* times in microseconds */
PgStat_Counter block_write_time;
PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER];
} m;
char padding[PGSTAT_MAX_MSG_SIZE]; /* pad sizeof this union to target */
} PgStat_MsgTabstat;
#define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry))
but then we'd have to run around and replace "m_hdr" with "m.hdr" etc
in the code referencing the message types that use this mechanism.
Kind of annoying.
> As an aside, the PGSTAT_MSG_PAYLOAD define from which
> these field sizes are being subtracted is a bit of a WAG, if
> I am reading the code correctly, in which case whether the
> two additional fields are subtracted from that WAG is perhaps
> not critical. But the code is certainly easier to understand if
> the macro matches the definition of the struct.
Yeah, it seems unlikely that exceeding the 1000-byte target, even by quite
a lot, would cause any performance problem on most platforms --- a quick
check says that the MTU on the loopback interface is near 16K on the
machines I have handy. So that dampens my enthusiasm for making invasive
code changes to ensure we meet the target exactly. Still, the next
oversight of this sort might introduce a larger error.
Does anyone have any other ideas about making this coding more robust?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
> In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro
> attempts to subtract off the size of the PgStat_MsgTabstat
> struct up to the m_entry[] field. This macro was correct up
> until the fields m_block_read_time and m_block_write_time
> were added to that struct, as the macro was not changed to
> include their size.
Yeah, that's a bug.
> This patch fixes the macro.
I'm inclined to think that we should look for a less breakable way to
manage the message size limit; if Robert missed this issue in that patch,
other people are going to miss it in future patches. The existing coding
isn't really right anyway when you consider possible alignment padding.
(I think the present struct definitions probably don't involve any
padding, but that's not a very future-proof assumption either.) It'd be
better to somehow drive the calculation from offsetof(m_entry). I'm not
seeing any way to do that that doesn't require some notational changes
in the code, though. One way would be to rejigger it like this:
#define PGSTAT_MAX_MSG_SIZE 1000
typedef union PgStat_MsgTabstat
{
struct {
PgStat_MsgHdr hdr;
Oid databaseid;
int nentries;
int xact_commit;
int xact_rollback;
PgStat_Counter block_read_time; /* times in microseconds */
PgStat_Counter block_write_time;
PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER];
} m;
char padding[PGSTAT_MAX_MSG_SIZE]; /* pad sizeof this union to target */
} PgStat_MsgTabstat;
#define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry))
but then we'd have to run around and replace "m_hdr" with "m.hdr" etc
in the code referencing the message types that use this mechanism.
Kind of annoying.
> As an aside, the PGSTAT_MSG_PAYLOAD define from which
> these field sizes are being subtracted is a bit of a WAG, if
> I am reading the code correctly, in which case whether the
> two additional fields are subtracted from that WAG is perhaps
> not critical. But the code is certainly easier to understand if
> the macro matches the definition of the struct.
Yeah, it seems unlikely that exceeding the 1000-byte target, even by quite
a lot, would cause any performance problem on most platforms --- a quick
check says that the MTU on the loopback interface is near 16K on the
machines I have handy. So that dampens my enthusiasm for making invasive
code changes to ensure we meet the target exactly. Still, the next
oversight of this sort might introduce a larger error.
Does anyone have any other ideas about making this coding more robust?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Mark Dilger <markdilger@yahoo.com> writes: > I don't care for the places in the code that say things like > ��� foo - sizeof(int) > where "int" refers implicitly to a specific variable or struct field, but > you have to remember that and change it by hand if you change the > type of the variable or struct. > But this sort of code is quite common in postgres, and not > at all unique to src/include/pgstat.h. Really? I think we're using offsetof for this type of thing in most places. regards, tom lane
A quick grep through the code reveals lots of examples,
so I'll just paste the first ones I notice. There are
references to sizeof(Oid), sizeof(uint32), sizeof(bool),
and sizeof(uint8) that clearly refer to fields in structs that
the macros refer to implicitly, but there is no way for the
compiler to detect if you change the struct but not the
macro.
I see these as similar to what I was talking about in
src/include/pgstat.h.
Mark Dilger
src/include/pg_attribute.h:
------------------------------
#define ATTRIBUTE_FIXED_PART_SIZE \
(offsetof(FormData_pg_attribute,attcollation) + sizeof(Oid))
src/include/access/tuptoaster.h:
-------------------------------------
#define MaximumBytesPerTuple(tuplesPerPage) \
MAXALIGN_DOWN((BLCKSZ - \
MAXALIGN(SizeOfPageHeaderData + (tuplesPerPage) * sizeof(ItemIdData))) \
/ (tuplesPerPage))
#define TOAST_MAX_CHUNK_SIZE \
(EXTERN_TUPLE_MAX_SIZE - \
MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) - \
sizeof(Oid) - \
sizeof(int32) - \
VARHDRSZ)
src/include/access/htup.h:
------------------------------
#define HeapTupleHeaderGetOid(tup) \
( \
((tup)->t_infomask & HEAP_HASOID) ? \
*((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) \
: \
InvalidOid \
)
#define HeapTupleHeaderSetOid(tup, oid) \
do { \
Assert((tup)->t_infomask & HEAP_HASOID); \
*((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) = (oid); \
} while (0)
#define MINIMAL_TUPLE_OFFSET \
((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) / MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_PADDING \
((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) % MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_DATA_OFFSET \
offsetof(MinimalTupleData, t_infomask2)
#define SizeOfHeapDelete (offsetof(xl_heap_delete, all_visible_cleared) + sizeof(bool))
#define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))
so I'll just paste the first ones I notice. There are
references to sizeof(Oid), sizeof(uint32), sizeof(bool),
and sizeof(uint8) that clearly refer to fields in structs that
the macros refer to implicitly, but there is no way for the
compiler to detect if you change the struct but not the
macro.
I see these as similar to what I was talking about in
src/include/pgstat.h.
Mark Dilger
src/include/pg_attribute.h:
------------------------------
#define ATTRIBUTE_FIXED_PART_SIZE \
(offsetof(FormData_pg_attribute,attcollation) + sizeof(Oid))
src/include/access/tuptoaster.h:
-------------------------------------
#define MaximumBytesPerTuple(tuplesPerPage) \
MAXALIGN_DOWN((BLCKSZ - \
MAXALIGN(SizeOfPageHeaderData + (tuplesPerPage) * sizeof(ItemIdData))) \
/ (tuplesPerPage))
#define TOAST_MAX_CHUNK_SIZE \
(EXTERN_TUPLE_MAX_SIZE - \
MAXALIGN(offsetof(HeapTupleHeaderData, t_bits)) - \
sizeof(Oid) - \
sizeof(int32) - \
VARHDRSZ)
src/include/access/htup.h:
------------------------------
#define HeapTupleHeaderGetOid(tup) \
( \
((tup)->t_infomask & HEAP_HASOID) ? \
*((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) \
: \
InvalidOid \
)
#define HeapTupleHeaderSetOid(tup, oid) \
do { \
Assert((tup)->t_infomask & HEAP_HASOID); \
*((Oid *) ((char *)(tup) + (tup)->t_hoff - sizeof(Oid))) = (oid); \
} while (0)
#define MINIMAL_TUPLE_OFFSET \
((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) / MAXIMUM_ALIGNOF * MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_PADDING \
((offsetof(HeapTupleHeaderData, t_infomask2) - sizeof(uint32)) % MAXIMUM_ALIGNOF)
#define MINIMAL_TUPLE_DATA_OFFSET \
offsetof(MinimalTupleData, t_infomask2)
#define SizeOfHeapDelete (offsetof(xl_heap_delete, all_visible_cleared) + sizeof(bool))
#define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))
On Tuesday, December 31, 2013 10:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mark Dilger <markdilger@yahoo.com> writes:
> I don't care for the places in the code that say things like
> foo - sizeof(int)
> where "int" refers implicitly to a specific variable or struct field, but
> you have to remember that and change it by hand if you change the
> type of the variable or struct.
> But this sort of code is quite common in postgres, and not
> at all unique to src/include/pgstat.h.
Really? I think we're using offsetof for this type of thing in most
places.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
> I don't care for the places in the code that say things like
> foo - sizeof(int)
> where "int" refers implicitly to a specific variable or struct field, but
> you have to remember that and change it by hand if you change the
> type of the variable or struct.
> But this sort of code is quite common in postgres, and not
> at all unique to src/include/pgstat.h.
Really? I think we're using offsetof for this type of thing in most
places.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Dec 31, 2013 at 12:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Mark Dilger <markdilger@yahoo.com> writes: >> In src/include/pgstat.h, the PGSTAT_NUM_TABENTRIES macro >> attempts to subtract off the size of the PgStat_MsgTabstat >> struct up to the m_entry[] field. This macro was correct up >> until the fields m_block_read_time and m_block_write_time >> were added to that struct, as the macro was not changed to >> include their size. > > Yeah, that's a bug. Ick. >> This patch fixes the macro. > > I'm inclined to think that we should look for a less breakable way to > manage the message size limit; if Robert missed this issue in that patch, > other people are going to miss it in future patches. The existing coding > isn't really right anyway when you consider possible alignment padding. > (I think the present struct definitions probably don't involve any > padding, but that's not a very future-proof assumption either.) It'd be > better to somehow drive the calculation from offsetof(m_entry). I'm not > seeing any way to do that that doesn't require some notational changes > in the code, though. One way would be to rejigger it like this: > > #define PGSTAT_MAX_MSG_SIZE 1000 > > typedef union PgStat_MsgTabstat > { > struct { > PgStat_MsgHdr hdr; > Oid databaseid; > int nentries; > int xact_commit; > int xact_rollback; > PgStat_Counter block_read_time; /* times in microseconds */ > PgStat_Counter block_write_time; > PgStat_TableEntry entry[FLEXIBLE_ARRAY_MEMBER]; > } m; > char padding[PGSTAT_MAX_MSG_SIZE]; /* pad sizeof this union to target */ > } PgStat_MsgTabstat; > > #define PGSTAT_NUM_TABENTRIES ((PGSTAT_MAX_MSG_SIZE - offsetof(PgStat_MsgTabstat, m.entry)) / sizeof(PgStat_TableEntry)) > > but then we'd have to run around and replace "m_hdr" with "m.hdr" etc > in the code referencing the message types that use this mechanism. > Kind of annoying. Rather than using a union, I'd be inclined to declare one type that's just the header (PgStat_MsgTabstat_Hdr), and then work out PGSTAT_NUM_TABENTRIES based on sizeof(PgStat_MsgTabstat_Hdr), and then declare PgStat_MsgTabstat as a two-element struct, the header struct followed by an array of entries. That is indeed a bit of notational churn but it seems worth it to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Dec 31, 2013 at 12:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> I'm inclined to think that we should look for a less breakable way to >> manage the message size limit; if Robert missed this issue in that patch, >> other people are going to miss it in future patches. The existing coding >> isn't really right anyway when you consider possible alignment padding. >> (I think the present struct definitions probably don't involve any >> padding, but that's not a very future-proof assumption either.) It'd be >> better to somehow drive the calculation from offsetof(m_entry). I'm not >> seeing any way to do that that doesn't require some notational changes >> in the code, though. One way would be to rejigger it like this: >> ... > Rather than using a union, I'd be inclined to declare one type that's > just the header (PgStat_MsgTabstat_Hdr), and then work out > PGSTAT_NUM_TABENTRIES based on sizeof(PgStat_MsgTabstat_Hdr), and then > declare PgStat_MsgTabstat as a two-element struct, the header struct > followed by an array of entries. That is indeed a bit of notational > churn but it seems worth it to me. That approach would fail to account for any alignment padding occurring just before the m_entry array, though. That could happen if the array members contain some 8-byte fields while there are none in the header part. I think it would net out to the same amount of notational change in the code proper as my approach, since either way you end up with a nested struct containing the header fields. It occurs to me that, rather than trying to improve the struct definition methodology, maybe we should just add static asserts to catch any inconsistency here. It wouldn't be that hard: #define PGSTAT_MAX_MSG_SIZE 1000 #define PGSTAT_MSG_PAYLOAD (PGSTAT_MAX_MSG_SIZE - sizeof(PgStat_MsgHdr)) ... all else in pgstat.h as before ... StaticAssertStmt(sizeof(PgStat_MsgTabstat) <= PGSTAT_MAX_MSG_SIZE, 'bad PgStat_MsgTabstat size'); ... and similarly for other pgstat message structs ... This would possibly lead to machine-dependent results if alignment comes into play, but it's reasonable to suppose that we'd find out about any mistakes as soon as they hit the buildfarm. So this might be an acceptable tradeoff for not having to change source code. regards, tom lane
I wrote: > It occurs to me that, rather than trying to improve the struct definition > methodology, maybe we should just add static asserts to catch any > inconsistency here. It wouldn't be that hard: > #define PGSTAT_MAX_MSG_SIZE 1000 > #define PGSTAT_MSG_PAYLOAD (PGSTAT_MAX_MSG_SIZE - sizeof(PgStat_MsgHdr)) > ... all else in pgstat.h as before ... > StaticAssertStmt(sizeof(PgStat_MsgTabstat) <= PGSTAT_MAX_MSG_SIZE, > 'bad PgStat_MsgTabstat size'); > ... and similarly for other pgstat message structs ... After further thought it seems to me that this is a desirable approach, because it is a direct check of the property we want, and will complain about *any* mistake that results in too-large struct sizes. The other ideas that were kicked around upthread still left a lot of ways to mess up the array size calculations, for instance referencing the wrong array element type in the sizeof calculation. So unless anyone has a concrete objection, I'll go put in something like this along with Mark's fix. regards, tom lane
<div style="color:#000; background-color:#fff; font-family:HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande,sans-serif;font-size:12pt">I still don't understand why this case in src/include/pgstat.h<br />is different from caseselsewhere in the code. Taken from<br />src/include/access/heapam_xlog.h:<br /><br /><br />typedef struct xl_heap_header<br/>{<br /> uint16 t_infomask2;<br /> uint16 t_infomask;<br /> uint8 t_hoff;<br />}xl_heap_header;<br /><br />#define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))<br /><br /><br/><br />Now, if somebody changed t_hoff to be a uint16, that SizeOfHeapHeader<br />macro would be wrong. Should weput a static assert in the code for that?<br />I have no objection, but it seems you like the static assert in one place<br/>and not the other, and (perhaps due to some incredible ignorance on my<br />part) I cannot see why. I tried lookingfor an assert of this kind already in<br />the code. The use of this macro is in src/backend/access/heap/heapam.c,<br/>but I didn't see any asserts for it, though there are lots of asserts for other<br/>stuff. Maybe I just didn't recognize it?<br /><br /><br />mark<br /><div class="yahoo_quoted" style="display:block;"><br /><br /><div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande,sans-serif; font-size: 12pt;"><div style="font-family: HelveticaNeue, Helvetica Neue, Helvetica, Arial, Lucida Grande,sans-serif; font-size: 12pt;"><div dir="ltr"><font face="Arial" size="2"> On Thursday, January 2, 2014 2:05 PM, TomLane <tgl@sss.pgh.pa.us> wrote:<br /></font></div><div class="y_msg_container">I wrote:<br clear="none" />> Itoccurs to me that, rather than trying to improve the struct definition<br clear="none" />> methodology, maybe we shouldjust add static asserts to catch any<br clear="none" />> inconsistency here. It wouldn't be that hard:<br clear="none"/><br clear="none" />> #define PGSTAT_MAX_MSG_SIZE 1000<br clear="none" />> #define PGSTAT_MSG_PAYLOAD (PGSTAT_MAX_MSG_SIZE - sizeof(PgStat_MsgHdr))<br clear="none" />> ... all else in pgstat.h as before...<br clear="none" /><br clear="none" />> StaticAssertStmt(sizeof(PgStat_MsgTabstat) <= PGSTAT_MAX_MSG_SIZE,<brclear="none" />> 'bad PgStat_MsgTabstat size');<br clear="none" />> ... and similarlyfor other pgstat message structs ...<br clear="none" /><br clear="none" />After further thought it seems to me thatthis is a desirable approach,<br clear="none" />because it is a direct check of the property we want, and will complain<brclear="none" />about *any* mistake that results in too-large struct sizes. The other<br clear="none" />ideasthat were kicked around upthread still left a lot of ways to mess up<br clear="none" />the array size calculations,for instance referencing the wrong array<br clear="none" />element type in the sizeof calculation. So unlessanyone has a concrete<br clear="none" />objection, I'll go put in something like this along with Mark's fix.<div class="yqt0876931955"id="yqtfd25444"><br clear="none" /><br clear="none" /> regards, tom lane<br clear="none"/><br clear="none" /><br clear="none" />-- <br clear="none" />Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org"shape="rect" ymailto="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<brclear="none" />To make changes to yoursubscription:<br clear="none" /><a href="http://www.postgresql.org/mailpref/pgsql-hackers" shape="rect" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><brclear="none" /></div><br /><br /></div></div></div></div></div>
On 2014-01-02 15:15:58 -0800, Mark Dilger wrote: > I still don't understand why this case in src/include/pgstat.h > is different from cases elsewhere in the code. Taken from > src/include/access/heapam_xlog.h: > > > typedef struct xl_heap_header > { > uint16 t_infomask2; > uint16 t_infomask; > uint8 t_hoff; > } xl_heap_header; > > #define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8)) > > > > Now, if somebody changed t_hoff to be a uint16, that SizeOfHeapHeader > macro would be wrong. Should we put a static assert in the code for that? The reason the various SizeOfHeapHeader are written that way is that we do not want to uselessly store trailing padding in the WAL. E.g. sizeof(xl_heap_header) will be 6bytes, but SizeOfHeapHeader will be 5. I don't see an easy way to guarantee this with asserts and I think you'd notice pretty fast if you got things wrong there because WAL replay will just have incomplete data. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
Mark Dilger <markdilger@yahoo.com> writes: > I still don't understand why this case in src/include/pgstat.h > is different from cases elsewhere in the code. The reason why I'm exercised about it is that (a) somebody actually made a mistake of this type, and (b) it wasn't caught by any automated testing. The catalog and WAL-related examples you cite would probably crash and burn in fairly obvious ways if somebody broke them --- for instance, the most likely way to break SizeOfHeapHeader would be by adding another field after t_hoff, but we'd notice that before long because said field would be corrupted on arrival at a replication slave. In contrast, messing up the pgstats message sizes would have no consequences worse than a hard-to-detect, and probably platform-specific, performance penalty for stats transmission. So unless we think that's of absolutely zero concern, adding a mechanism to make such bugs more apparent seems useful. I'm not against adding more assertions elsewhere, but it's a bit hard to see what those asserts should test. I don't see any practical way to assert that field X is the last one in its struct, for instance. regards, tom lane
I completely understand the padding issues that
you are dealing with. I was mostly curious why
Tom wanted to use asserts to double-check the
code in one place, while happily not doing so in
what seemed the same kind of situation elsewhere.
He has since made the reason for that clear.
you are dealing with. I was mostly curious why
Tom wanted to use asserts to double-check the
code in one place, while happily not doing so in
what seemed the same kind of situation elsewhere.
He has since made the reason for that clear.
On Thursday, January 2, 2014 3:27 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-01-02 15:15:58 -0800, Mark Dilger wrote:
> I still don't understand why this case in src/include/pgstat.h
> is different from cases elsewhere in the code. Taken from
> src/include/access/heapam_xlog.h:
>
>
> typedef struct xl_heap_header
> {
> uint16 t_infomask2;
> uint16 t_infomask;
> uint8 t_hoff;
> } xl_heap_header;
>
> #define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))
>
>
>
> Now, if somebody changed t_hoff to be a uint16, that SizeOfHeapHeader
> macro would be wrong. Should we put a static assert in the code for that?
The reason the various SizeOfHeapHeader are written that way is that we
do not want to uselessly store trailing padding in the
WAL. E.g. sizeof(xl_heap_header) will be 6bytes, but SizeOfHeapHeader
will be 5.
I don't see an easy way to guarantee this with asserts and I think you'd
notice pretty fast if you got things wrong there because WAL replay will
just have incomplete data.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
> I still don't understand why this case in src/include/pgstat.h
> is different from cases elsewhere in the code. Taken from
> src/include/access/heapam_xlog.h:
>
>
> typedef struct xl_heap_header
> {
> uint16 t_infomask2;
> uint16 t_infomask;
> uint8 t_hoff;
> } xl_heap_header;
>
> #define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(uint8))
>
>
>
> Now, if somebody changed t_hoff to be a uint16, that SizeOfHeapHeader
> macro would be wrong. Should we put a static assert in the code for that?
The reason the various SizeOfHeapHeader are written that way is that we
do not want to uselessly store trailing padding in the
WAL. E.g. sizeof(xl_heap_header) will be 6bytes, but SizeOfHeapHeader
will be 5.
I don't see an easy way to guarantee this with asserts and I think you'd
notice pretty fast if you got things wrong there because WAL replay will
just have incomplete data.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
The mechanism that occurs to me (and I'm not wedded to
this idea) is:
typedef uint8 T_HOFF_TYPE;
typedef struct xl_heap_header
{
uint16 t_infomask2;
uint16 t_infomask;
T_HOFF_TYPE t_hoff;
} xl_heap_header;
#define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(T_HOFF_TYPE))
this idea) is:
typedef uint8 T_HOFF_TYPE;
typedef struct xl_heap_header
{
uint16 t_infomask2;
uint16 t_infomask;
T_HOFF_TYPE t_hoff;
} xl_heap_header;
#define SizeOfHeapHeader (offsetof(xl_heap_header, t_hoff) + sizeof(T_HOFF_TYPE))
On Thursday, January 2, 2014 3:39 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Mark Dilger <markdilger@yahoo.com> writes:
> I still don't understand why this case in src/include/pgstat.h
> is different from cases elsewhere in the code.
The reason why I'm exercised about it is that (a) somebody actually made a
mistake of this type, and (b) it wasn't caught by any automated testing.
The catalog and WAL-related examples you cite would probably crash
and burn in fairly obvious ways if somebody broke them --- for instance,
the most likely way to break SizeOfHeapHeader would be by adding another
field after t_hoff, but we'd notice that before long because said field
would be corrupted on arrival at a replication slave.
In contrast, messing up the pgstats message sizes would have no
consequences worse than a hard-to-detect, and probably platform-specific,
performance penalty for stats transmission. So unless we think that's
of absolutely zero concern, adding a mechanism to make such bugs more
apparent seems useful.
I'm not against adding more assertions elsewhere, but it's a bit hard to
see what those asserts should test. I don't see any practical way to
assert that field X is the last one in its struct, for instance.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
> I still don't understand why this case in src/include/pgstat.h
> is different from cases elsewhere in the code.
The reason why I'm exercised about it is that (a) somebody actually made a
mistake of this type, and (b) it wasn't caught by any automated testing.
The catalog and WAL-related examples you cite would probably crash
and burn in fairly obvious ways if somebody broke them --- for instance,
the most likely way to break SizeOfHeapHeader would be by adding another
field after t_hoff, but we'd notice that before long because said field
would be corrupted on arrival at a replication slave.
In contrast, messing up the pgstats message sizes would have no
consequences worse than a hard-to-detect, and probably platform-specific,
performance penalty for stats transmission. So unless we think that's
of absolutely zero concern, adding a mechanism to make such bugs more
apparent seems useful.
I'm not against adding more assertions elsewhere, but it's a bit hard to
see what those asserts should test. I don't see any practical way to
assert that field X is the last one in its struct, for instance.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Mark Dilger <markdilger@yahoo.com> writes: > The mechanism that occurs to me (and I'm not wedded to > this idea) is: > typedef uint8 T_HOFF_TYPE; > typedef struct xl_heap_header > { > ������� uint16��������� t_infomask2; > ������� uint16��������� t_infomask; > ������� T_HOFF_TYPE������������ t_hoff; > } xl_heap_header; > #define SizeOfHeapHeader������� (offsetof(xl_heap_header, t_hoff) + sizeof(T_HOFF_TYPE)) Meh. That does nothing for the "add a field in the wrong place" risk. Yes, it would prevent people from changing the type of t_hoff without updating the macro --- but I'm not convinced that defending against that alone is worth any notational pain. If you're changing t_hoff's type without looking fairly closely at every reference to t_hoff, you're practicing unsafe programming to begin with. I wonder though if we could invent a macro that produces the sizeof a struct field, and then use that in macros like this. Something like #define field_sizeof(typename, fieldname) \sizeof(((typename *) NULL)->fieldname) Compare the default definition of offsetof ... regards, tom lane