Thread: Change xl_hash_vacuum_one_page.ntuples from int to uint16
Hi hackers, While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int. Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for gistxlogDelete.ntodelete for example). Please find attached a patch proposal to do so. While that does not currently change the struct size: No patch: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 4 */ int ntuples; /* total size (bytes): 8 */ } With patch: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 2 */ uint16 ntuples; /* XXX 2-byte padding */ /* total size (bytes): 8 */ } It could reduce it when adding new fields (like this is is done in [1]). We would get: No patch: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 4 */ int ntuples; /* 8 | 1 */ _Bool isCatalogRel; /* XXX 3-byte padding */ /* total size (bytes): 12 */ } With patch: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 2 */ uint16 ntuples; /* 6 | 1 */ _Bool isCatalogRel; /* XXX 1-byte padding */ /* total size (bytes): 8 */ } Means saving 4 bytes in that case. Looking forward to your feedback, Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Tue, Jan 10, 2023 at 11:08:33AM +0100, Drouvot, Bertrand wrote: > While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int. > > Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for > gistxlogDelete.ntodelete for example). I think that is correct. This value is determined by looping through offsets, which are uint16 as well. Should we also change the related variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16? -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Hi, On 1/20/23 9:01 PM, Nathan Bossart wrote: > On Tue, Jan 10, 2023 at 11:08:33AM +0100, Drouvot, Bertrand wrote: >> While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int. >> >> Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for >> gistxlogDelete.ntodelete for example). > > I think that is correct. This value is determined by looping through > offsets, which are uint16 as well. Thanks for the review! > Should we also change the related > variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16? > Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete (uint16) forexample). So, what do you think about: 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size increase) and 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs (related toXLOG records) fields when they are not? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote: > On 1/20/23 9:01 PM, Nathan Bossart wrote: >> Should we also change the related >> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16? > > Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same > kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete (uint16)for example). > > So, what do you think about: > > 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size increase) > and > 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs (relatedto XLOG records) fields when they are not? Okay. I've marked this one as ready-for-committer, then. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
On Wed, Feb 15, 2023 at 3:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote: > > On 1/20/23 9:01 PM, Nathan Bossart wrote: > >> Should we also change the related > >> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16? > > > > Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same > > kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete (uint16)for example). > > > > So, what do you think about: > > > > 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size increase) > > and > > 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs (relatedto XLOG records) fields when they are not? > > Okay. I've marked this one as ready-for-committer, then. > LGTM. I think the padding space we are trying to save here can be used for the patch [1], right? BTW, feel free to create the second patch (to align the types for variables/arguments) as that would be really helpful after we commit this one. I think this would require XLOG_PAGE_MAGIC as it changes the WAL record. BTW, how about a commit message like: Change xl_hash_vacuum_one_page.ntuples from int to uint16. This will create two bytes of padding space in xl_hash_vacuum_one_page which can be used for future patches. This makes the datatype of xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which is advisable as both are used for the same purpose. [1] - https://www.postgresql.org/message-id/2d62f212-fce6-d639-b9eb-2a5bc4bec3b4%40gmail.com -- With Regards, Amit Kapila.
Hi, On 2/16/23 12:00 PM, Amit Kapila wrote: > On Wed, Feb 15, 2023 at 3:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >> >> On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote: >>> On 1/20/23 9:01 PM, Nathan Bossart wrote: >>>> Should we also change the related >>>> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16? >>> >>> Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same >>> kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete (uint16)for example). >>> >>> So, what do you think about: >>> >>> 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size increase) >>> and >>> 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs (relatedto XLOG records) fields when they are not? >> >> Okay. I've marked this one as ready-for-committer, then. >> > > LGTM. Thanks for looking at it! > I think the padding space we are trying to save here can be used > for the patch [1], right? Yes exactly, without the current patch and adding isCatalogRel (from the patch [1] you mentioned) we would get: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 4 */ int ntuples; /* 8 | 1 */ _Bool isCatalogRel; /* XXX 3-byte padding */ /* total size (bytes): 12 */ } While with the proposed patch: (gdb) ptype /o struct xl_hash_vacuum_one_page /* offset | size */ type = struct xl_hash_vacuum_one_page { /* 0 | 4 */ TransactionId snapshotConflictHorizon; /* 4 | 2 */ uint16 ntuples; /* 6 | 1 */ _Bool isCatalogRel; /* XXX 1-byte padding */ /* total size (bytes): 8 */ } > BTW, feel free to create the second patch > (to align the types for variables/arguments) as that would be really > helpful after we commit this one. > Yes, will do. > I think this would require XLOG_PAGE_MAGIC as it changes the WAL record. > Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical decoding on standby patch as it adds the new field mentioned above). > BTW, how about a commit message like: > Change xl_hash_vacuum_one_page.ntuples from int to uint16. > > This will create two bytes of padding space in xl_hash_vacuum_one_page > which can be used for future patches. This makes the datatype of > xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which > is advisable as both are used for the same purpose. > LGTM, will add it to V2! Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Hi, On 2/16/23 1:26 PM, Drouvot, Bertrand wrote: > Hi, > > On 2/16/23 12:00 PM, Amit Kapila wrote: >> I think this would require XLOG_PAGE_MAGIC as it changes the WAL record. >> > > Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical > decoding on standby patch as it adds the new field mentioned above). > >> BTW, how about a commit message like: >> Change xl_hash_vacuum_one_page.ntuples from int to uint16. >> >> This will create two bytes of padding space in xl_hash_vacuum_one_page >> which can be used for future patches. This makes the datatype of >> xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which >> is advisable as both are used for the same purpose. >> > > LGTM, will add it to V2! > Please find V2 attached. The commit message also mention the XLOG_PAGE_MAGIC bump. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Feb 16, 2023 at 8:39 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > On 2/16/23 1:26 PM, Drouvot, Bertrand wrote: > > Hi, > > > > On 2/16/23 12:00 PM, Amit Kapila wrote: > >> I think this would require XLOG_PAGE_MAGIC as it changes the WAL record. > >> > > > > Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical > > decoding on standby patch as it adds the new field mentioned above). > > > >> BTW, how about a commit message like: > >> Change xl_hash_vacuum_one_page.ntuples from int to uint16. > >> > >> This will create two bytes of padding space in xl_hash_vacuum_one_page > >> which can be used for future patches. This makes the datatype of > >> xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which > >> is advisable as both are used for the same purpose. > >> > > > > LGTM, will add it to V2! > > > Please find V2 attached. > The commit message also mention the XLOG_PAGE_MAGIC bump. > Thanks, I was not completely sure about whether we need to bump XLOG_PAGE_MAGIC for this patch as this makes the additional space just by changing the datatype of one of the members of the existing WAL record. We normally change it for the addition/removal of new fields aka change in WAL record format, or addition of a new type of WAL record. Does anyone else have an opinion/suggestion on this matter? -- With Regards, Amit Kapila.
Hi On 2023-02-17 08:30:09 +0530, Amit Kapila wrote: > Thanks, I was not completely sure about whether we need to bump > XLOG_PAGE_MAGIC for this patch as this makes the additional space just > by changing the datatype of one of the members of the existing WAL > record. We normally change it for the addition/removal of new fields > aka change in WAL record format, or addition of a new type of WAL > record. Does anyone else have an opinion/suggestion on this matter? I'd definitely change it - the width of a field still means you can't really parse the old WAL sensibly anymore. Greetings, Andres Freund
Hi, On 2/16/23 1:26 PM, Drouvot, Bertrand wrote: > Hi, > > On 2/16/23 12:00 PM, Amit Kapila wrote: > >> BTW, feel free to create the second patch >> (to align the types for variables/arguments) as that would be really >> helpful after we commit this one. Please find attached a patch proposal to do so. It looks like a Pandora's box as it produces those cascading changes: _hash_vacuum_one_page index_compute_xid_horizon_for_tuples gistprunepage PageIndexMultiDelete gistXLogDelete PageIndexMultiDelete spgRedoVacuumRedirect vacuumRedirectAndPlaceholder spgPageIndexMultiDelete moveLeafs doPickSplit _bt_delitems_vacuum btvacuumpage _bt_delitems_delete _bt_delitems_delete_check hash_xlog_move_page_contents gistvacuumpage gistXLogUpdate gistplacetopage hashbucketcleanup Which makes me: - wonder it is not too intrusive (we could reduce the scope and keep the PageIndexMultiDelete()'s nitems argument as an int for example). - worry if there is no side effects (like the one I'm mentioning as a comment in PageIndexMultiDelete()) even if it passes the CI tests. - wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given the fact that the maximum block size is 32 KB. I'm sharing this version but I still need to think about it and I'm curious about your thoughts too. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Feb 17, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote: > > On 2023-02-17 08:30:09 +0530, Amit Kapila wrote: > > Thanks, I was not completely sure about whether we need to bump > > XLOG_PAGE_MAGIC for this patch as this makes the additional space just > > by changing the datatype of one of the members of the existing WAL > > record. We normally change it for the addition/removal of new fields > > aka change in WAL record format, or addition of a new type of WAL > > record. Does anyone else have an opinion/suggestion on this matter? > > I'd definitely change it - the width of a field still means you can't really > parse the old WAL sensibly anymore. > Okay, thanks for your input. I'll push this patch early next week. -- With Regards, Amit Kapila.
On Fri, Feb 17, 2023 at 7:44 PM Drouvot, Bertrand <bertranddrouvot.pg@gmail.com> wrote: > > On 2/16/23 1:26 PM, Drouvot, Bertrand wrote: > > Hi, > > > > On 2/16/23 12:00 PM, Amit Kapila wrote: > > > >> BTW, feel free to create the second patch > >> (to align the types for variables/arguments) as that would be really > >> helpful after we commit this one. > Pushed the first patch. > Please find attached a patch proposal to do so. > > It looks like a Pandora's box as it produces > those cascading changes: > > _hash_vacuum_one_page > index_compute_xid_horizon_for_tuples > gistprunepage > PageIndexMultiDelete > gistXLogDelete > PageIndexMultiDelete > spgRedoVacuumRedirect > vacuumRedirectAndPlaceholder > spgPageIndexMultiDelete > moveLeafs > doPickSplit > _bt_delitems_vacuum > btvacuumpage > _bt_delitems_delete > _bt_delitems_delete_check > hash_xlog_move_page_contents > gistvacuumpage > gistXLogUpdate > gistplacetopage > hashbucketcleanup > > > Which makes me: > > - wonder it is not too intrusive (we could reduce the scope and keep the > PageIndexMultiDelete()'s nitems argument as an int for example). > > - worry if there is no side effects (like the one I'm mentioning as a comment > in PageIndexMultiDelete()) even if it passes the CI tests. > > - wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given > the fact that the maximum block size is 32 KB. > > I'm sharing this version but I still need to think about it and > I'm curious about your thoughts too. > @@ -591,11 +591,11 @@ hash_xlog_move_page_contents(XLogReaderState *record) if (len > 0) { - OffsetNumber *unused; - OffsetNumber *unend; + uint16 *unused; + uint16 *unend; - unused = (OffsetNumber *) ptr; - unend = (OffsetNumber *) ((char *) ptr + len); + unused = (uint16 *) ptr; + unend = (uint16 *) ((char *) ptr + len); It doesn't seem useful to me to make such changes. About other changes in the second patch, it is not clear whether there is much value addition by those even though I don't see anything wrong with them. So, let's see if Nathan or others see the value in the proposed patch or any subset of these changes. -- With Regards, Amit Kapila.
Hi, On 2/27/23 6:27 AM, Amit Kapila wrote: > On Fri, Feb 17, 2023 at 7:44 PM Drouvot, Bertrand > <bertranddrouvot.pg@gmail.com> wrote: >> >> On 2/16/23 1:26 PM, Drouvot, Bertrand wrote: >>> Hi, >>> >>> On 2/16/23 12:00 PM, Amit Kapila wrote: >>> >>>> BTW, feel free to create the second patch >>>> (to align the types for variables/arguments) as that would be really >>>> helpful after we commit this one. >> > > Pushed the first patch. Thanks! > >> Please find attached a patch proposal to do so. >> >> It looks like a Pandora's box as it produces >> those cascading changes: >> >> _hash_vacuum_one_page >> index_compute_xid_horizon_for_tuples >> gistprunepage >> PageIndexMultiDelete >> gistXLogDelete >> PageIndexMultiDelete >> spgRedoVacuumRedirect >> vacuumRedirectAndPlaceholder >> spgPageIndexMultiDelete >> moveLeafs >> doPickSplit >> _bt_delitems_vacuum >> btvacuumpage >> _bt_delitems_delete >> _bt_delitems_delete_check >> hash_xlog_move_page_contents >> gistvacuumpage >> gistXLogUpdate >> gistplacetopage >> hashbucketcleanup >> >> >> Which makes me: >> >> - wonder it is not too intrusive (we could reduce the scope and keep the >> PageIndexMultiDelete()'s nitems argument as an int for example). >> >> - worry if there is no side effects (like the one I'm mentioning as a comment >> in PageIndexMultiDelete()) even if it passes the CI tests. >> >> - wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given >> the fact that the maximum block size is 32 KB. >> >> I'm sharing this version but I still need to think about it and >> I'm curious about your thoughts too. >> > > @@ -591,11 +591,11 @@ hash_xlog_move_page_contents(XLogReaderState *record) > > if (len > 0) > { > - OffsetNumber *unused; > - OffsetNumber *unend; > + uint16 *unused; > + uint16 *unend; > > - unused = (OffsetNumber *) ptr; > - unend = (OffsetNumber *) ((char *) ptr + len); > + unused = (uint16 *) ptr; > + unend = (uint16 *) ((char *) ptr + len); > > It doesn't seem useful to me to make such changes. Yeah, the OffsetNumber is currently defined as uint16, but I wonder if it's not better that those matches the functions arguments types they are linked to (should OffsetNumber or the functions arguments types change). > About other changes > in the second patch, it is not clear whether there is much value > addition by those even though I don't see anything wrong with them. > So, let's see if Nathan or others see the value in the proposed patch > or any subset of these changes. > +1. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com