Thread: Change xl_hash_vacuum_one_page.ntuples from int to uint16

Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi hackers,

While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int.

Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for
gistxlogDelete.ntodelete for example).

Please find attached a patch proposal to do so.

While that does not currently change the struct size:

No patch:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       4 */    int ntuples;

                                /* total size (bytes):    8 */
                              }
With patch:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       2 */    uint16 ntuples;
/* XXX  2-byte padding   */

                                /* total size (bytes):    8 */
                              }

It could reduce it when adding new fields (like this is is done in [1]).

We would get:

No patch:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       4 */    int ntuples;
/*      8      |       1 */    _Bool isCatalogRel;
/* XXX  3-byte padding   */

                                /* total size (bytes):   12 */
                              }

With patch:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       2 */    uint16 ntuples;
/*      6      |       1 */    _Bool isCatalogRel;
/* XXX  1-byte padding   */

                                /* total size (bytes):    8 */
                              }

Means saving 4 bytes in that case.

Looking forward to your feedback,

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Nathan Bossart
Date:
On Tue, Jan 10, 2023 at 11:08:33AM +0100, Drouvot, Bertrand wrote:
> While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int.
> 
> Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for
> gistxlogDelete.ntodelete for example).

I think that is correct.  This value is determined by looping through
offsets, which are uint16 as well.  Should we also change the related
variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16?

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi,

On 1/20/23 9:01 PM, Nathan Bossart wrote:
> On Tue, Jan 10, 2023 at 11:08:33AM +0100, Drouvot, Bertrand wrote:
>> While working on [1], I noticed that xl_hash_vacuum_one_page.ntuples is an int.
>>
>> Unless I'm missing something, It seems to me that it would make more sense to use an uint16 (like this is done for
>> gistxlogDelete.ntodelete for example).
> 
> I think that is correct.  This value is determined by looping through
> offsets, which are uint16 as well. 

Thanks for the review!

> Should we also change the related
> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16?
> 

Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same
kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete (uint16)
forexample).
 

So, what do you think about:

1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size
increase)
and
2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs (related
toXLOG records) fields when they are not?
 

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Nathan Bossart
Date:
On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote:
> On 1/20/23 9:01 PM, Nathan Bossart wrote:
>> Should we also change the related
>> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16?
> 
> Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same
> kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete
(uint16)for example).
 
> 
> So, what do you think about:
> 
> 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size
increase)
> and
> 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs
(relatedto XLOG records) fields when they are not?
 

Okay.  I've marked this one as ready-for-committer, then.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Amit Kapila
Date:
On Wed, Feb 15, 2023 at 3:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote:
> > On 1/20/23 9:01 PM, Nathan Bossart wrote:
> >> Should we also change the related
> >> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16?
> >
> > Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same
> > kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete
(uint16)for example).
 
> >
> > So, what do you think about:
> >
> > 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size
increase)
> > and
> > 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs
(relatedto XLOG records) fields when they are not?
 
>
> Okay.  I've marked this one as ready-for-committer, then.
>

LGTM. I think the padding space we are trying to save here can be used
for the patch [1], right? BTW, feel free to create the second patch
(to align the types for variables/arguments) as that would be really
helpful after we commit this one.

I think this would require XLOG_PAGE_MAGIC as it changes the WAL record.

BTW, how about a commit message like:
Change xl_hash_vacuum_one_page.ntuples from int to uint16.

This will create two bytes of padding space in xl_hash_vacuum_one_page
which can be used for future patches. This makes the datatype of
xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which
is advisable as both are used for the same purpose.

[1] - https://www.postgresql.org/message-id/2d62f212-fce6-d639-b9eb-2a5bc4bec3b4%40gmail.com

-- 
With Regards,
Amit Kapila.



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi,

On 2/16/23 12:00 PM, Amit Kapila wrote:
> On Wed, Feb 15, 2023 at 3:35 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>>
>> On Sat, Jan 21, 2023 at 06:42:08AM +0100, Drouvot, Bertrand wrote:
>>> On 1/20/23 9:01 PM, Nathan Bossart wrote:
>>>> Should we also change the related
>>>> variables (e.g., ndeletable in _hash_vacuum_one_page()) to uint16?
>>>
>>> Yeah, I thought about it too. What I saw is that there is other places that would be good candidates for the same
>>> kind of changes (see the int ntodelete argument in gistXLogDelete() being assigned to gistxlogDelete.ntodelete
(uint16)for example).
 
>>>
>>> So, what do you think about:
>>>
>>> 1) keep this patch as it is (to "only" address the struct field and avoid possible future "useless" padding size
increase)
>>> and
>>> 2) create a new patch (once this one is committed) to align the types for variables/arguments with the structs
(relatedto XLOG records) fields when they are not?
 
>>
>> Okay.  I've marked this one as ready-for-committer, then.
>>
> 
> LGTM. 

Thanks for looking at it!

> I think the padding space we are trying to save here can be used
> for the patch [1], right?

Yes exactly, without the current patch and adding isCatalogRel (from
the patch [1] you mentioned) we would get:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       4 */    int ntuples;
/*      8      |       1 */    _Bool isCatalogRel;
/* XXX  3-byte padding   */


                                 /* total size (bytes):   12 */
                               }

While with the proposed patch:

(gdb) ptype /o struct xl_hash_vacuum_one_page
/* offset      |    size */  type = struct xl_hash_vacuum_one_page {
/*      0      |       4 */    TransactionId snapshotConflictHorizon;
/*      4      |       2 */    uint16 ntuples;
/*      6      |       1 */    _Bool isCatalogRel;
/* XXX  1-byte padding   */


                                 /* total size (bytes):    8 */
                               }

> BTW, feel free to create the second patch
> (to align the types for variables/arguments) as that would be really
> helpful after we commit this one.
> 

Yes, will do.

> I think this would require XLOG_PAGE_MAGIC as it changes the WAL record.
> 

Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical
decoding on standby patch as it adds the new field mentioned above).

> BTW, how about a commit message like:
> Change xl_hash_vacuum_one_page.ntuples from int to uint16.
> 
> This will create two bytes of padding space in xl_hash_vacuum_one_page
> which can be used for future patches. This makes the datatype of
> xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which
> is advisable as both are used for the same purpose.
> 

LGTM, will add it to V2!

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi,

On 2/16/23 1:26 PM, Drouvot, Bertrand wrote:
> Hi,
> 
> On 2/16/23 12:00 PM, Amit Kapila wrote:
>> I think this would require XLOG_PAGE_MAGIC as it changes the WAL record.
>>
> 
> Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical
> decoding on standby patch as it adds the new field mentioned above).
> 
>> BTW, how about a commit message like:
>> Change xl_hash_vacuum_one_page.ntuples from int to uint16.
>>
>> This will create two bytes of padding space in xl_hash_vacuum_one_page
>> which can be used for future patches. This makes the datatype of
>> xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which
>> is advisable as both are used for the same purpose.
>>
> 
> LGTM, will add it to V2!
> 
Please find V2 attached.
The commit message also mention the XLOG_PAGE_MAGIC bump.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Amit Kapila
Date:
On Thu, Feb 16, 2023 at 8:39 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 2/16/23 1:26 PM, Drouvot, Bertrand wrote:
> > Hi,
> >
> > On 2/16/23 12:00 PM, Amit Kapila wrote:
> >> I think this would require XLOG_PAGE_MAGIC as it changes the WAL record.
> >>
> >
> > Oh, I Was not aware about it, thanks! Will do in V2 (and in the logical
> > decoding on standby patch as it adds the new field mentioned above).
> >
> >> BTW, how about a commit message like:
> >> Change xl_hash_vacuum_one_page.ntuples from int to uint16.
> >>
> >> This will create two bytes of padding space in xl_hash_vacuum_one_page
> >> which can be used for future patches. This makes the datatype of
> >> xl_hash_vacuum_one_page.ntuples same as gistxlogDelete.ntodelete which
> >> is advisable as both are used for the same purpose.
> >>
> >
> > LGTM, will add it to V2!
> >
> Please find V2 attached.
> The commit message also mention the XLOG_PAGE_MAGIC bump.
>

Thanks, I was not completely sure about whether we need to bump
XLOG_PAGE_MAGIC for this patch as this makes the additional space just
by changing the datatype of one of the members of the existing WAL
record. We normally change it for the addition/removal of new fields
aka change in WAL record format, or addition of a new type of WAL
record. Does anyone else have an opinion/suggestion on this matter?


-- 
With Regards,
Amit Kapila.



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Andres Freund
Date:
Hi

On 2023-02-17 08:30:09 +0530, Amit Kapila wrote:
> Thanks, I was not completely sure about whether we need to bump
> XLOG_PAGE_MAGIC for this patch as this makes the additional space just
> by changing the datatype of one of the members of the existing WAL
> record. We normally change it for the addition/removal of new fields
> aka change in WAL record format, or addition of a new type of WAL
> record. Does anyone else have an opinion/suggestion on this matter?

I'd definitely change it - the width of a field still means you can't really
parse the old WAL sensibly anymore.

Greetings,

Andres Freund



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi,

On 2/16/23 1:26 PM, Drouvot, Bertrand wrote:
> Hi,
> 
> On 2/16/23 12:00 PM, Amit Kapila wrote:
> 
>> BTW, feel free to create the second patch
>> (to align the types for variables/arguments) as that would be really
>> helpful after we commit this one.

Please find attached a patch proposal to do so.

It looks like a Pandora's box as it produces
those cascading changes:

     _hash_vacuum_one_page
             index_compute_xid_horizon_for_tuples
                     gistprunepage
                             PageIndexMultiDelete
                             gistXLogDelete
             PageIndexMultiDelete
                     spgRedoVacuumRedirect
                     vacuumRedirectAndPlaceholder
                     spgPageIndexMultiDelete
                             moveLeafs
                             doPickSplit
                     _bt_delitems_vacuum
                             btvacuumpage
                     _bt_delitems_delete
                             _bt_delitems_delete_check
                     hash_xlog_move_page_contents
                     gistvacuumpage
                             gistXLogUpdate
                                     gistplacetopage
                     hashbucketcleanup


Which makes me:

- wonder it is not too intrusive (we could reduce the scope and keep the
PageIndexMultiDelete()'s nitems argument as an int for example).

- worry if there is no side effects (like the one I'm mentioning as a comment
in PageIndexMultiDelete()) even if it passes the CI tests.

- wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given
the fact that the maximum block size is 32 KB.

I'm sharing this version but I still need to think about it and
I'm curious about your thoughts too.
  
Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment

Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Amit Kapila
Date:
On Fri, Feb 17, 2023 at 9:43 AM Andres Freund <andres@anarazel.de> wrote:
>
> On 2023-02-17 08:30:09 +0530, Amit Kapila wrote:
> > Thanks, I was not completely sure about whether we need to bump
> > XLOG_PAGE_MAGIC for this patch as this makes the additional space just
> > by changing the datatype of one of the members of the existing WAL
> > record. We normally change it for the addition/removal of new fields
> > aka change in WAL record format, or addition of a new type of WAL
> > record. Does anyone else have an opinion/suggestion on this matter?
>
> I'd definitely change it - the width of a field still means you can't really
> parse the old WAL sensibly anymore.
>

Okay, thanks for your input. I'll push this patch early next week.

-- 
With Regards,
Amit Kapila.



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
Amit Kapila
Date:
On Fri, Feb 17, 2023 at 7:44 PM Drouvot, Bertrand
<bertranddrouvot.pg@gmail.com> wrote:
>
> On 2/16/23 1:26 PM, Drouvot, Bertrand wrote:
> > Hi,
> >
> > On 2/16/23 12:00 PM, Amit Kapila wrote:
> >
> >> BTW, feel free to create the second patch
> >> (to align the types for variables/arguments) as that would be really
> >> helpful after we commit this one.
>

Pushed the first patch.

> Please find attached a patch proposal to do so.
>
> It looks like a Pandora's box as it produces
> those cascading changes:
>
>      _hash_vacuum_one_page
>              index_compute_xid_horizon_for_tuples
>                      gistprunepage
>                              PageIndexMultiDelete
>                              gistXLogDelete
>              PageIndexMultiDelete
>                      spgRedoVacuumRedirect
>                      vacuumRedirectAndPlaceholder
>                      spgPageIndexMultiDelete
>                              moveLeafs
>                              doPickSplit
>                      _bt_delitems_vacuum
>                              btvacuumpage
>                      _bt_delitems_delete
>                              _bt_delitems_delete_check
>                      hash_xlog_move_page_contents
>                      gistvacuumpage
>                              gistXLogUpdate
>                                      gistplacetopage
>                      hashbucketcleanup
>
>
> Which makes me:
>
> - wonder it is not too intrusive (we could reduce the scope and keep the
> PageIndexMultiDelete()'s nitems argument as an int for example).
>
> - worry if there is no side effects (like the one I'm mentioning as a comment
> in PageIndexMultiDelete()) even if it passes the CI tests.
>
> - wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given
> the fact that the maximum block size is 32 KB.
>
> I'm sharing this version but I still need to think about it and
> I'm curious about your thoughts too.
>

@@ -591,11 +591,11 @@ hash_xlog_move_page_contents(XLogReaderState *record)

  if (len > 0)
  {
- OffsetNumber *unused;
- OffsetNumber *unend;
+ uint16 *unused;
+ uint16 *unend;

- unused = (OffsetNumber *) ptr;
- unend = (OffsetNumber *) ((char *) ptr + len);
+ unused = (uint16 *) ptr;
+ unend = (uint16 *) ((char *) ptr + len);

It doesn't seem useful to me to make such changes. About other changes
in the second patch, it is not clear whether there is much value
addition by those even though I don't see anything wrong with them.
So, let's see if Nathan or others see the value in the proposed patch
or any subset of these changes.

--
With Regards,
Amit Kapila.



Re: Change xl_hash_vacuum_one_page.ntuples from int to uint16

From
"Drouvot, Bertrand"
Date:
Hi,

On 2/27/23 6:27 AM, Amit Kapila wrote:
> On Fri, Feb 17, 2023 at 7:44 PM Drouvot, Bertrand
> <bertranddrouvot.pg@gmail.com> wrote:
>>
>> On 2/16/23 1:26 PM, Drouvot, Bertrand wrote:
>>> Hi,
>>>
>>> On 2/16/23 12:00 PM, Amit Kapila wrote:
>>>
>>>> BTW, feel free to create the second patch
>>>> (to align the types for variables/arguments) as that would be really
>>>> helpful after we commit this one.
>>
> 
> Pushed the first patch.

Thanks!

> 
>> Please find attached a patch proposal to do so.
>>
>> It looks like a Pandora's box as it produces
>> those cascading changes:
>>
>>       _hash_vacuum_one_page
>>               index_compute_xid_horizon_for_tuples
>>                       gistprunepage
>>                               PageIndexMultiDelete
>>                               gistXLogDelete
>>               PageIndexMultiDelete
>>                       spgRedoVacuumRedirect
>>                       vacuumRedirectAndPlaceholder
>>                       spgPageIndexMultiDelete
>>                               moveLeafs
>>                               doPickSplit
>>                       _bt_delitems_vacuum
>>                               btvacuumpage
>>                       _bt_delitems_delete
>>                               _bt_delitems_delete_check
>>                       hash_xlog_move_page_contents
>>                       gistvacuumpage
>>                               gistXLogUpdate
>>                                       gistplacetopage
>>                       hashbucketcleanup
>>
>>
>> Which makes me:
>>
>> - wonder it is not too intrusive (we could reduce the scope and keep the
>> PageIndexMultiDelete()'s nitems argument as an int for example).
>>
>> - worry if there is no side effects (like the one I'm mentioning as a comment
>> in PageIndexMultiDelete()) even if it passes the CI tests.
>>
>> - wonder if we should not change MaxIndexTuplesPerPage from int to uint16 too (given
>> the fact that the maximum block size is 32 KB.
>>
>> I'm sharing this version but I still need to think about it and
>> I'm curious about your thoughts too.
>>
> 
> @@ -591,11 +591,11 @@ hash_xlog_move_page_contents(XLogReaderState *record)
> 
>    if (len > 0)
>    {
> - OffsetNumber *unused;
> - OffsetNumber *unend;
> + uint16 *unused;
> + uint16 *unend;
> 
> - unused = (OffsetNumber *) ptr;
> - unend = (OffsetNumber *) ((char *) ptr + len);
> + unused = (uint16 *) ptr;
> + unend = (uint16 *) ((char *) ptr + len);
> 
> It doesn't seem useful to me to make such changes.

Yeah, the OffsetNumber is currently defined as uint16, but I wonder if it's
not better that those matches the functions arguments types they are linked to (should OffsetNumber
or the functions arguments types change).

> About other changes
> in the second patch, it is not clear whether there is much value
> addition by those even though I don't see anything wrong with them.
> So, let's see if Nathan or others see the value in the proposed patch
> or any subset of these changes.
> 

+1.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com