Thread: Add support for unit "B" to pg_size_pretty()
This patch adds support for the unit "B" to pg_size_pretty(). This makes it consistent with the units support in GUC. (pg_size_pretty() only supports "bytes", but GUC only supports "B". -- I opted against adding support for "bytes" to GUC.)
Attachment
On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote: > This patch adds support for the unit "B" to pg_size_pretty(). This makes it It seems like what it actually does is to support "B" in pg_size_bytes() - is that what you meant ? pg_size_pretty() already supports "bytes", so this doesn't actually make sizes any more pretty, or evidently change its output at all. > diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c > index dbd404101f..9ecd5428c3 100644 > --- a/src/backend/utils/adt/dbsize.c > +++ b/src/backend/utils/adt/dbsize.c > @@ -49,6 +49,7 @@ struct size_pretty_unit > /* When adding units here also update the error message in pg_size_bytes */ > static const struct size_pretty_unit size_pretty_units[] = { > {"bytes", 10 * 1024, false, 0}, > + {"B", 10 * 1024, false, 0}, This adds a duplicate line (unitbits=0) where no other existing line uses duplicates. If that's intentional, I think it deserves a comment highlighting that it's an /*alias*/, and about why that does the right thing, either here about or in the commit message. > {"kB", 20 * 1024 - 1, true, 10}, > {"MB", 20 * 1024 - 1, true, 20}, > {"GB", 20 * 1024 - 1, true, 30}, -- Justin
On 20.02.23 15:34, Justin Pryzby wrote: > On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote: >> This patch adds support for the unit "B" to pg_size_pretty(). This makes it > > It seems like what it actually does is to support "B" in pg_size_bytes() > - is that what you meant ? yes > pg_size_pretty() already supports "bytes", so this doesn't actually make > sizes any more pretty, or evidently change its output at all. Right, this is for the input side. >> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c >> index dbd404101f..9ecd5428c3 100644 >> --- a/src/backend/utils/adt/dbsize.c >> +++ b/src/backend/utils/adt/dbsize.c >> @@ -49,6 +49,7 @@ struct size_pretty_unit >> /* When adding units here also update the error message in pg_size_bytes */ >> static const struct size_pretty_unit size_pretty_units[] = { >> {"bytes", 10 * 1024, false, 0}, >> + {"B", 10 * 1024, false, 0}, > > This adds a duplicate line (unitbits=0) where no other existing line > uses duplicates. If that's intentional, I think it deserves a comment > highlighting that it's an /*alias*/, and about why that does the right > thing, either here about or in the commit message. I have added a comment about that.
Attachment
On Wed, 22 Feb 2023 at 12:47, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > >> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c > >> index dbd404101f..9ecd5428c3 100644 > >> --- a/src/backend/utils/adt/dbsize.c > >> +++ b/src/backend/utils/adt/dbsize.c > >> @@ -49,6 +49,7 @@ struct size_pretty_unit > >> /* When adding units here also update the error message in pg_size_bytes */ > >> static const struct size_pretty_unit size_pretty_units[] = { > >> {"bytes", 10 * 1024, false, 0}, > >> + {"B", 10 * 1024, false, 0}, > > > > This adds a duplicate line (unitbits=0) where no other existing line > > uses duplicates. If that's intentional, I think it deserves a comment > > highlighting that it's an /*alias*/, and about why that does the right > > thing, either here about or in the commit message. > > I have added a comment about that. hmm. I didn't really code pg_size_pretty with aliases in mind. I don't think you can do this. There's code in pg_size_pretty() and pg_size_pretty_numeric() that'll not work correctly. We look ahead to the next unit to check if there is one so we know we must use this unit if there are no other units to convert to. Let's assume someone in the future reads your comment about aliases and thinks we can just go and add an alias for any unit. Here we'll add PiB for PB. diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c index dbd404101f..8e22969a76 100644 --- a/src/backend/utils/adt/dbsize.c +++ b/src/backend/utils/adt/dbsize.c @@ -54,6 +54,7 @@ static const struct size_pretty_unit size_pretty_units[] = { {"GB", 20 * 1024 - 1, true, 30}, {"TB", 20 * 1024 - 1, true, 40}, {"PB", 20 * 1024 - 1, true, 50}, + {"PiB", 20 * 1024 - 1, true, 50}, {NULL, 0, false, 0} }; testing it, I see: postgres=# select pg_size_pretty(10000::numeric * 1024*1024*1024*1024*1024); pg_size_pretty ---------------- 10000 PB (1 row) postgres=# select pg_size_pretty(20000::numeric * 1024*1024*1024*1024*1024); pg_size_pretty ---------------- 20000 PiB (1 row) I think we'll likely get complaints about PB being used sometimes and PiB being used at other times. I think you'll need to find another way to make the aliases work. Maybe another array with the name and an int to reference the corresponding index in size_pretty_units. David
On 22.02.23 03:39, David Rowley wrote: > hmm. I didn't really code pg_size_pretty with aliases in mind. I don't > think you can do this. There's code in pg_size_pretty() and > pg_size_pretty_numeric() that'll not work correctly. We look ahead to > the next unit to check if there is one so we know we must use this > unit if there are no other units to convert to. > I think you'll need to find another way to make the aliases work. > Maybe another array with the name and an int to reference the > corresponding index in size_pretty_units. Ok, here is a new patch with a separate table of aliases. (Might look like overkill, but I think the "PiB" etc. example you had could actually be a good use case for this as well.)
Attachment
On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 22.02.23 03:39, David Rowley wrote: > > I think you'll need to find another way to make the aliases work. > > Maybe another array with the name and an int to reference the > > corresponding index in size_pretty_units. > > Ok, here is a new patch with a separate table of aliases. (Might look > like overkill, but I think the "PiB" etc. example you had could actually > be a good use case for this as well.) I think I'd prefer to see the size_bytes_unit_alias struct have an index into size_pretty_units[] array. i.e: struct size_bytes_unit_alias { const char *alias; /* aliased unit name */ const int unit_index; /* corresponding size_pretty_units element */ }; then the pg_size_bytes code can be simplified to: /* If not found, look in the table of aliases */ if (unit->name == NULL) { for (const struct size_bytes_unit_alias *a = size_bytes_aliases; a->alias != NULL; a++) { if (pg_strcasecmp(strptr, a->alias) == 0) { unit = &size_pretty_units[a->unit_index]; break; } } } which saves having to have the additional and slower nested loop code. Apart from that, the patch looks fine. David
On Thu, 2 Mar 2023 at 19:58, David Rowley <dgrowleyml@gmail.com> wrote: > > I think I'd prefer to see the size_bytes_unit_alias struct have an > index into size_pretty_units[] array. i.e: > > struct size_bytes_unit_alias > { > const char *alias; /* aliased unit name */ > const int unit_index; /* corresponding size_pretty_units element */ > }; > > then the pg_size_bytes code can be simplified to: > > /* If not found, look in the table of aliases */ > if (unit->name == NULL) > { > for (const struct size_bytes_unit_alias *a = size_bytes_aliases; > a->alias != NULL; a++) > { > if (pg_strcasecmp(strptr, a->alias) == 0) > { > unit = &size_pretty_units[a->unit_index]; > break; > } > } > } > > which saves having to have the additional and slower nested loop code. > Hmm, I think it would be easier to just have a separate table for pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e., size_bytes_units[], which would only need name and multiplier columns (not round and limit). Done that way, it would be easier to add other units later (e.g., non-base-2 units). Also, it looks to me as though the doc change is for pg_size_pretty() instead of pg_size_bytes(). Regards, Dean
On Fri, 3 Mar 2023 at 09:32, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > Hmm, I think it would be easier to just have a separate table for > pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e., > size_bytes_units[], which would only need name and multiplier columns > (not round and limit). Done that way, it would be easier to add other > units later (e.g., non-base-2 units). Maybe that's worthwhile if we were actually thinking of adding any non-base 2 units in the future, but if we're not, perhaps it's better just to have the smaller alias array which for Peter's needs will just require 1 element + the NULL one instead of 6 + NULL. In any case, I'm not really sure I see what the path forward would be to add something like base-10 units would be for pg_size_bytes(). If we were to change MB to mean 10^6 rather than 2^20 I think many people would get upset. David
On Fri, 3 Mar 2023 at 11:23, David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 3 Mar 2023 at 09:32, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > Hmm, I think it would be easier to just have a separate table for > > pg_size_bytes(), rather than reusing pg_size_pretty()'s table. > > Maybe that's worthwhile if we were actually thinking of adding any > non-base 2 units in the future, but if we're not, perhaps it's better > just to have the smaller alias array which for Peter's needs will just > require 1 element + the NULL one instead of 6 + NULL. > Maybe. It's the tradeoff between having a smaller array and more code (2 loops) vs a larger array and less code (1 loop). > In any case, I'm not really sure I see what the path forward would be > to add something like base-10 units would be for pg_size_bytes(). If > we were to change MB to mean 10^6 rather than 2^20 I think many people > would get upset. > Yeah, that's probably true. Given the way this and configuration parameters currently work, I think we're stuck with 1MB meaning 2^20 bytes. Regards, Dean
On 02.03.23 20:58, David Rowley wrote: > On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: >> >> On 22.02.23 03:39, David Rowley wrote: >>> I think you'll need to find another way to make the aliases work. >>> Maybe another array with the name and an int to reference the >>> corresponding index in size_pretty_units. >> >> Ok, here is a new patch with a separate table of aliases. (Might look >> like overkill, but I think the "PiB" etc. example you had could actually >> be a good use case for this as well.) > > I think I'd prefer to see the size_bytes_unit_alias struct have an > index into size_pretty_units[] array. i.e: Ok, done that way. (I had thought about that, but I was worried that that would be too error-prone to maintain. But I suppose the tables don't change that often, and test cases would easily catch mistakes.) I also updated the documentation a bit more.
Attachment
On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 02.03.23 20:58, David Rowley wrote: > > I think I'd prefer to see the size_bytes_unit_alias struct have an > > index into size_pretty_units[] array. i.e: > > Ok, done that way. (I had thought about that, but I was worried that > that would be too error-prone to maintain. But I suppose the tables > don't change that often, and test cases would easily catch mistakes.) Patch looks pretty good. I just see a small spelling mistake in: +/* Additional unit aliases acceted by pg_size_bytes */ > I also updated the documentation a bit more. I see I must have forgotten to add PB to the docs when pg_size_pretty had that unit added. I guess you added the "etc" to fix that? I'm wondering if that's the right choice. You modified the comment above size_pretty_units[] to remind us to update the docs when adding units, but the docs now say "etc", so do we need to? I'd likely have gone with just adding "PB" to the docs, that way it's pretty clear that new units need to be mentioned in the docs. David
On 06.03.23 09:27, David Rowley wrote: > On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut > <peter.eisentraut@enterprisedb.com> wrote: >> >> On 02.03.23 20:58, David Rowley wrote: >>> I think I'd prefer to see the size_bytes_unit_alias struct have an >>> index into size_pretty_units[] array. i.e: >> >> Ok, done that way. (I had thought about that, but I was worried that >> that would be too error-prone to maintain. But I suppose the tables >> don't change that often, and test cases would easily catch mistakes.) > > Patch looks pretty good. I just see a small spelling mistake in: > > +/* Additional unit aliases acceted by pg_size_bytes */ > >> I also updated the documentation a bit more. > > I see I must have forgotten to add PB to the docs when pg_size_pretty > had that unit added. I guess you added the "etc" to fix that? I'm > wondering if that's the right choice. You modified the comment above > size_pretty_units[] to remind us to update the docs when adding units, > but the docs now say "etc", so do we need to? I'd likely have gone > with just adding "PB" to the docs, that way it's pretty clear that new > units need to be mentioned in the docs. Ok, I have fixed the original documentation to that effect and backpatched it. The remaining patch has been updated accordingly and committed also.
On Wed, 8 Mar 2023 at 09:22, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > Ok, I have fixed the original documentation to that effect and > backpatched it. Thanks for fixing that. David