Thread: Add support for unit "B" to pg_size_pretty()

Add support for unit "B" to pg_size_pretty()

From

Peter Eisentraut

Date:

20 February 2023, 06:44:15

This patch adds support for the unit "B" to pg_size_pretty().  This 
makes it consistent with the units support in GUC.  (pg_size_pretty() 
only supports "bytes", but GUC only supports "B". -- I opted against 
adding support for "bytes" to GUC.)

Attachment

0001-Add-support-for-unit-B-to-pg_size_pretty.patch

Re: Add support for unit "B" to pg_size_pretty()

From

Justin Pryzby

Date:

20 February 2023, 14:34:52

On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote:
> This patch adds support for the unit "B" to pg_size_pretty().  This makes it

It seems like what it actually does is to support "B" in pg_size_bytes()
- is that what you meant ?

pg_size_pretty() already supports "bytes", so this doesn't actually make
sizes any more pretty, or evidently change its output at all.

> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
> index dbd404101f..9ecd5428c3 100644
> --- a/src/backend/utils/adt/dbsize.c
> +++ b/src/backend/utils/adt/dbsize.c
> @@ -49,6 +49,7 @@ struct size_pretty_unit
>  /* When adding units here also update the error message in pg_size_bytes */
>  static const struct size_pretty_unit size_pretty_units[] = {
>      {"bytes", 10 * 1024, false, 0},
> +    {"B", 10 * 1024, false, 0},

This adds a duplicate line (unitbits=0) where no other existing line
uses duplicates.  If that's intentional, I think it deserves a comment
highlighting that it's an /*alias*/, and about why that does the right
thing, either here about or in the commit message.

>      {"kB", 20 * 1024 - 1, true, 10},
>      {"MB", 20 * 1024 - 1, true, 20},
>      {"GB", 20 * 1024 - 1, true, 30},

-- 
Justin

Re: Add support for unit "B" to pg_size_pretty()

From

Peter Eisentraut

Date:

21 February 2023, 23:47:04

On 20.02.23 15:34, Justin Pryzby wrote:
> On Mon, Feb 20, 2023 at 07:44:15AM +0100, Peter Eisentraut wrote:
>> This patch adds support for the unit "B" to pg_size_pretty().  This makes it
> 
> It seems like what it actually does is to support "B" in pg_size_bytes()
> - is that what you meant ?

yes

> pg_size_pretty() already supports "bytes", so this doesn't actually make
> sizes any more pretty, or evidently change its output at all.

Right, this is for the input side.

>> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
>> index dbd404101f..9ecd5428c3 100644
>> --- a/src/backend/utils/adt/dbsize.c
>> +++ b/src/backend/utils/adt/dbsize.c
>> @@ -49,6 +49,7 @@ struct size_pretty_unit
>>   /* When adding units here also update the error message in pg_size_bytes */
>>   static const struct size_pretty_unit size_pretty_units[] = {
>>       {"bytes", 10 * 1024, false, 0},
>> +    {"B", 10 * 1024, false, 0},
> 
> This adds a duplicate line (unitbits=0) where no other existing line
> uses duplicates.  If that's intentional, I think it deserves a comment
> highlighting that it's an /*alias*/, and about why that does the right
> thing, either here about or in the commit message.

I have added a comment about that.

Attachment

v2-0001-Add-support-for-unit-B-to-pg_size_bytes.patch

Re: Add support for unit "B" to pg_size_pretty()

From

David Rowley

Date:

22 February 2023, 02:39:15

On Wed, 22 Feb 2023 at 12:47, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> >> diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
> >> index dbd404101f..9ecd5428c3 100644
> >> --- a/src/backend/utils/adt/dbsize.c
> >> +++ b/src/backend/utils/adt/dbsize.c
> >> @@ -49,6 +49,7 @@ struct size_pretty_unit
> >>   /* When adding units here also update the error message in pg_size_bytes */
> >>   static const struct size_pretty_unit size_pretty_units[] = {
> >>      {"bytes", 10 * 1024, false, 0},
> >> +    {"B", 10 * 1024, false, 0},
> >
> > This adds a duplicate line (unitbits=0) where no other existing line
> > uses duplicates.  If that's intentional, I think it deserves a comment
> > highlighting that it's an /*alias*/, and about why that does the right
> > thing, either here about or in the commit message.
>
> I have added a comment about that.

hmm. I didn't really code pg_size_pretty with aliases in mind. I don't
think you can do this. There's code in pg_size_pretty() and
pg_size_pretty_numeric() that'll not work correctly. We look ahead to
the next unit to check if there is one so we know we must use this
unit if there are no other units to convert to.

Let's assume someone in the future reads your comment about aliases
and thinks we can just go and add an alias for any unit.  Here we'll
add PiB for PB.

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index dbd404101f..8e22969a76 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -54,6 +54,7 @@ static const struct size_pretty_unit size_pretty_units[] = {
        {"GB", 20 * 1024 - 1, true, 30},
        {"TB", 20 * 1024 - 1, true, 40},
        {"PB", 20 * 1024 - 1, true, 50},
+       {"PiB", 20 * 1024 - 1, true, 50},
        {NULL, 0, false, 0}
 };

testing it, I see:

postgres=# select pg_size_pretty(10000::numeric * 1024*1024*1024*1024*1024);
 pg_size_pretty
----------------
 10000 PB
(1 row)

postgres=# select pg_size_pretty(20000::numeric * 1024*1024*1024*1024*1024);
 pg_size_pretty
----------------
 20000 PiB
(1 row)

I think we'll likely get complaints about PB being used sometimes and
PiB being used at other times.

I think you'll need to find another way to make the aliases work.
Maybe another array with the name and an int to reference the
corresponding index in size_pretty_units.

David

Re: Add support for unit "B" to pg_size_pretty()

From

Peter Eisentraut

Date:

27 February 2023, 08:34:10

On 22.02.23 03:39, David Rowley wrote:
> hmm. I didn't really code pg_size_pretty with aliases in mind. I don't
> think you can do this. There's code in pg_size_pretty() and
> pg_size_pretty_numeric() that'll not work correctly. We look ahead to
> the next unit to check if there is one so we know we must use this
> unit if there are no other units to convert to.

> I think you'll need to find another way to make the aliases work.
> Maybe another array with the name and an int to reference the
> corresponding index in size_pretty_units.

Ok, here is a new patch with a separate table of aliases.  (Might look 
like overkill, but I think the "PiB" etc. example you had could actually 
be a good use case for this as well.)

Attachment

v3-0001-Add-support-for-unit-B-to-pg_size_bytes.patch

Re: Add support for unit "B" to pg_size_pretty()

From

David Rowley

Date:

02 March 2023, 19:58:02

On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 22.02.23 03:39, David Rowley wrote:
> > I think you'll need to find another way to make the aliases work.
> > Maybe another array with the name and an int to reference the
> > corresponding index in size_pretty_units.
>
> Ok, here is a new patch with a separate table of aliases.  (Might look
> like overkill, but I think the "PiB" etc. example you had could actually
> be a good use case for this as well.)

I think I'd prefer to see the size_bytes_unit_alias struct have an
index into size_pretty_units[] array. i.e:

struct size_bytes_unit_alias
{
    const char *alias; /* aliased unit name */
    const int unit_index; /* corresponding size_pretty_units element */
};

then the pg_size_bytes code can be simplified to:

/* If not found, look in the table of aliases */
if (unit->name == NULL)
{
    for (const struct size_bytes_unit_alias *a = size_bytes_aliases;
a->alias != NULL; a++)
    {
        if (pg_strcasecmp(strptr, a->alias) == 0)
        {
            unit = &size_pretty_units[a->unit_index];
            break;
        }
    }
}

which saves having to have the additional and slower nested loop code.

Apart from that, the patch looks fine.

David

Re: Add support for unit "B" to pg_size_pretty()

From

Dean Rasheed

Date:

02 March 2023, 20:32:26

On Thu, 2 Mar 2023 at 19:58, David Rowley <dgrowleyml@gmail.com> wrote:
>
> I think I'd prefer to see the size_bytes_unit_alias struct have an
> index into size_pretty_units[] array. i.e:
>
> struct size_bytes_unit_alias
> {
>     const char *alias; /* aliased unit name */
>     const int unit_index; /* corresponding size_pretty_units element */
> };
>
> then the pg_size_bytes code can be simplified to:
>
> /* If not found, look in the table of aliases */
> if (unit->name == NULL)
> {
>     for (const struct size_bytes_unit_alias *a = size_bytes_aliases;
> a->alias != NULL; a++)
>     {
>         if (pg_strcasecmp(strptr, a->alias) == 0)
>         {
>             unit = &size_pretty_units[a->unit_index];
>             break;
>         }
>     }
> }
>
> which saves having to have the additional and slower nested loop code.
>

Hmm, I think it would be easier to just have a separate table for
pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e.,
size_bytes_units[], which would only need name and multiplier columns
(not round and limit). Done that way, it would be easier to add other
units later (e.g., non-base-2 units).

Also, it looks to me as though the doc change is for pg_size_pretty()
instead of pg_size_bytes().

Regards,
Dean

Re: Add support for unit "B" to pg_size_pretty()

From

David Rowley

Date:

03 March 2023, 11:23:21

On Fri, 3 Mar 2023 at 09:32, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> Hmm, I think it would be easier to just have a separate table for
> pg_size_bytes(), rather than reusing pg_size_pretty()'s table. I.e.,
> size_bytes_units[], which would only need name and multiplier columns
> (not round and limit). Done that way, it would be easier to add other
> units later (e.g., non-base-2 units).

Maybe that's worthwhile if we were actually thinking of adding any
non-base 2 units in the future, but if we're not, perhaps it's better
just to have the smaller alias array which for Peter's needs will just
require 1 element + the NULL one instead of 6 + NULL.

In any case, I'm not really sure I see what the path forward would be
to add something like base-10 units would be for pg_size_bytes(). If
we were to change MB to mean 10^6 rather than 2^20 I think many people
would get upset.

David

Re: Add support for unit "B" to pg_size_pretty()

From

Dean Rasheed

Date:

03 March 2023, 12:22:38

On Fri, 3 Mar 2023 at 11:23, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 3 Mar 2023 at 09:32, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> > Hmm, I think it would be easier to just have a separate table for
> > pg_size_bytes(), rather than reusing pg_size_pretty()'s table.
>
> Maybe that's worthwhile if we were actually thinking of adding any
> non-base 2 units in the future, but if we're not, perhaps it's better
> just to have the smaller alias array which for Peter's needs will just
> require 1 element + the NULL one instead of 6 + NULL.
>

Maybe. It's the tradeoff between having a smaller array and more code
(2 loops) vs a larger array and less code (1 loop).

> In any case, I'm not really sure I see what the path forward would be
> to add something like base-10 units would be for pg_size_bytes(). If
> we were to change MB to mean 10^6 rather than 2^20 I think many people
> would get upset.
>

Yeah, that's probably true. Given the way this and configuration
parameters currently work, I think we're stuck with 1MB meaning 2^20
bytes.

Regards,
Dean

Re: Add support for unit "B" to pg_size_pretty()

From

Peter Eisentraut

Date:

06 March 2023, 08:13:41

On 02.03.23 20:58, David Rowley wrote:
> On Mon, 27 Feb 2023 at 21:34, Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>>
>> On 22.02.23 03:39, David Rowley wrote:
>>> I think you'll need to find another way to make the aliases work.
>>> Maybe another array with the name and an int to reference the
>>> corresponding index in size_pretty_units.
>>
>> Ok, here is a new patch with a separate table of aliases.  (Might look
>> like overkill, but I think the "PiB" etc. example you had could actually
>> be a good use case for this as well.)
> 
> I think I'd prefer to see the size_bytes_unit_alias struct have an
> index into size_pretty_units[] array. i.e:

Ok, done that way.  (I had thought about that, but I was worried that 
that would be too error-prone to maintain.  But I suppose the tables 
don't change that often, and test cases would easily catch mistakes.)

I also updated the documentation a bit more.

Attachment

v4-0001-Add-support-for-unit-B-to-pg_size_bytes.patch

Re: Add support for unit "B" to pg_size_pretty()

From

David Rowley

Date:

06 March 2023, 08:27:36

On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
>
> On 02.03.23 20:58, David Rowley wrote:
> > I think I'd prefer to see the size_bytes_unit_alias struct have an
> > index into size_pretty_units[] array. i.e:
>
> Ok, done that way.  (I had thought about that, but I was worried that
> that would be too error-prone to maintain.  But I suppose the tables
> don't change that often, and test cases would easily catch mistakes.)

Patch looks pretty good. I just see a small spelling mistake in:

+/* Additional unit aliases acceted by pg_size_bytes */

> I also updated the documentation a bit more.

I see I must have forgotten to add PB to the docs when pg_size_pretty
had that unit added.  I guess you added the "etc" to fix that?  I'm
wondering if that's the right choice. You modified the comment above
size_pretty_units[] to remind us to update the docs when adding units,
but the docs now say "etc", so do we need to?  I'd likely have gone
with just adding "PB" to the docs, that way it's pretty clear that new
units need to be mentioned in the docs.

David

Re: Add support for unit "B" to pg_size_pretty()

From

Peter Eisentraut

Date:

07 March 2023, 20:22:30

On 06.03.23 09:27, David Rowley wrote:
> On Mon, 6 Mar 2023 at 21:13, Peter Eisentraut
> <peter.eisentraut@enterprisedb.com> wrote:
>>
>> On 02.03.23 20:58, David Rowley wrote:
>>> I think I'd prefer to see the size_bytes_unit_alias struct have an
>>> index into size_pretty_units[] array. i.e:
>>
>> Ok, done that way.  (I had thought about that, but I was worried that
>> that would be too error-prone to maintain.  But I suppose the tables
>> don't change that often, and test cases would easily catch mistakes.)
> 
> Patch looks pretty good. I just see a small spelling mistake in:
> 
> +/* Additional unit aliases acceted by pg_size_bytes */
> 
>> I also updated the documentation a bit more.
> 
> I see I must have forgotten to add PB to the docs when pg_size_pretty
> had that unit added.  I guess you added the "etc" to fix that?  I'm
> wondering if that's the right choice. You modified the comment above
> size_pretty_units[] to remind us to update the docs when adding units,
> but the docs now say "etc", so do we need to?  I'd likely have gone
> with just adding "PB" to the docs, that way it's pretty clear that new
> units need to be mentioned in the docs.

Ok, I have fixed the original documentation to that effect and 
backpatched it.

The remaining patch has been updated accordingly and committed also.

Re: Add support for unit "B" to pg_size_pretty()

From

David Rowley

Date:

07 March 2023, 21:25:34

On Wed, 8 Mar 2023 at 09:22, Peter Eisentraut
<peter.eisentraut@enterprisedb.com> wrote:
> Ok, I have fixed the original documentation to that effect and
> backpatched it.

Thanks for fixing that.

David