Thread: Extend pgbench partitioning to pgbench_history

Extend pgbench partitioning to pgbench_history

From
Gabriele Bartolini
Date:
Hi there,

While benchmarking a new feature involving tablespace support in CloudNativePG (Kubernetes operator), I wanted to try out the partitioning feature of pgbench. I saw it supporting both range and hash partitioning, but limited to pgbench_accounts.

With the attached patch, I extend the partitioning capability to the pgbench_history table too.

I have been thinking of adding an option to control this, but I preferred to ask in this list whether it really makes sense or not (I struggle indeed to see use cases where accounts is partitioned and history is not).

Please let me know what you think.

Thanks,
Gabriele
--
Gabriele Bartolini
Vice President, Cloud Native at EDB
Attachment

Re: Extend pgbench partitioning to pgbench_history

From
Gabriele Bartolini
Date:
Please discard the previous patch and use this one (it had a leftover comment from an initial attempt to limit this to hash case).

Thanks,
Gabriele

On Thu, 30 Nov 2023 at 11:29, Gabriele Bartolini <gabriele.bartolini@enterprisedb.com> wrote:
Hi there,

While benchmarking a new feature involving tablespace support in CloudNativePG (Kubernetes operator), I wanted to try out the partitioning feature of pgbench. I saw it supporting both range and hash partitioning, but limited to pgbench_accounts.

With the attached patch, I extend the partitioning capability to the pgbench_history table too.

I have been thinking of adding an option to control this, but I preferred to ask in this list whether it really makes sense or not (I struggle indeed to see use cases where accounts is partitioned and history is not).

Please let me know what you think.

Thanks,
Gabriele
--
Gabriele Bartolini
Vice President, Cloud Native at EDB


--
Gabriele Bartolini
Vice President, Cloud Native at EDB
Attachment

Re: Extend pgbench partitioning to pgbench_history

From
Shlok Kyal
Date:
Hi,

There are some test failures reported by Cfbot at [1]:

[09:15:01.794] 192/276 postgresql:pgbench /
pgbench/001_pgbench_with_server ERROR 7.48s exit status 3
[09:15:01.794] >>>
INITDB_TEMPLATE=/tmp/cirrus-ci-build/build/tmp_install/initdb-template
LD_LIBRARY_PATH=/tmp/cirrus-ci-build/build/tmp_install//usr/local/pgsql/lib
REGRESS_SHLIB=/tmp/cirrus-ci-build/build/src/test/regress/regress.so

PATH=/tmp/cirrus-ci-build/build/tmp_install//usr/local/pgsql/bin:/tmp/cirrus-ci-build/build/src/bin/pgbench:/tmp/cirrus-ci-build/build/src/bin/pgbench/test:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
PG_REGRESS=/tmp/cirrus-ci-build/build/src/test/regress/pg_regress
MALLOC_PERTURB_=67 /usr/local/bin/python3
/tmp/cirrus-ci-build/build/../src/tools/testwrap --basedir
/tmp/cirrus-ci-build/build --srcdir
/tmp/cirrus-ci-build/src/bin/pgbench --testgroup pgbench --testname
001_pgbench_with_server -- /usr/local/bin/perl -I
/tmp/cirrus-ci-build/src/test/perl -I
/tmp/cirrus-ci-build/src/bin/pgbench
/tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl
[09:15:01.794] ――――――――――――――――――――――――――――――――――――― ✀
―――――――――――――――――――――――――――――――――――――
[09:15:01.794] stderr:
[09:15:01.794] # Failed test 'transaction format for 001_pgbench_log_2'
[09:15:01.794] # at
/tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
1247.
[09:15:01.794] # Failed test 'transaction count for

/tmp/cirrus-ci-build/build/testrun/pgbench/001_pgbench_with_server/data/t_001_pgbench_with_server_main_data/001_pgbench_log_3.25193
(11)'
[09:15:01.794] # at
/tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
1257.
[09:15:01.794] # Failed test 'transaction format for 001_pgbench_log_3'
[09:15:01.794] # at
/tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
1257.
[09:15:01.794] # Looks like you failed 3 tests of 439.
[09:15:01.794]
[09:15:01.794] (test program exited with status code 3)

[1] - https://cirrus-ci.com/task/5139049757802496

Thanks and regards
Shlok Kyal



Re: Extend pgbench partitioning to pgbench_history

From
Shlok Kyal
Date:
Sorry, I did not intend to send this message for this email. I by
mistake sent this mail. Please ignore this mail

On Wed, 10 Jan 2024 at 15:27, Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> Hi,
>
> There are some test failures reported by Cfbot at [1]:
>
> [09:15:01.794] 192/276 postgresql:pgbench /
> pgbench/001_pgbench_with_server ERROR 7.48s exit status 3
> [09:15:01.794] >>>
> INITDB_TEMPLATE=/tmp/cirrus-ci-build/build/tmp_install/initdb-template
> LD_LIBRARY_PATH=/tmp/cirrus-ci-build/build/tmp_install//usr/local/pgsql/lib
> REGRESS_SHLIB=/tmp/cirrus-ci-build/build/src/test/regress/regress.so
>
PATH=/tmp/cirrus-ci-build/build/tmp_install//usr/local/pgsql/bin:/tmp/cirrus-ci-build/build/src/bin/pgbench:/tmp/cirrus-ci-build/build/src/bin/pgbench/test:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
> PG_REGRESS=/tmp/cirrus-ci-build/build/src/test/regress/pg_regress
> MALLOC_PERTURB_=67 /usr/local/bin/python3
> /tmp/cirrus-ci-build/build/../src/tools/testwrap --basedir
> /tmp/cirrus-ci-build/build --srcdir
> /tmp/cirrus-ci-build/src/bin/pgbench --testgroup pgbench --testname
> 001_pgbench_with_server -- /usr/local/bin/perl -I
> /tmp/cirrus-ci-build/src/test/perl -I
> /tmp/cirrus-ci-build/src/bin/pgbench
> /tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl
> [09:15:01.794] ――――――――――――――――――――――――――――――――――――― ✀
> ―――――――――――――――――――――――――――――――――――――
> [09:15:01.794] stderr:
> [09:15:01.794] # Failed test 'transaction format for 001_pgbench_log_2'
> [09:15:01.794] # at
> /tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
> 1247.
> [09:15:01.794] # Failed test 'transaction count for
>
/tmp/cirrus-ci-build/build/testrun/pgbench/001_pgbench_with_server/data/t_001_pgbench_with_server_main_data/001_pgbench_log_3.25193
> (11)'
> [09:15:01.794] # at
> /tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
> 1257.
> [09:15:01.794] # Failed test 'transaction format for 001_pgbench_log_3'
> [09:15:01.794] # at
> /tmp/cirrus-ci-build/src/bin/pgbench/t/001_pgbench_with_server.pl line
> 1257.
> [09:15:01.794] # Looks like you failed 3 tests of 439.
> [09:15:01.794]
> [09:15:01.794] (test program exited with status code 3)
>
> [1] - https://cirrus-ci.com/task/5139049757802496
>
> Thanks and regards
> Shlok Kyal



Re: Extend pgbench partitioning to pgbench_history

From
Abhijit Menon-Sen
Date:
At 2023-11-30 11:29:15 +0100, gabriele.bartolini@enterprisedb.com wrote:
>
> With the attached patch, I extend the partitioning capability to the
> pgbench_history table too.
> 
> I have been thinking of adding an option to control this, but I preferred
> to ask in this list whether it really makes sense or not (I struggle indeed
> to see use cases where accounts is partitioned and history is not).

I don't have a strong opinion about this, but I also can't think of a
reason to want to create partitions for pgbench_accounts but leave out
pgbench_history.

> From ba8f507b126a9c5bd22dd40bb8ce0c1f0c43ac59 Mon Sep 17 00:00:00 2001
> From: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
> Date: Thu, 30 Nov 2023 11:02:39 +0100
> Subject: [PATCH] Include pgbench_history in partitioning method for pgbench
> 
> In case partitioning, make sure that pgbench_history is also partitioned with
> the same criteria.

I think "If partitioning" or "If we're creating partitions" would read
better here. Also, same criteria as what? Maybe you could just add "as
pgbench_accounts" to the end.

> diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
> index 05d3f81619..4c02d2a61d 100644
> --- a/doc/src/sgml/ref/pgbench.sgml
> +++ b/doc/src/sgml/ref/pgbench.sgml
> […]
> @@ -378,9 +378,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
>        <term><option>--partitions=<replaceable>NUM</replaceable></option></term>
>        <listitem>
>         <para>
> -        Create a partitioned <literal>pgbench_accounts</literal> table with
> -        <replaceable>NUM</replaceable> partitions of nearly equal size for
> -        the scaled number of accounts.
> +        Create partitioned <literal>pgbench_accounts</literal> and <literal>pgbench_history</literal>
> +        tables with <replaceable>NUM</replaceable> partitions of nearly equal size for
> +        the scaled number of accounts - and future history records.
>          Default is <literal>0</literal>, meaning no partitioning.
>         </para>

I would just leave out the "-" and write "number of accounts and future
history records".

> diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
> index 2e1650d0ad..87adaf4d8f 100644
> --- a/src/bin/pgbench/pgbench.c
> +++ b/src/bin/pgbench/pgbench.c
> […]
> @@ -889,8 +889,10 @@ usage(void)
>             "  --index-tablespace=TABLESPACE\n"
>             "                           create indexes in the specified tablespace\n"
>             "  --partition-method=(range|hash)\n"
> -           "                           partition pgbench_accounts with this method (default: range)\n"
> -           "  --partitions=NUM         partition pgbench_accounts into NUM parts (default: 0)\n"
> +           "                           partition pgbench_accounts and pgbench_history with this method"
> +           "                           (default: range)."
> +           "  --partitions=NUM         partition pgbench_accounts and pgbench_history into NUM parts"
> +           "                           (default: 0)\n"
>             "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
>             "  --unlogged-tables        create tables as unlogged tables\n"
>             "\nOptions to select what to run:\n"

There's a missing newline after "(default: range).".

I read through the rest of the patch closely. It looks fine to me. It
applies, builds, and does create the partitions as intended.

-- Abhijit



Re: Extend pgbench partitioning to pgbench_history

From
Gabriele Bartolini
Date:
Hi Abhijit,

Thanks for your input. Please accept my updated patch.

Ciao,
Gabriele

On Tue, 16 Jan 2024 at 12:53, Abhijit Menon-Sen <ams@toroid.org> wrote:
At 2023-11-30 11:29:15 +0100, gabriele.bartolini@enterprisedb.com wrote:
>
> With the attached patch, I extend the partitioning capability to the
> pgbench_history table too.
>
> I have been thinking of adding an option to control this, but I preferred
> to ask in this list whether it really makes sense or not (I struggle indeed
> to see use cases where accounts is partitioned and history is not).

I don't have a strong opinion about this, but I also can't think of a
reason to want to create partitions for pgbench_accounts but leave out
pgbench_history.

> From ba8f507b126a9c5bd22dd40bb8ce0c1f0c43ac59 Mon Sep 17 00:00:00 2001
> From: Gabriele Bartolini <gabriele.bartolini@enterprisedb.com>
> Date: Thu, 30 Nov 2023 11:02:39 +0100
> Subject: [PATCH] Include pgbench_history in partitioning method for pgbench
>
> In case partitioning, make sure that pgbench_history is also partitioned with
> the same criteria.

I think "If partitioning" or "If we're creating partitions" would read
better here. Also, same criteria as what? Maybe you could just add "as
pgbench_accounts" to the end.

> diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
> index 05d3f81619..4c02d2a61d 100644
> --- a/doc/src/sgml/ref/pgbench.sgml
> +++ b/doc/src/sgml/ref/pgbench.sgml
> […]
> @@ -378,9 +378,9 @@ pgbench <optional> <replaceable>options</replaceable> </optional> <replaceable>d
>        <term><option>--partitions=<replaceable>NUM</replaceable></option></term>
>        <listitem>
>         <para>
> -        Create a partitioned <literal>pgbench_accounts</literal> table with
> -        <replaceable>NUM</replaceable> partitions of nearly equal size for
> -        the scaled number of accounts.
> +        Create partitioned <literal>pgbench_accounts</literal> and <literal>pgbench_history</literal>
> +        tables with <replaceable>NUM</replaceable> partitions of nearly equal size for
> +        the scaled number of accounts - and future history records.
>          Default is <literal>0</literal>, meaning no partitioning.
>         </para>

I would just leave out the "-" and write "number of accounts and future
history records".

> diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
> index 2e1650d0ad..87adaf4d8f 100644
> --- a/src/bin/pgbench/pgbench.c
> +++ b/src/bin/pgbench/pgbench.c
> […]
> @@ -889,8 +889,10 @@ usage(void)
>                  "  --index-tablespace=TABLESPACE\n"
>                  "                           create indexes in the specified tablespace\n"
>                  "  --partition-method=(range|hash)\n"
> -                "                           partition pgbench_accounts with this method (default: range)\n"
> -                "  --partitions=NUM         partition pgbench_accounts into NUM parts (default: 0)\n"
> +                "                           partition pgbench_accounts and pgbench_history with this method"
> +                "                           (default: range)."
> +                "  --partitions=NUM         partition pgbench_accounts and pgbench_history into NUM parts"
> +                "                           (default: 0)\n"
>                  "  --tablespace=TABLESPACE  create tables in the specified tablespace\n"
>                  "  --unlogged-tables        create tables as unlogged tables\n"
>                  "\nOptions to select what to run:\n"

There's a missing newline after "(default: range).".

I read through the rest of the patch closely. It looks fine to me. It
applies, builds, and does create the partitions as intended.

-- Abhijit


--
Gabriele Bartolini
Vice President, Cloud Native at EDB
Attachment

Re: Extend pgbench partitioning to pgbench_history

From
Tomas Vondra
Date:
Hello Gabriele,

I think the improvement makes sense (it's indeed a bit strange to not
partition the history table), and the patch looks good.

I did think about whether this should be optional in some way - that is,
separate from partitioning the accounts table, and users would have to
explicitly enable (or disable) it. But I don't think we need to do that.

The vast majority of users simply want to partition everything. And this
is just one way to partition the database anyway, it's our opinion on
how to do that, but there's many other options how we might partition
the tables, and we don't (and don't want too) have options for that.

The only case that I can think of where this might matter is when
running a benchmarks that will be compared to some earlier results
(executed using an older pgbench version). That will be affected by
this, but I don't think we make many promises about compatibility in
this regard ... it's probably better to always compare results only from
the same pgbench version, I guess.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Extend pgbench partitioning to pgbench_history

From
Melanie Plageman
Date:
On Fri, Feb 16, 2024 at 12:50 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Hello Gabriele,
>
> I think the improvement makes sense (it's indeed a bit strange to not
> partition the history table), and the patch looks good.
>
> I did think about whether this should be optional in some way - that is,
> separate from partitioning the accounts table, and users would have to
> explicitly enable (or disable) it. But I don't think we need to do that.
>
> The vast majority of users simply want to partition everything. And this
> is just one way to partition the database anyway, it's our opinion on
> how to do that, but there's many other options how we might partition
> the tables, and we don't (and don't want too) have options for that.

I wonder how common it would be to partition a history table by
account ID? I sort of imagined the most common kind of partitioning
for an audit table is by time (range). Anyway, I'm not objecting to
doing it by account ID, just asking if there is a reason to do so.

Speaking of which, Tomas said users might want to "partition
everything" -- so any reason not to also partition tellers and
branches?

This change to the docs seems a bit misleading:

       <listitem>
        <para>
-        Create a partitioned <literal>pgbench_accounts</literal> table with
-        <replaceable>NUM</replaceable> partitions of nearly equal size for
-        the scaled number of accounts.
+        Create partitioned <literal>pgbench_accounts</literal> and
<literal>pgbench_history</literal>
+        tables with <replaceable>NUM</replaceable> partitions of
nearly equal size for
+        the scaled number of accounts and future history records.
         Default is <literal>0</literal>, meaning no partitioning.
        </para>
       </listitem>

It says that partitions of "future history records" will be equal in
size. While it's true that at the end of a pgbench run, if you use a
random distribution for aid, the pgbench_history partitions should be
roughly equally sized, it is confusing to say it will "create
pgbench_history with partitions of equal size". Maybe it would be
better to write a new sentence about partitioning pgbench_history
without worrying about mirroring the sentence structure of the
existing sentence.

> The only case that I can think of where this might matter is when
> running a benchmarks that will be compared to some earlier results
> (executed using an older pgbench version). That will be affected by
> this, but I don't think we make many promises about compatibility in
> this regard ... it's probably better to always compare results only from
> the same pgbench version, I guess.

As a frequent pgbench user, I always use the same pgbench version even
when comparing different versions of Postgres. Other changes have made
it difficult to compare results across pgbench versions without
providing it as an option (see 06ba4a63b85e). So, I don't think it is
a problem if it is noted in release notes.

- Melanie



Re: Extend pgbench partitioning to pgbench_history

From
Robert Haas
Date:
On Fri, Feb 16, 2024 at 3:14 PM Melanie Plageman
<melanieplageman@gmail.com> wrote:
> [ review comments ]

Since there has been no response to these review comments for more
than 3 months, I have set https://commitfest.postgresql.org/48/4679/
to Returned with Feedback. Please feel free to update the status when
there is a new version of the patch (or at least a response to these
comments).

--
Robert Haas
EDB: http://www.enterprisedb.com