Re: pgbench - allow to create partitioned tables - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: pgbench - allow to create partitioned tables
Date
Msg-id CAA4eK1L8OjSJ7BRgoU-Up7j+FF0vZ6rCy4=ydXi7rztFVH1b_g@mail.gmail.com
Whole thread Raw
In response to Re: pgbench - allow to create partitioned tables  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: pgbench - allow to create partitioned tables
List pgsql-hackers
On Mon, Aug 26, 2019 at 11:04 PM Fabien COELHO <coelho@cri.ensmp.fr> wrote:
>
>
> > Just one suggestion --partition-method option should be made dependent
> > on --partitions, because it has no use unless used with --partitions.
> > What do you think?
>

Some comments:
*
+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;

Is there a reason why we treat "partitions = 0" as a valid value?
Also, shouldn't we keep some max limit for this parameter as well?
Forex. how realistic it will be if the user gives the value of
partitions the same or greater than the number of rows in
pgbench_accounts table?  I understand it is not sensible to give such
a value, but I guess the API should behave sanely in such cases as
well.  I am not sure what will be the good max value for it, but I
think there should be one.  Anyone else have any better suggestions
for this?

*
@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
  const char *bigcols; /* column decls if accountIDs are 64 bits */
  int declare_fillfactor;
  };
+
  static const struct ddlinfo DDLs[] = {

Spurious line change.

*
+    "  --partitions=NUM         partition account table in NUM parts
(defaults: 0)\n"
+    "  --partition-method=(range|hash)\n"
+    "                           partition account table with this
method (default: range)\n"

Refer complete table name like pgbench_accounts instead of just
account. It will be clear and in sync with what we display in some
other options like --skip-some-updates.

*
+    "  --partitions=NUM         partition account table in NUM parts
(defaults: 0)\n"

/defaults/default.

*
I think we should print the information about partitions in
printResults.  It can help users while analyzing results.

*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+

I think it is better to follow the style of QueryMode enum by using
typedef here, that will make look code in sync with nearby code.

*
- int i;

  fprintf(stderr, "creating tables...\n");

- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)

This is unnecessary change as far as this patch is concerned.  I
understand there is no problem in writing either way, but let's not
change the coding pattern here as part of this patch.

*
+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {

part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is used here.  In general, this part of the code can use some
comments.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Change atoi to strtol in same place
Next
From: Michael Paquier
Date:
Subject: Re: Replication & recovery_min_apply_delay