Re: pgbench - allow to create partitioned tables - Mailing list pgsql-hackers
From | Fabien COELHO |
---|---|
Subject | Re: pgbench - allow to create partitioned tables |
Date | |
Msg-id | alpine.DEB.2.21.1909100818130.1895@lancre Whole thread Raw |
In response to | Re: pgbench - allow to create partitioned tables (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: pgbench - allow to create partitioned tables
Re: pgbench - allow to create partitioned tables |
List | pgsql-hackers |
Hello Amit, Thanks for the feedback. > + case 11: /* partitions */ > + initialization_option_set = true; > + partitions = atoi(optarg); > + if (partitions < 0) > + { > + fprintf(stderr, "invalid number of partitions: \"%s\"\n", > + optarg); > + exit(1); > + } > + break; > > Is there a reason why we treat "partitions = 0" as a valid value? Yes. It is an explicit "do not create partitioned tables", which differ from 1 which says "create a partitionned table with just one partition". > Also, shouldn't we keep some max limit for this parameter as well? I do not think so. If someone wants to test how terrible it is to use 100000 partitions, we should not prevent it. > Forex. how realistic it will be if the user gives the value of > partitions the same or greater than the number of rows in > pgbench_accounts table? Although I agree that it does not make much sense, for testing purposes why not, to test overheads in critical cases for instance. > I understand it is not sensible to give such a value, but I guess the > API should behave sanely in such cases as well. Yep, it should work. > I am not sure what will be the good max value for it, but I > think there should be one. I disagree. Pgbench is a tool for testing performance for given parameters. If postgres accepts a parameter there is no reason why pgbench should reject it. > @@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con) > const char *bigcols; /* column decls if accountIDs are 64 bits */ > int declare_fillfactor; > }; > + > static const struct ddlinfo DDLs[] = { > > Spurious line change. Indeed. > * > + " --partitions=NUM partition account table in NUM parts > (defaults: 0)\n" > + " --partition-method=(range|hash)\n" > + " partition account table with this > method (default: range)\n" > > Refer complete table name like pgbench_accounts instead of just account. > It will be clear and in sync with what we display in some other options > like --skip-some-updates. Ok. > * > + " --partitions=NUM partition account table in NUM parts > (defaults: 0)\n" > > /defaults/default. Ok. > I think we should print the information about partitions in > printResults. It can help users while analyzing results. Hmmm. Why not, with some hocus-pocus to get the information out of pg_catalog, and trying to fail gracefully so that if pgbench is run against a no partitioning-support version. > * > +enum { PART_NONE, PART_RANGE, PART_HASH } > + partition_method = PART_NONE; > + > > I think it is better to follow the style of QueryMode enum by using > typedef here, that will make look code in sync with nearby code. Hmmm. Why not. This means inventing a used-once type name for partition_method. My great creativity lead to partition_method_t. > * > - int i; > > fprintf(stderr, "creating tables...\n"); > > - for (i = 0; i < lengthof(DDLs); i++) > + for (int i = 0; i < lengthof(DDLs); i++) > > This is unnecessary change as far as this patch is concerned. I > understand there is no problem in writing either way, but let's not > change the coding pattern here as part of this patch. The reason I did that is that I had a stupid bug in a development version which was due to an accidental reuse of this index, which would have been prevented by this declaration style. Removed anyway. > + if (partitions >= 1) > + { > + int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions; > + char ff[64]; > + ff[0] = '\0'; > + append_fillfactor(ff, sizeof(ff)); > + > + fprintf(stderr, "creating %d partitions...\n", partitions); > + > + for (int p = 1; p <= partitions; p++) > + { > + char query[256]; > + > + if (partition_method == PART_RANGE) > + { > > part_size can be defined inside "if (partition_method == PART_RANGE)" > as it is used here. I just wanted to avoid recomputing the value in the loop, but indeed it may be computed needlessly. Moved. > In general, this part of the code can use some comments. Ok. Attached an updated version. -- Fabien.
Attachment
pgsql-hackers by date: