Re: pgbench - extend initialization phase control - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: pgbench - extend initialization phase control
Date
Msg-id alpine.DEB.2.21.1910310904420.27369@lancre
Whole thread Raw
In response to Re: pgbench - extend initialization phase control  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: pgbench - extend initialization phase control
Re: pgbench - extend initialization phase control
List pgsql-hackers
Hello Masao-san,

> + snprintf(sql, sizeof(sql),
> + "insert into pgbench_branches(bid,bbalance) "
> + "select bid, 0 "
> + "from generate_series(1, %d) as bid", scale);
>
> "scale" should be "nbranches * scale".

Yep, even if nbranches is 1, it should be there.

> + snprintf(sql, sizeof(sql),
> + "insert into pgbench_accounts(aid,bid,abalance,filler) "
> + "select aid, (aid - 1) / %d + 1, 0, '' "
> + "from generate_series(1, %d) as aid", naccounts, scale * naccounts);
>
> Like client-side data generation, INT64_FORMAT should be used here
> instead of %d?

Indeed.

> If large scale factor is specified, the query for generating pgbench_accounts
> data can take a very long time. While that query is running, operators may be
> likely to do Ctrl-C to cancel the data generation. In this case, IMO pgbench
> should cancel the query, i.e., call PQcancel(). Otherwise, the query will keep
> running to the end.

Hmmm. Why not. Now the infra to do that seems to already exists twice, 
once in "src/bin/psql/common.c" and once in "src/bin/scripts/common.c".

I cannot say I'm thrilled to replicate this once more. I think that the 
reasonable option is to share this in fe-utils and then to reuse it from 
there. However, ISTM that such a restructuring patch which not belong to 
this feature.

> - for (step = initialize_steps; *step != '\0'; step++)
> + for (const char *step = initialize_steps; *step != '\0'; step++)
>
> Per PostgreSQL basic coding style,

C99 (20 years ago) is new the norm, and this style is now allowed, there 
are over a hundred instances of these already. I tend to use that where
appropriate.

> - fprintf(stderr, "unrecognized initialization step \"%c\"\n",
> + fprintf(stderr,
> + "unrecognized initialization step \"%c\"\n"
> + "Allowed step characters are: \"" ALL_INIT_STEPS "\".\n",
>  *step);
> - fprintf(stderr, "allowed steps are: \"d\", \"t\", \"g\", \"v\",
> \"p\", \"f\"\n");
>
> The original message seems better to me. So what about just appending "G"
> into the above latter message? That is,
> "allowed steps are: \"d\", \"t\", \"g\", \"G\", \"v\", \"p\", \"f\"\n"

I needed this list in several places, so it makes sense to share the 
definition, and frankly the list of half a dozen comma-separated chars 
does not strike me as much better than just giving the allowed chars 
directly. So the simpler the better, from my point of view.

> Isn't it better to explain a bit more what "client-side / server-side data
> generation" is? For example, something like

Ok.

Attached v7 does most of the above, but the list of char message and the 
signal handling. The first one does not look really better to me, and the 
second one belongs to a restructuring patch that I'll try to submit.

-- 
Fabien.
Attachment

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: function calls optimization
Next
From: Ibrar Ahmed
Date:
Subject: Re: [BUG] Partition creation fails after dropping a column and addinga partial index