Thread: [HACKERS] pgbench: Skipping the creating primary keys after initialization
[HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
Hi all, I'd like to propose a new option -I for pgbench command which skips the creating primary keys after initialized tables. This option is useful for users who want to do bench marking with no index or indexes other than btree primary index. If we initialize pgbench tables at a large number scale factor the primary key index creation takes a long time even if we're going to use other types of indexes. With this option, the initialization time is reduced and you can create indexes as you want. Feedback is very welcome. I'll add this patch to the next CF. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Robert Haas
Date:
On Tue, Aug 1, 2017 at 9:49 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > I'd like to propose a new option -I for pgbench command which skips > the creating primary keys after initialized tables. This option is > useful for users who want to do bench marking with no index or indexes > other than btree primary index. If we initialize pgbench tables at a > large number scale factor the primary key index creation takes a long > time even if we're going to use other types of indexes. With this > option, the initialization time is reduced and you can create indexes > as you want. > > Feedback is very welcome. I'll add this patch to the next CF. I support adding an option for this, but I propose that we just make it a long-form option, similar to --log-prefix or --index-tablespace. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Aug 2, 2017 at 10:25 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Tue, Aug 1, 2017 at 9:49 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> I'd like to propose a new option -I for pgbench command which skips >> the creating primary keys after initialized tables. This option is >> useful for users who want to do bench marking with no index or indexes >> other than btree primary index. If we initialize pgbench tables at a >> large number scale factor the primary key index creation takes a long >> time even if we're going to use other types of indexes. With this >> option, the initialization time is reduced and you can create indexes >> as you want. >> >> Feedback is very welcome. I'll add this patch to the next CF. > > I support adding an option for this, but I propose that we just make > it a long-form option, similar to --log-prefix or --index-tablespace. > Yeah, that's better. I'll update the patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Aug 1, 2017 at 9:49 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >> I'd like to propose a new option -I for pgbench command which skips >> the creating primary keys after initialized tables. > I support adding an option for this, but I propose that we just make > it a long-form option, similar to --log-prefix or --index-tablespace. I think we could probably do without this ... if you want a non-default test setup, why do you need to use "pgbench -i" to create it? It's not that there's anything greatly wrong with this particular idea, it's just that pgbench has too many switches already, and omitting random subsets of the initialization actions doesn't seem like it contributes fundamental new benchmarking capability. I could get behind a proposal that generalized pgbench's "-i" behavior in some meaningful way. I wonder whether it would be possible to convert that behavior into a script. Some of what it does is just SQL commands with injected parameters, which pgbench does already. There's also data-loading actions, which could be converted to backslash commands perhaps. Then desires like this could be addressed by invoking a customized script instead of complicating pgbench's option set. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Tatsuo Ishii
Date:
> I think we could probably do without this ... if you want a non-default > test setup, why do you need to use "pgbench -i" to create it? > > It's not that there's anything greatly wrong with this particular idea, > it's just that pgbench has too many switches already, and omitting random > subsets of the initialization actions doesn't seem like it contributes > fundamental new benchmarking capability. > > I could get behind a proposal that generalized pgbench's "-i" behavior > in some meaningful way. I wonder whether it would be possible to convert > that behavior into a script. Some of what it does is just SQL commands > with injected parameters, which pgbench does already. There's also > data-loading actions, which could be converted to backslash commands > perhaps. Then desires like this could be addressed by invoking a > customized script instead of complicating pgbench's option set. +1. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Robert Haas
Date:
On Wed, Aug 2, 2017 at 9:41 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Aug 1, 2017 at 9:49 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >>> I'd like to propose a new option -I for pgbench command which skips >>> the creating primary keys after initialized tables. > >> I support adding an option for this, but I propose that we just make >> it a long-form option, similar to --log-prefix or --index-tablespace. > > I think we could probably do without this ... if you want a non-default > test setup, why do you need to use "pgbench -i" to create it? > > It's not that there's anything greatly wrong with this particular idea, > it's just that pgbench has too many switches already, and omitting random > subsets of the initialization actions doesn't seem like it contributes > fundamental new benchmarking capability. > > I could get behind a proposal that generalized pgbench's "-i" behavior > in some meaningful way. I wonder whether it would be possible to convert > that behavior into a script. Some of what it does is just SQL commands > with injected parameters, which pgbench does already. There's also > data-loading actions, which could be converted to backslash commands > perhaps. Then desires like this could be addressed by invoking a > customized script instead of complicating pgbench's option set. I've actually wanted this exact thing multiple times: most recently, to make a non-unique btree index instead of a unique one, and to make a hash index instead of a btree one. I don't object to a modest effort at coming up with a more general mechanism here, but I also think the switch as proposed is something that would have met my real needs on multiple occasions. I've probably had 10 different occasions when I wanted all of the standard pgbench initialization *except for* something different about the indexes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I've actually wanted this exact thing multiple times: most recently, > to make a non-unique btree index instead of a unique one, and to make > a hash index instead of a btree one. I don't object to a modest > effort at coming up with a more general mechanism here, but I also > think the switch as proposed is something that would have met my real > needs on multiple occasions. I've probably had 10 different occasions > when I wanted all of the standard pgbench initialization *except for* > something different about the indexes. Sure, but "no indexes at all" is hardly ever the real goal, is it? So the switch as proposed is only solving part of your problem. I'd rather see a solution that addresses a larger range of desires. Or in other words, this looks to me quite a bit like the hackery that resulted in pgbench's -S and -N options, before we figured out that making it scriptable was a better answer. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Robert Haas
Date:
On Wed, Aug 2, 2017 at 10:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Sure, but "no indexes at all" is hardly ever the real goal, is it? Right. > So the switch as proposed is only solving part of your problem. > I'd rather see a solution that addresses a larger range of desires. That's reasonable. > Or in other words, this looks to me quite a bit like the hackery > that resulted in pgbench's -S and -N options, before we figured out > that making it scriptable was a better answer. But it's not very clear to me how we could make this case scriptable, and it would probably not be much different from just using the proposed option and then running the script afterwards yourself via psql. The thing about -N and -S is that those scripts are being run repeatedly, so pgbench has to be involved. If you just want to create different/extra indexes, you can do that yourself. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Aug 2, 2017 at 10:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Or in other words, this looks to me quite a bit like the hackery >> that resulted in pgbench's -S and -N options, before we figured out >> that making it scriptable was a better answer. > But it's not very clear to me how we could make this case scriptable, Well, I'm imagining that "-i" would essentially become a short form of "-b initialize", as already happened for -S and -N, where the script looks something like drop table if exists pgbench_branches; create table pgbench_branches ( bid int not null,bbalance int,filler char(88) ); \load_data pgbench_branches [ other parameters to-be-determined ] alter table pgbench_branches add primary key (bid); ... repeat for other tables ... and we'd document that the same way we do for the existing built-in scripts. Then, if there's something you don't like about it, you just paste the script into a file and edit to taste. I'm sure there's complexities that would only become apparent when someone tries to write the patch, but that seems to me like a better foundation for this class of desires than extending the option set with various one-off options having no discernible architecture. > If you just want to create > different/extra indexes, you can do that yourself. Sure, but there's no end to the number of small variations on this theme that somebody might want. For example, we realized years ago that the "filler" fields as-implemented don't really meet the intent of the TPC-B spec (cf comment in the init() function). If someone comes along with a patch adding a "--really-tpc-b" option to change the table declarations and/or data loading code to fix that, will we take that patch? What about one that wants all the id fields (not just accounts.aid) to be bigint, or one that wants the balance fields to be numeric? You can say "let 'em set up the tables manually if they want that", but I don't see why a nonstandard set of indexes is much different. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Robert Haas
Date:
On Wed, Aug 2, 2017 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Aug 2, 2017 at 10:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Or in other words, this looks to me quite a bit like the hackery >>> that resulted in pgbench's -S and -N options, before we figured out >>> that making it scriptable was a better answer. > >> But it's not very clear to me how we could make this case scriptable, > > Well, I'm imagining that "-i" would essentially become a short form > of "-b initialize", as already happened for -S and -N, where the script > looks something like > > drop table if exists pgbench_branches; > create table pgbench_branches ( > bid int not null,bbalance int,filler char(88) > ); > \load_data pgbench_branches [ other parameters to-be-determined ] > alter table pgbench_branches add primary key (bid); > ... repeat for other tables ... > > and we'd document that the same way we do for the existing built-in > scripts. Then, if there's something you don't like about it, you > just paste the script into a file and edit to taste. I imagine that would be useful for some use cases, but it's a heck of a lot more work than just writing --no-indexes-please. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Wed, Aug 2, 2017 at 11:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Well, I'm imagining that "-i" would essentially become a short form >> of "-b initialize", as already happened for -S and -N, where the script >> looks something like ... > I imagine that would be useful for some use cases, but it's a heck of > a lot more work than just writing --no-indexes-please. Of course. It's also a heck of a lot more flexible. Adding on another ad-hoc option that does the minimum possible amount of work needed to address one specific problem is always going to be less work; but after we repeat that process five or ten times, we're going to have a mess. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Robert Haas
Date:
On Wed, Aug 2, 2017 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Of course. It's also a heck of a lot more flexible. Adding on another > ad-hoc option that does the minimum possible amount of work needed to > address one specific problem is always going to be less work; but after > we repeat that process five or ten times, we're going to have a mess. Well, I still like Masahiko-san's proposal, but I'm not prepared to keep arguing about it right now. Maybe some other people will weigh in with an opinion. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Michael Paquier
Date:
On Wed, Aug 2, 2017 at 5:50 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Wed, Aug 2, 2017 at 10:17 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Or in other words, this looks to me quite a bit like the hackery >>> that resulted in pgbench's -S and -N options, before we figured out >>> that making it scriptable was a better answer. > >> But it's not very clear to me how we could make this case scriptable, > > Well, I'm imagining that "-i" would essentially become a short form > of "-b initialize", as already happened for -S and -N, where the script > looks something like Yes, I would imagine a facility where one could do pgbench $script and issue a complete test set. Here is for example a funky idea: let's separate each script with a set of meta-commands, \init being what is used just for initialization, and then use \script to define a set of commands with a custom weight. Say: \init CREATE TABLE foo(a int); \script select_query [weight N] SELECT count(*) FROM foo; \script insert_query [weight N] INSERT INTO foo VALUES ('1'); That may be over-engineering things, but personally I don't like much having just a switch to remove indexes. Next time we will come with another option that only selects a portion of the indexes created. -- Michael
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Thu, Aug 3, 2017 at 2:00 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Aug 2, 2017 at 12:34 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Of course. It's also a heck of a lot more flexible. Adding on another >> ad-hoc option that does the minimum possible amount of work needed to >> address one specific problem is always going to be less work; but after >> we repeat that process five or ten times, we're going to have a mess. > > Well, I still like Masahiko-san's proposal, but I'm not prepared to > keep arguing about it right now. Maybe some other people will weigh > in with an opinion. > My motivation of this proposal is same as what Robert has. I understand that ad-hoc option can solve only the part of big problem and it could be cause of mess. However It seems me that the script especially for table initialization will not be flexible than we expected. I mean, even if we provide some meta commands for table initialization or data loading, these meta commands work for only pgbench tables (i.g., pgbench_accounts, pgbench_branches and so on). If we want to create other tables and load data to them as we want we can do that using psql -f. So an alternative ways is having a flexible style option for example --custom-initialize = { [load, create_pkey, create_fkey, vacuum], ... }. That would solve this in a better way. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello, > My motivation of this proposal is same as what Robert has. I > understand that ad-hoc option can solve only the part of big problem > and it could be cause of mess. However It seems me that the script > especially for table initialization will not be flexible than we > expected. I mean, even if we provide some meta commands for table > initialization or data loading, these meta commands work for only > pgbench tables (i.g., pgbench_accounts, pgbench_branches and so on). > If we want to create other tables and load data to them as we want we > can do that using psql -f. So an alternative ways is having a flexible > style option for example --custom-initialize = { [load, create_pkey, > create_fkey, vacuum], ... }. That would solve this in a better way. Personnaly, I could be fine with a limited number of long options to adjust pgbench initialization to various needs, eg --use-hash-index, --skip-whetever-index, etc. The flexible --custom-init idea outlined above looks nice as well. As for a more generic solution, the easy part are the "CREATE" stuff and the transaction script stuff (existing pgbench scripts). For the CREATE stuff, the script language is SQL, the command to use it is "psql"... The real and hard part is to fill tables with meaningful pseudo-random test data which do not violate constraints for any non trivial schema involving foreign keys and various unique constraints. The solution for this is SQL for trivial cases, think of: "INSERT INTO Foo() SELECT ... FROM generate_series(...);" For instance the pgbench initialization is really close to: psql -Dscale=10 <<EOF CREATE TABLE ... ; INSERT INTO pgbench_account(...) SELECT ... FROM generate_series(1, 100000* :scale) AS i; INSERT ... CREATE INDEX ...; VACUUM FULL ANALYZE; EOF And all existing options could probably be implemented easilly with the recently added conditional (\if). So my 0.02€ is that if something is to be done, I would suggest to turn the creation and initialization stuff into a standard "psql" script that could be called from pgbench instead of integrating much more ad-hoc stuff into pgbench. Note that non trivial schema initialization requires more general programming, so I do not believe in doing a lot at pgbench or psql levels. The best I could come with is a data generator which takes as input the schema with added directives on how to generate the various attributes (tool named "datafiller", that some people use:-). -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: > As for a more generic solution, the easy part are the "CREATE" stuff and > the transaction script stuff (existing pgbench scripts). > For the CREATE stuff, the script language is SQL, the command to use it is > "psql"... > The real and hard part is to fill tables with meaningful pseudo-random > test data which do not violate constraints for any non trivial schema > involving foreign keys and various unique constraints. > The solution for this is SQL for trivial cases, think of: > "INSERT INTO Foo() SELECT ... FROM generate_series(...);" Yeah. I was also thinking that complicated data-generation requirements could be handled with plpgsql DO blocks, avoiding the need for hard-wired code inside pgbench. regards, tom lane
Masahiko Sawada <sawada.mshk@gmail.com> writes: > If we want to create other tables and load data to them as we want we > can do that using psql -f. So an alternative ways is having a flexible > style option for example --custom-initialize = { [load, create_pkey, > create_fkey, vacuum], ... }. That would solve this in a better way. FWIW, I like that significantly better than your original proposal. It'd allow people to execute parts of pgbench's standard initialization sequence and then do other things in between (in psql). Realistically, that's probably about as much win as we need here --- if you're veering far enough away from the standard scenario that that doesn't do it for you, you might as well just write an all-custom setup script in psql. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> For the CREATE stuff, the script language is SQL, the command to use it is >> "psql"... > >> The real and hard part is to fill tables with meaningful pseudo-random >> test data which do not violate constraints for any non trivial schema >> involving foreign keys and various unique constraints. > >> The solution for this is SQL for trivial cases, think of: >> "INSERT INTO Foo() SELECT ... FROM generate_series(...);" > > Yeah. I was also thinking that complicated data-generation requirements > could be handled with plpgsql DO blocks, avoiding the need for hard-wired > code inside pgbench. I do not thing that it is really be needed for what pgbench does, though. See attached attempt, including a no_foreign_keys option. The only tricky thing is to have the elapsed/remaining advancement report on stdout, maybe with some PL/pgSQL. Timings are very similar compared to "pgbench -i". -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Fri, Aug 4, 2017 at 3:24 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >>> For the CREATE stuff, the script language is SQL, the command to use it >>> is >>> "psql"... >> >> >>> The real and hard part is to fill tables with meaningful pseudo-random >>> test data which do not violate constraints for any non trivial schema >>> involving foreign keys and various unique constraints. >> >> >>> The solution for this is SQL for trivial cases, think of: >>> "INSERT INTO Foo() SELECT ... FROM generate_series(...);" >> >> >> Yeah. I was also thinking that complicated data-generation requirements >> could be handled with plpgsql DO blocks, avoiding the need for hard-wired >> code inside pgbench. > > > I do not thing that it is really be needed for what pgbench does, though. > See attached attempt, including a no_foreign_keys option. > > The only tricky thing is to have the elapsed/remaining advancement report on > stdout, maybe with some PL/pgSQL. > > Timings are very similar compared to "pgbench -i". > The generating data with plpgsql DO blocks means that we do the data-generation on sever side rather than on client side. I think it's preferable in a sense because could speed up initialization time by reducing the network traffic. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Thu, Aug 3, 2017 at 11:31 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Masahiko Sawada <sawada.mshk@gmail.com> writes: >> If we want to create other tables and load data to them as we want we >> can do that using psql -f. So an alternative ways is having a flexible >> style option for example --custom-initialize = { [load, create_pkey, >> create_fkey, vacuum], ... }. That would solve this in a better way. > > FWIW, I like that significantly better than your original proposal. > It'd allow people to execute parts of pgbench's standard initialization > sequence and then do other things in between (in psql). Realistically, > that's probably about as much win as we need here --- if you're veering > far enough away from the standard scenario that that doesn't do it for > you, you might as well just write an all-custom setup script in psql. > Attached patch introduces --custom-initialize option that allows us to specify the initialization step and its order. For example, If you want to skip building primary keys you can specify --custom-initialize="create_table, load_data, vacuum". Since each custom initialization commands is invoked in specified order, for example we also can create primary keys *before* loading data. The data-generation is doing on client side, so progress information for initialization is still supported. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
> Attached patch I'll look into it. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Mahahiko-san, My 0.02€ about the patch & feature, which only reflect my point of view: Please add a number to patches to avoid ambiguities. This was patch was the second sent on the thread. There is no need to have both custom_init & init functions. I'll suggest to remove function "init", rename "custom_init" as "init", and make the option default to what is appropriate so that it initialize the schema as expected, and there is only one initialization mechanism. I would suggest to make initialization sub options (no-vacuum, foreign-key...) rely on updating the initialization operations instead of being maintained separately. Currently "--no-vacuum --custom-init=vacuum" would skip vacuum, which is quite debatable... I'm not sure of the "custom-initialize" option name, but why not. I suggest to remove "is_initialize_suite", and make custom-initialize require -i as seems logical and upward compatible. ISTM that there should be short names for the phases. Maybe using only one letter could simplify the code significantly, dropping commas and list, eg: "t" for "create_table", "d" for "load_data", "p" for "create_pkey", "f" for "create_fkey", "v" for "vacuum". I do not think that allowing a space in the list is a shell-wise idea. I'm also wondering whether using a list is a good option, because it implies a large parse function, list management and so on. With the one letter version, it could be just a string to be scanned char by char for operations. Otherwise, at least allow short two letter codes (eg: ct ld pk fk va), in order to avoid writing a very long thing on the command line, which is quite a pain: sh> pgbench --custom-initialize=create_table,load_data,primary_key,foreign_key,vacuum ... vs sh> pgbench -i -I tdpfv ... Maybe there could be short-hands for usual setups, eg "default" for "tdpv" or maybe "ct,ld,pk,va", "full" for "tdpfv" or maybe "ct,ld,pk,fk,va"... Typo in doc "initailize" -> "initialize". Option values should be presented in their logical execution order, i.e. put vacuum at the end. Typo in help "initilize" -> "initialize". I would suggest to drop the space in the option value in the presentation so that quotes are not needed. Remove the "no-primary-keys" from the long option array as it has disappeared. You might consider make "custom-initialize" take the 'I' short option. Ranting unrelated to this patch: the automatic aid type switching based on scale is a bad idea (tm), because when trying to benchmark it means that changing the scale also changes the schema, and you really do not need that. ISTM that it should always use INT8. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Aug 8, 2017 at 10:50 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Mahahiko-san, > > My 0.02€ about the patch & feature, which only reflect my point of view: Thank you for reviewing the patch! > Please add a number to patches to avoid ambiguities. This was patch was the > second sent on the thread. > > There is no need to have both custom_init & init functions. I'll suggest to > remove function "init", rename "custom_init" as "init", and make the option > default to what is appropriate so that it initialize the schema as > expected, and there is only one initialization mechanism. > > I would suggest to make initialization sub options (no-vacuum, > foreign-key...) rely on updating the initialization operations instead of > being maintained separately. Currently "--no-vacuum --custom-init=vacuum" > would skip vacuum, which is quite debatable... > > I'm not sure of the "custom-initialize" option name, but why not. I suggest > to remove "is_initialize_suite", and make custom-initialize require -i as > seems logical and upward compatible. > > ISTM that there should be short names for the phases. Maybe using only one > letter could simplify the code significantly, dropping commas and list, eg: > "t" for "create_table", "d" for "load_data", "p" for "create_pkey", "f" for > "create_fkey", "v" for "vacuum". > > I do not think that allowing a space in the list is a shell-wise idea. I think we can use it like --custom-initialize="create_table, vacuum" which is similar to what we specify a connection option to psql for example. But it will be unnecessary if we have the one letter version. > I'm also wondering whether using a list is a good option, because it implies > a large parse function, list management and so on. With the one letter > version, it could be just a string to be scanned char by char for > operations. I basically agree with the one letter version. But I'm concerned that it'll confuse users if we have more initialization steps for the pgbench initialization. If we add more various initialization steps it makes pgbench command hard to read and the users might have to remember these options. > > Otherwise, at least allow short two letter codes (eg: ct ld pk fk va), in > order to avoid writing a very long thing on the command line, which is quite > a pain: > > sh> pgbench > --custom-initialize=create_table,load_data,primary_key,foreign_key,vacuum > ... > > vs > > sh> pgbench -i -I tdpfv ... > > Maybe there could be short-hands for usual setups, eg "default" for "tdpv" > or maybe "ct,ld,pk,va", "full" for "tdpfv" or maybe "ct,ld,pk,fk,va"... If --custom-initialize option requires for i option to be set, "pgbench -i" means the initialization with full steps and "pgbench -i --custom-initialize=..." means the initialization with custom operation steps. > Typo in doc "initailize" -> "initialize". Option values should be presented > in their logical execution order, i.e. put vacuum at the end. > > Typo in help "initilize" -> "initialize". I would suggest to drop the space > in the > option value in the presentation so that quotes are not needed. > > Remove the "no-primary-keys" from the long option array as it has > disappeared. You might consider make "custom-initialize" take the 'I' short > option. > > Ranting unrelated to this patch: the automatic aid type switching based on > scale is a bad idea (tm), because when trying to benchmark it means that > changing the scale also changes the schema, and you really do not need that. > ISTM that it should always use INT8. Hmm, I think it's a valid point. Should we allow users to specify like the above thing in the custom initialization feature as well? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello, > I think we can use it like --custom-initialize="create_table, vacuum" > which is similar to what we specify a connection option to psql for > example. Even if it is allowed, do not advertise it. Or use space as a separator and not comma. ISTM that with psql connections space is the higher level separator, not an optional thing, and comma is used for lower level splitting: "host=foo port=5432,5433 ..." > But it will be unnecessary if we have the one letter version. Sure. >> I'm also wondering whether using a list is a good option, because it implies >> a large parse function, list management and so on. With the one letter >> version, it could be just a string to be scanned char by char for >> operations. > > I basically agree with the one letter version. But I'm concerned that > it'll confuse users if we have more initialization steps for the > pgbench initialization. If we add more various initialization steps it > makes pgbench command hard to read and the users might have to > remember these options. I think that if we get to the point where so many initialization steps that people get confused, then adding long names will not be an issue:-) In the mean time it only needs 5 values. >> Maybe there could be short-hands for usual setups, eg "default" for "tdpv" >> or maybe "ct,ld,pk,va", "full" for "tdpfv" or maybe "ct,ld,pk,fk,va"... > > If --custom-initialize option requires for i option to be set, > "pgbench -i" means the initialization with full steps and "pgbench -i > --custom-initialize=..." means the initialization with custom > operation steps. Sure. It does not preclude the default to have a name. >> Remove the "no-primary-keys" from the long option array as it has >> disappeared. You might consider make "custom-initialize" take the 'I' short >> option. >> >> Ranting unrelated to this patch: the automatic aid type switching based on >> scale is a bad idea (tm), because when trying to benchmark it means that >> changing the scale also changes the schema, and you really do not need that. >> ISTM that it should always use INT8. > > Hmm, I think it's a valid point. Should we allow users to specify like > the above thing in the custom initialization feature as well? I would be in favor of having an option to do a tpc-b conforming schema which would include that, but which would also change the default balance type which is not large enough per spec. Maybe it could be "T". -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Aug 15, 2017 at 2:43 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello, > >> I think we can use it like --custom-initialize="create_table, vacuum" >> which is similar to what we specify a connection option to psql for >> example. > > > Even if it is allowed, do not advertise it. Or use space as a separator and > not comma. ISTM that with psql connections space is the higher level > separator, not an optional thing, and comma is used for lower level > splitting: "host=foo port=5432,5433 ..." > >> But it will be unnecessary if we have the one letter version. > > > Sure. > >>> I'm also wondering whether using a list is a good option, because it >>> implies >>> a large parse function, list management and so on. With the one letter >>> version, it could be just a string to be scanned char by char for >>> operations. >> >> >> I basically agree with the one letter version. But I'm concerned that >> it'll confuse users if we have more initialization steps for the >> pgbench initialization. If we add more various initialization steps it >> makes pgbench command hard to read and the users might have to >> remember these options. > > > I think that if we get to the point where so many initialization steps that > people get confused, then adding long names will not be an issue:-) > > In the mean time it only needs 5 values. Agreed. > >>> Maybe there could be short-hands for usual setups, eg "default" for >>> "tdpv" >>> or maybe "ct,ld,pk,va", "full" for "tdpfv" or maybe "ct,ld,pk,fk,va"... >> >> >> If --custom-initialize option requires for i option to be set, >> "pgbench -i" means the initialization with full steps and "pgbench -i >> --custom-initialize=..." means the initialization with custom >> operation steps. > > > Sure. It does not preclude the default to have a name. > >>> Remove the "no-primary-keys" from the long option array as it has >>> disappeared. You might consider make "custom-initialize" take the 'I' >>> short >>> option. >>> >>> Ranting unrelated to this patch: the automatic aid type switching based >>> on >>> scale is a bad idea (tm), because when trying to benchmark it means that >>> changing the scale also changes the schema, and you really do not need >>> that. >>> ISTM that it should always use INT8. >> >> >> Hmm, I think it's a valid point. Should we allow users to specify like >> the above thing in the custom initialization feature as well? > > > I would be in favor of having an option to do a tpc-b conforming schema > which would include that, but which would also change the default balance > type which is not large enough per spec. Maybe it could be "T". > Yeah, once custom initialization patch get committed we can extend it. Attached updated patch. I've incorporated the all comments from Fabien to it, and changed it to single letter version. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, > Yeah, once custom initialization patch get committed we can extend it. > > Attached updated patch. I've incorporated the all comments from Fabien > to it, and changed it to single letter version. Patch applies and works. A few comments and questions about the code and documentation: Why not allow -I as a short option for --custom-initialize? I do not think that the --custom-initialize option needs to appear as a specific synopsis in the documentation, as it is now a sub-option of -i. checkCustomCmds: I would suggest to simplify test code with strchr and to merge the two fprintf into one, something like: if (strchr("tdpfv", *cmd) == NULL) { fprintf(stderr, "....\n" "....\n", ...); ... Moreover there is already an error message later if checkCustomCmds fails, I think it could be expanded and the two-line one in the function dropped entirely? It seems strange to have two level error messages for that. Help message looks strange. I suggest something regexp-like: --custom-initialize=[tdvpf]+ I would suggest to put the various init* functions in a more logical order: first create table, then load data, etc. In case 0: do not exchange unlogged_tables & foreign_keys gratuiously. After checking the initial code, I understand that the previous default was "tdvp", but you put "tdpv". I'm unsure whether vaccuum would do something to the indexes and that would make more sense. In doubt, I suggest to keep the previous default. Maybe --foreign-keys should really do "tdvpf"? I may be okay with disallowing --foreign-keys and --no-vacuum if --custom-init is used, but then there is no need to test it again in init... I think that in any case 'f' and 'v' should always trigger the corresponding initializations. On the other hand, I think that it could be more pragmatic with these options, i.e. --foreign-keys could just append 'f' to the current command if not already there, and '--no-vacuum' could remove 'v' if there, on the fly, so that nothing would be banned. This would require to always have a malloced custom_init string. It would allow to remove the "foreign_keys" global variable. "is_no_vacuum" is probably still needed for benchmarking. This way there would be no constraints and "is_custom_init" could be dropped as well. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Aug 16, 2017 at 4:55 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >> Yeah, once custom initialization patch get committed we can extend it. >> >> Attached updated patch. I've incorporated the all comments from Fabien >> to it, and changed it to single letter version. > > > Patch applies and works. > > A few comments and questions about the code and documentation: Thank you for the comments! > > Why not allow -I as a short option for --custom-initialize? Other options for similar purpose such as --foreign-keys also have only a long option. Since I think --custom-initialize option is in the same context as other options I didn't add short option to it for now. Because the options name is found by a prefix searching we can use a short name --cu for now. > I do not think that the --custom-initialize option needs to appear as a > specific synopsis in the documentation, as it is now a sub-option of -i. > > checkCustomCmds: I would suggest to simplify test code with strchr > and to merge the two fprintf into one, something like: > > if (strchr("tdpfv", *cmd) == NULL) { > fprintf(stderr, > "....\n" > "....\n", > ...); > ... > > Moreover there is already an error message later if checkCustomCmds fails, I > think > it could be expanded and the two-line one in the function dropped entirely? > It seems strange to have two level error messages for that. > > Help message looks strange. I suggest something regexp-like: > > --custom-initialize=[tdvpf]+ > > I would suggest to put the various init* functions in a more logical order: > first create table, then load data, etc. > > In case 0: do not exchange unlogged_tables & foreign_keys gratuiously. > > After checking the initial code, I understand that the previous default was > "tdvp", but you put "tdpv". I'm unsure whether vaccuum would do something to > the indexes and that would make more sense. In doubt, I suggest to keep the > previous default. > > Maybe --foreign-keys should really do "tdvpf"? > > I may be okay with disallowing --foreign-keys and --no-vacuum if > --custom-init is used, > but then there is no need to test it again in init... I think that in any > case 'f' and 'v' > should always trigger the corresponding initializations. > > On the other hand, I think that it could be more pragmatic with these > options, i.e. --foreign-keys could just append 'f' to the current command if > not already there, and '--no-vacuum' could remove 'v' if there, on the fly, > so that nothing would be banned. This would require to always have a > malloced custom_init string. It would allow to remove the "foreign_keys" > global variable. "is_no_vacuum" is probably still needed for benchmarking. > This way there would be no constraints and "is_custom_init" could be dropped > as well. I'm inclined to remove the restriction so that we can specify --foreign-keys, --no-vacuum and --custom-initialize at the same time. I think a list of char would be better here rather than a single malloced string to remove particular initialization step easily. Thought? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> Why not allow -I as a short option for --custom-initialize? > > Other options for similar purpose such as --foreign-keys also have > only a long option. Since I think --custom-initialize option is in the > same context as other options I didn't add short option to it for now. > Because the options name is found by a prefix searching we can use a > short name --cu for now. Hmmm. I like short options:-) > I'm inclined to remove the restriction so that we can specify > --foreign-keys, --no-vacuum and --custom-initialize at the same time. Ok. I favor that as well. > I think a list of char would be better here rather than a single > malloced string to remove particular initialization step easily. > Thought? My thought is not to bother with a list of char. To remove a char from a string, I suggest to allow ' ' to stand for nothing and be skipped, so that substituting a letter by space would simply remove an initialization phase. For adding, realloc & append. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Aug 22, 2017 at 5:15 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >>> Why not allow -I as a short option for --custom-initialize? >> >> >> Other options for similar purpose such as --foreign-keys also have >> only a long option. Since I think --custom-initialize option is in the >> same context as other options I didn't add short option to it for now. >> Because the options name is found by a prefix searching we can use a >> short name --cu for now. > > > Hmmm. I like short options:-) Okay, I added -I option for custom initialization :) > >> I'm inclined to remove the restriction so that we can specify >> --foreign-keys, --no-vacuum and --custom-initialize at the same time. > > > Ok. I favor that as well. If building foreign keys command is not specified in -I but --foreign-keys options is specified (e.g. pgbench -i -I tpd --foreign-keys) I think we can add 'f' to the end of the initialization commands. > >> I think a list of char would be better here rather than a single >> malloced string to remove particular initialization step easily. >> Thought? > > > My thought is not to bother with a list of char. > > To remove a char from a string, I suggest to allow ' ' to stand for nothing > and be skipped, so that substituting a letter by space would simply remove > an initialization phase. I think we should not add/remove a command of initialization commands during parsing pgbench options in order to not depend on its order. Therefore, if -I, --foreign-keys and --no-vacuum are specified at the same time, what we do is removing some 'v' commands if novacuum and adding a 'f' command if foreignkey. Also we expect that the length of initialization steps would not long. Using malloced string would less the work. Ok, I changed the patch so. Attached latest v4 patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, > Attached latest v4 patch. Please review it. Patch applies, compiles. The messages/options do not seem to work properly: sh> ./pgbench -i -I t done. Does not seem to have initialized the tables although it was requested... sh> ./pgbench -i -I d creating tables... 100000 of 100000 tuples (100%) done (elapsed 0.08 s, remaining 0.00 s) done. It seems that "d" triggered table creation... In fact it seems that the work is done correctly, but the messages are not in the right place. Also another issue: sh> ./pgbench -i --foreign-keys creating tables... 100000 of 100000 tuples (100%) done (elapsed 0.09 s, remaining 0.00 s)vacuum... set primary keys... done. Foreign keys do not seem to have been set... Please check that all really work as expected. About the documentation: If a native English speaker could review the text, it would be great. At least: "Required to invoke" -> "Require to invoke". About the code: is_no_vacuum should be a bool? I'm really hesitating about the out of order processing of options. If the user writes sh> pgbench -i --no-vacuum -I v done. Then does it make sense to ignore the last thing the user asked for? ISTM that processing options in order and keeping the last resulting spec is more natural. Appending contradictory options can happen easily when scripting, and usual what is meant is the last one. Again, as pointed out in the previous review, I do not like much checkCustomCmds implementation: switch/case, fprintf and return on error which will trigger another fprintf and error above... ISTM that you should either take into account previous comments or explain why you disagree with them, but not send the same code without addressing them in any way. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Quick precision to my previous review. > sh> ./pgbench -i -I t > done. There should be "creating tables..." > Does not seem to have initialized the tables although it was requested... > > sh> ./pgbench -i -I d > creating tables... Probably "filling tables..." would be more appropriate. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Sun, Aug 27, 2017 at 5:12 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >> Attached latest v4 patch. Please review it. Thank you for reviewing this patch! > > Patch applies, compiles. > > The messages/options do not seem to work properly: > > sh> ./pgbench -i -I t > done. Fixed this so that it ouptut "creating tables..." as you pointed out. > Does not seem to have initialized the tables although it was requested... > > sh> ./pgbench -i -I d > creating tables... > 100000 of 100000 tuples (100%) done (elapsed 0.08 s, remaining 0.00 s) > done. > > It seems that "d" triggered table creation... In fact it seems that the > work is done correctly, but the messages are not in the right place. Fixed, but I just removed "creating tables..." from -I d command. I think it's not good if we change the output messages by this patch. > Also another issue: > > sh> ./pgbench -i --foreign-keys > creating tables... > 100000 of 100000 tuples (100%) done (elapsed 0.09 s, remaining 0.00 s) > vacuum... > set primary keys... > done. > > Foreign keys do not seem to have been set... Please check that all really > work as expected. Fixed. > > About the documentation: > > If a native English speaker could review the text, it would be great. > > At least: "Required to invoke" -> "Require to invoke". Fixed. > > > About the code: > > is_no_vacuum should be a bool? We can change it but I think there is no difference actually. So keeping it would be better. > > I'm really hesitating about the out of order processing of options. If the > user writes > > sh> pgbench -i --no-vacuum -I v > done. > > Then does it make sense to ignore the last thing the user asked for? ISTM > that processing options in order and keeping the last resulting spec is more > natural. Appending contradictory options can happen easily when scripting, > and usual what is meant is the last one. Agreed. I changed it so that it processes options in order and keeps the last resulting spec. > > Again, as pointed out in the previous review, I do not like much > checkCustomCmds implementation: switch/case, fprintf and return on error > which will trigger another fprintf and error above... ISTM that you should > either take into account previous comments or explain why you disagree with > them, but not send the same code without addressing them in any way. Sorry, I didn't mean to ignore, I'd just missed the comment. Fixed it. Attached latest patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, Patch applies cleanly, compiles, works for me. >> At least: "Required to invoke" -> "Require to invoke". > > Fixed. I meant the added one about -I, not the pre-existing one about -i. >> About the code: >> >> is_no_vacuum should be a bool? > > We can change it but I think there is no difference actually. So > keeping it would be better. I would like to insist a little bit: the name was declared as an int but passed to init and used as a bool there before the patch. Conceptually what is meant is really a bool, and I see no added value bar saving a few strokes to have an int. ISTM that recent pgbench changes have started turning old int-for-bool habits into using bool when bool is meant. initialize_cmds initialization: rather use pg_strdup instead of pg_malloc/strcpy? -I: pg_free before pg_strdup to avoid a small memory leak? I'm not sure I would have bothered with sizeof(char), but why not! I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows an error message and return false, which immediatly results in exit(1). However the pattern elsewhere in pgbench is to show the error and exit immediatly. I would suggest to simplify by void-ing the function and exiting instead of returning false. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Mon, Aug 28, 2017 at 4:59 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > > Patch applies cleanly, compiles, works for me. Thank you for reviewing! > >>> At least: "Required to invoke" -> "Require to invoke". >> >> >> Fixed. > > > I meant the added one about -I, not the pre-existing one about -i. Fixed. >>> About the code: >>> >>> is_no_vacuum should be a bool? >> >> >> We can change it but I think there is no difference actually. So >> keeping it would be better. > > > I would like to insist a little bit: the name was declared as an int but > passed to init and used as a bool there before the patch. Conceptually what > is meant is really a bool, and I see no added value bar saving a few strokes > to have an int. ISTM that recent pgbench changes have started turning old > int-for-bool habits into using bool when bool is meant. Since is_no_vacuum is a existing one, if we follow the habit we should change other similar variables as well: is_init_mode, do_vacuum_accounts and debug. And I think we should change them in a separated patch. > > initialize_cmds initialization: rather use pg_strdup instead of > pg_malloc/strcpy? Fixed. > -I: pg_free before pg_strdup to avoid a small memory leak? Fixed. > I'm not sure I would have bothered with sizeof(char), but why not! > > I'm still a little bit annoyed by checkCustomCmds: when unhappy, it shows an > error message and return false, which immediatly results in exit(1). However > the pattern elsewhere in pgbench is to show the error and exit immediatly. I > would suggest to simplify by void-ing the function and exiting instead of > returning false. Agreed, fixed. After more thought, I'm bit inclined to not have a short option for --custom-initialize because this option will be used for not primary cases. It would be better to save short options for future enhancements of pgbench. Thought? Attached latest patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello, Patch applies and works. >> I would like to insist a little bit: the name was declared as an int but >> passed to init and used as a bool there before the patch. Conceptually what >> is meant is really a bool, and I see no added value bar saving a few strokes >> to have an int. ISTM that recent pgbench changes have started turning old >> int-for-bool habits into using bool when bool is meant. > > Since is_no_vacuum is a existing one, if we follow the habit we should > change other similar variables as well: is_init_mode, do_vacuum_accounts > and debug. And I think we should change them in a separated patch. Hmmm. The existing "is_no_vacuum" variable is typed *both* as int (in "main") and as bool (in "init"), called by main (yuk!). I see no reason to choose the bad one for the global:-) > After more thought, I'm bit inclined to not have a short option for > --custom-initialize because this option will be used for not primary > cases. It would be better to save short options for future > enhancements of pgbench. Thought? I like it as is, especially as now the associated value is a simple and short string, I think that it makes sense to have a simple and short option to trigger it. Moreover -I stands cleanly for "initialization", and the capital stands for something a little special which it is. Its to good to miss. I think that the "-I" it should be added to the "--help" line, as it is done with other short & long options. Repeating "-I f" results in multiple foreign key constraints: Foreign-key constraints: "pgbench_tellers_bid_fkey" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) "pgbench_tellers_bid_fkey1"FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) "pgbench_tellers_bid_fkey2" FOREIGN KEY(bid) REFERENCES pgbench_branches(bid) "pgbench_tellers_bid_fkey3" FOREIGN KEY (bid) REFERENCES pgbench_branches(bid) I wonder if this could be avoided easily? Maybe by setting the constraint name explicitely so that the second one fails on the existing one, which is fine, like for primary keys? Or adding a DROP CONSTRAINT IF EXISTS before the CREATE CONSTRAINT, like for tables? Or doing nothing about it? I would prefer the first option. Maybe the initial cleanup (DROP TABLE) could be made an option added to the default, so that cleaning up the database could be achieved with some "pgbench -i -I c", instead of connecting and droping the tables one by one which I have done quite a few times... What do you think? Before it is definitely engraved, I'm thinking about the letters: c - cleanup t - create table d - data p - primary key f - foreign key v - vacuum I think it is mostly okay, but it is the last time to think about it. Using "d" for cleanup (drop) would mean finding another letter for filling in data... maybe "g" for data generation? "c" may have been chosen for "create table", but then would not be available for "cleanup". Thoughts? -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Aug 29, 2017 at 4:47 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello, > > Patch applies and works. > >>> I would like to insist a little bit: the name was declared as an int but >>> passed to init and used as a bool there before the patch. Conceptually >>> what >>> is meant is really a bool, and I see no added value bar saving a few >>> strokes >>> to have an int. ISTM that recent pgbench changes have started turning old >>> int-for-bool habits into using bool when bool is meant. >> >> >> Since is_no_vacuum is a existing one, if we follow the habit we should >> change other similar variables as well: is_init_mode, do_vacuum_accounts and >> debug. And I think we should change them in a separated patch. > > > Hmmm. The existing "is_no_vacuum" variable is typed *both* as int (in > "main") and as bool (in "init"), called by main (yuk!). I see no reason to > choose the bad one for the global:-) > Yeah, I think this might be a good timing to re-consider int-for-bool habits in pgbench. If we decided to change is_no_vacuum to bool I want to change other similar variables as well. >> After more thought, I'm bit inclined to not have a short option for >> --custom-initialize because this option will be used for not primary >> cases. It would be better to save short options for future >> enhancements of pgbench. Thought? > > > I like it as is, especially as now the associated value is a simple and > short string, I think that it makes sense to have a simple and short option > to trigger it. Moreover -I stands cleanly for "initialization", and the > capital stands for something a little special which it is. Its to good to > miss. > > I think that the "-I" it should be added to the "--help" line, as it is done > with other short & long options. Okay, I'll leave it as of now. Maybe we can discuss later. > > Repeating "-I f" results in multiple foreign key constraints: > > Foreign-key constraints: > "pgbench_tellers_bid_fkey" FOREIGN KEY (bid) REFERENCES > pgbench_branches(bid) > "pgbench_tellers_bid_fkey1" FOREIGN KEY (bid) REFERENCES > pgbench_branches(bid) > "pgbench_tellers_bid_fkey2" FOREIGN KEY (bid) REFERENCES > pgbench_branches(bid) > "pgbench_tellers_bid_fkey3" FOREIGN KEY (bid) REFERENCES > pgbench_branches(bid) > > I wonder if this could be avoided easily? Maybe by setting the constraint > name explicitely so that the second one fails on the existing one, which is > fine, like for primary keys? Or adding a DROP CONSTRAINT IF EXISTS before > the CREATE CONSTRAINT, like for tables? Or doing nothing about it? I would > prefer the first option. Good point, I agree with first option. > > Maybe the initial cleanup (DROP TABLE) could be made an option added to the > default, so that cleaning up the database could be achieved with some > "pgbench -i -I c", instead of connecting and droping the tables one by one > which I have done quite a few times... What do you think? Yeah, I sometimes wanted that. Having the cleaning up tables option would be good idea. > Before it is definitely engraved, I'm thinking about the letters: > > c - cleanup > t - create table > d - data > p - primary key > f - foreign key > v - vacuum > > I think it is mostly okay, but it is the last time to think about it. Using > "d" for cleanup (drop) would mean finding another letter for filling in > data... maybe "g" for data generation? "c" may have been chosen for "create > table", but then would not be available for "cleanup". Thoughts? I'd say "g" for data generation would be better. Also, I'm inclined to add a command for the unlogged tables. How about this? c - cleanup t - create table u - create unlogged table g - data generation p - primary key f - foreign key v - vacuum Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello, >> Hmmm. The existing "is_no_vacuum" variable is typed *both* as int (in >> "main") and as bool (in "init"), called by main (yuk!). I see no reason to >> choose the bad one for the global:-) > > Yeah, I think this might be a good timing to re-consider int-for-bool > habits in pgbench. If we decided to change is_no_vacuum to bool I want > to change other similar variables as well. Franckly I would be fine with that, but committers might get touchy about "unrelated changes" in the patch... The "is_no_vacuum" is related to the patch and is already a bool -- if you chose the "init" definition as a reference -- so it is okay to bool it. >> I think that the "-I" it should be added to the "--help" line, as it is done >> with other short & long options. > > Okay, I'll leave it as of now. Maybe we can discuss later. Maybe we did not understand one another. I'm just suggesting to insert -I in the help line, that is change: " --custom-initialize=[...]+\n" to " -I, --custom-initialize=[...]+\n" I'm not sure it deserves to be discussed in depth later:-) >> I wonder if this could be avoided easily? Maybe by setting the constraint >> name explicitely so that the second one fails on the existing one, which is >> fine, like for primary keys? [...] > > Good point, I agree with first option. Ok. >> Maybe the initial cleanup (DROP TABLE) could be made an option added to the >> default, so that cleaning up the database could be achieved with some >> "pgbench -i -I c", instead of connecting and droping the tables one by one >> which I have done quite a few times... What do you think? > > Yeah, I sometimes wanted that. Having the cleaning up tables option > would be good idea. Ok. > I'd say "g" for data generation would be better. Also, I'm inclined to > add a command for the unlogged tables. How about this? > > c - [c]leanup / or [d]rop tables > t - create table / [t]able creation or [c]reate table > u - create unlogged table > g - data generation / [g]enerate data > p - [p]rimary key > f - [f]oreign key > v - [v]acuum I'm okay with that. I also put an alternative with d/c above, without any preference from my part. I'm not sure about "u", though. Unlogged, like tablespace, is an orthogonal option: other table creation options (I intend to submit one which conforms to the TPC-B standard, that is use an INT8 balance as INT4 is not wide enough per spec, and always use an INT8 aid) may be also unlogged or tablespaced. So that would mean having two ways to trigger them... thus I would avoid it and keep only --unlogged. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Aug 30, 2017 at 3:39 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello, > >>> Hmmm. The existing "is_no_vacuum" variable is typed *both* as int (in >>> "main") and as bool (in "init"), called by main (yuk!). I see no reason >>> to >>> choose the bad one for the global:-) >> >> >> Yeah, I think this might be a good timing to re-consider int-for-bool >> habits in pgbench. If we decided to change is_no_vacuum to bool I want >> to change other similar variables as well. > > > Franckly I would be fine with that, but committers might get touchy about > "unrelated changes" in the patch... The "is_no_vacuum" is related to the > patch and is already a bool -- if you chose the "init" definition as a > reference -- so it is okay to bool it. Okay, I changed only is_no_vacuum in this patch and other similar variables would be changed in another patch. > >>> I think that the "-I" it should be added to the "--help" line, as it is >>> done >>> with other short & long options. >> >> >> Okay, I'll leave it as of now. Maybe we can discuss later. > > > Maybe we did not understand one another. I'm just suggesting to insert > -I in the help line, that is change: > > " --custom-initialize=[...]+\n" > > to > > " -I, --custom-initialize=[...]+\n" Fixed. > I'm not sure it deserves to be discussed in depth later:-) Sorry, I meant about having short option --custom-initialize. > >>> I wonder if this could be avoided easily? Maybe by setting the constraint >>> name explicitely so that the second one fails on the existing one, which >>> is >>> fine, like for primary keys? [...] >> >> >> Good point, I agree with first option. > > > Ok. > >>> Maybe the initial cleanup (DROP TABLE) could be made an option added to >>> the >>> default, so that cleaning up the database could be achieved with some >>> "pgbench -i -I c", instead of connecting and droping the tables one by >>> one >>> which I have done quite a few times... What do you think? >> >> >> Yeah, I sometimes wanted that. Having the cleaning up tables option >> would be good idea. > > > Ok. > >> I'd say "g" for data generation would be better. Also, I'm inclined to >> add a command for the unlogged tables. How about this? >> >> c - [c]leanup / or [d]rop tables >> t - create table / [t]able creation or [c]reate table >> u - create unlogged table >> g - data generation / [g]enerate data >> p - [p]rimary key >> f - [f]oreign key >> v - [v]acuum > > > I'm okay with that. I also put an alternative with d/c above, without any > preference from my part. Personally I prefer "t" for table creation because "c" for create is a generic word. We might want to have another initialization command that creates something. > > I'm not sure about "u", though. Unlogged, like tablespace, is an orthogonal > option: other table creation options (I intend to submit one which conforms > to the TPC-B standard, that is use an INT8 balance as INT4 is not wide > enough per spec, and always use an INT8 aid) may be also unlogged or > tablespaced. So that would mean having two ways to trigger them... thus I > would avoid it and keep only --unlogged. > Yeah, I think I had misunderstood it. -I option is for specifying some particular initialization steps. So we don't need to have a command as a option for other initializatoin commands. Attached latest patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, > [...] Personally I prefer "t" for table creation because "c" for create > is a generic word. We might want to have another initialization command > that creates something. Ok, good point. About the patch: applies, compiles, works for me. A few minor comments. While re-reading the documentation, I think that it should be "Set custom initialization steps". It could be "Require ..." when -I implied -i, but since -i is still required the sentence does not seem to apply as such. "Destroying any existing tables: ..." -> "Destroy existing pgbench tables: ...". I would suggest to add short expanded explanations in the term definition, next to the triggering letter, to underline the mnemonic. Something like: c (cleanup) t (table creation) g (generate data) v (vacuum) p (primary key) f (foreign key) Also update the error message in checkCustomCommands to "ctgvpf". Cleanup should have a message when it is executed. I suggest "cleaning up...". Maybe add a comment in front of the array tables to say that the order is important, something like "tables in reverse foreign key dependencies order"? case 'I': ISTM that initialize_cmds is necessarily already allocated, thus I would not bother to test before pg_free. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Thu, Aug 31, 2017 at 4:35 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >> [...] Personally I prefer "t" for table creation because "c" for create is >> a generic word. We might want to have another initialization command that >> creates something. > > > Ok, good point. > > > About the patch: applies, compiles, works for me. A few minor comments. Thank you for dedicated reviewing this patch! > While re-reading the documentation, I think that it should be "Set custom > initialization steps". It could be "Require ..." when -I implied -i, but > since -i is still required the sentence does not seem to apply as such. > > "Destroying any existing tables: ..." -> "Destroy existing pgbench tables: > ...". Fixed. > I would suggest to add short expanded explanations in the term definition, > next to the triggering letter, to underline the mnemonic. Something like: > > c (cleanup) > t (table creation) > g (generate data) > v (vacuum) > p (primary key) > f (foreign key) Nice idea, agreed. > Also update the error message in checkCustomCommands to "ctgvpf". Fixed. > Cleanup should have a message when it is executed. I suggest "cleaning > up...". Fixed. > Maybe add a comment in front of the array tables to say that the order is > important, something like "tables in reverse foreign key dependencies > order"? Fixed. > > case 'I': ISTM that initialize_cmds is necessarily already allocated, thus I > would not bother to test before pg_free. Agreed, fixed. Attached latest patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, Patch applies and compiles. One bug found, and some minor points again. Sorry for this hopefully last iteration... I'm kind of an iterative person... I've generated the doc to look a it. Short option "-I" does not use a "=", it should be "-I custom_init_commands". Also maybe it would look nicer and clearer if the short mnemonic was outside the literal, that is with: <term><literal>c</> (cleanup)</> instead of: <term><literal>c (cleanup)</></> But this is debatable. Do it the way you think is best. Command "g" does not work after "f", something I had not tested before: ./pgbench -i -I ctvpfg cleaning up... creating tables... vacuum... set primary keys... set foreign keys... ERROR: cannottruncate a table referenced in a foreign key constraint DETAIL: Table "pgbench_history" references "pgbench_accounts".HINT: Truncate table "pgbench_history" at the same time, or use TRUNCATE ... CASCADE. I think it should work. It probably just mean to TRUNCATE all tables as one command, or add the suggested CASCADE. I would favor the first option. I'm wondering whether this truncation should be yet another available command? Hmmm... maybe not. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Fri, Sep 1, 2017 at 4:42 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > > Patch applies and compiles. > > One bug found, and some minor points again. Sorry for this hopefully last > iteration... I'm kind of an iterative person... > > I've generated the doc to look a it. > > Short option "-I" does not use a "=", it should be "-I > custom_init_commands". > > Also maybe it would look nicer and clearer if the short mnemonic was outside > the literal, that is with: > > <term><literal>c</> (cleanup)</> > > instead of: > > <term><literal>c (cleanup)</></> > > But this is debatable. Do it the way you think is best. > > Command "g" does not work after "f", something I had not tested before: > > ./pgbench -i -I ctvpfg > cleaning up... > creating tables... > vacuum... > set primary keys... > set foreign keys... > ERROR: cannot truncate a table referenced in a foreign key constraint > DETAIL: Table "pgbench_history" references "pgbench_accounts". > HINT: Truncate table "pgbench_history" at the same time, or use TRUNCATE > ... CASCADE. > > I think it should work. It probably just mean to TRUNCATE all tables as one > command, or add the suggested CASCADE. I would favor the first option. > > I'm wondering whether this truncation should be yet another available > command? Hmmm... maybe not. Currently TRUNCATE pgbench_accounts command is executed within a transaction started immediately before it. If we move it out of the transaction, the table data will be truncated even if the copying data failed. Maybe we can do TRUNCATE pgbench_accounts, pgbench_history instead. Thought? Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> I'm wondering whether this truncation should be yet another available >> command? Hmmm... maybe not. > > Currently TRUNCATE pgbench_accounts command is executed within a > transaction started immediately before it. If we move it out of the > transaction, the table data will be truncated even if the copying data > failed. Maybe we can do TRUNCATE pgbench_accounts, pgbench_history > instead. Thought? Keep the truncate in the transaction, and truncate both (or all?) tables together. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Fri, Sep 1, 2017 at 11:29 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >>> I'm wondering whether this truncation should be yet another available >>> command? Hmmm... maybe not. >> >> >> Currently TRUNCATE pgbench_accounts command is executed within a >> transaction started immediately before it. If we move it out of the >> transaction, the table data will be truncated even if the copying data >> failed. Maybe we can do TRUNCATE pgbench_accounts, pgbench_history >> instead. Thought? > > > Keep the truncate in the transaction, and truncate both (or all?) tables > together. Attached latest patch incorporated the comments I got so far. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, >>> Currently TRUNCATE pgbench_accounts command is executed within a >>> transaction started immediately before it. If we move it out of the >>> transaction, the table data will be truncated even if the copying data >>> failed. Maybe we can do TRUNCATE pgbench_accounts, pgbench_history >>> instead. Thought? >> >> >> Keep the truncate in the transaction, and truncate both (or all?) tables >> together. > > Attached latest patch incorporated the comments I got so far. Please review it. "g" does not work for me yet when after "f", only the message is slightly different, it chokes on another dependency, branches instead of accounts. sh> ./pgbench -i -I ctpfg cleaning up... creating tables... set primary keys... set foreign keys... ERROR: cannottruncate a table referenced in a foreign key constraint DETAIL: Table "pgbench_history" references "pgbench_branches". HINT: Truncate table "pgbench_history" at the same time, or use TRUNCATE ... CASCADE. I think that the whole data generation should be in *one* transaction which starts with "truncate pgbench_history, pgbench_branches, pgbench_tellers, pgbench_accounts;" In passing, I think that the documentation should tell explicitely what the default value is (aka "ctgvp"). -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Sep 5, 2017 at 2:33 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >>>> Currently TRUNCATE pgbench_accounts command is executed within a >>>> transaction started immediately before it. If we move it out of the >>>> transaction, the table data will be truncated even if the copying data >>>> failed. Maybe we can do TRUNCATE pgbench_accounts, pgbench_history >>>> instead. Thought? >>> >>> >>> >>> Keep the truncate in the transaction, and truncate both (or all?) tables >>> together. >> >> >> Attached latest patch incorporated the comments I got so far. Please >> review it. > > > "g" does not work for me yet when after "f", only the message is slightly > different, it chokes on another dependency, branches instead of accounts. > > sh> ./pgbench -i -I ctpfg > cleaning up... > creating tables... > set primary keys... > set foreign keys... > ERROR: cannot truncate a table referenced in a foreign key constraint > DETAIL: Table "pgbench_history" references "pgbench_branches". > HINT: Truncate table "pgbench_history" at the same time, or use TRUNCATE > ... CASCADE. Sorry, I'd missed something. > I think that the whole data generation should be in *one* transaction which > starts with "truncate pgbench_history, pgbench_branches, pgbench_tellers, > pgbench_accounts;" Agreed, and fixed. > > In passing, I think that the documentation should tell explicitely what the > default value is (aka "ctgvp"). Agreed. Attached the latest patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
> Attached the latest patch. Please review it. Patch applies and compiles cleanly. Three very minor points: Typo added in the documentation: ".Each" -> ". Each". In "case 8:" there is a very long line which lacks a newline before pg_realloc second argument. I think that the check should silently accept spaces as they are ignored by init later, i.e.: sh> ./pgbench -i -I "ctg vpf" invalid custom initialization script command " " possible commands are: "c", "t", "g","v", "p", "f" Could just work... -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Sep 5, 2017 at 4:06 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >> Attached the latest patch. Please review it. > > > Patch applies and compiles cleanly. > > Three very minor points: > > Typo added in the documentation: ".Each" -> ". Each". > > In "case 8:" there is a very long line which lacks a newline before > pg_realloc second argument. Sorry, I don't follow that. You meant I should add a newline before pg_realloc()? That is, + initialize_cmds = + (char *) pg_realloc(initialize_cmds, + sizeof(char) * n_cmds + 1); Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
> Sorry, I don't follow that. You meant I should add a newline before > pg_realloc()? That is, > > + initialize_cmds = > + (char *) pg_realloc(initialize_cmds, > + sizeof(char) * n_cmds + 1); Yes. Or maybe my terminal was doing tricks, because I had the impression that both argument where on the same line with many tabs in between, but maybe I just misinterpreted the diff file. My apology if it is the case. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Sep 6, 2017 at 12:11 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >> Sorry, I don't follow that. You meant I should add a newline before >> pg_realloc()? That is, >> >> + initialize_cmds = >> + (char *) pg_realloc(initialize_cmds, >> + sizeof(char) * n_cmds + >> 1); > > > Yes. Or maybe my terminal was doing tricks, because I had the impression > that both argument where on the same line with many tabs in between, but > maybe I just misinterpreted the diff file. My apology if it is the case. > I understood. It looks ugly in the patch but can be applied properly by git apply command. Attached latest patch incorporated the comments I got so far. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Applies, compiles, works for me. Very very minor comments that I should have noticed before, sorry for this additional round trip. In the help line, move -I just after -i, to put short options in alphabetical and decreasing importance order. On this line, also add the information about the default, something like: -i, --ini... .... -I, --custom....=[...]+ (default "ctgvp") ... When/if the pgbench tap test patch get through, the feature should be tested there as well. No action needed now. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Sep 6, 2017 at 4:01 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Applies, compiles, works for me. > > Very very minor comments that I should have noticed before, sorry for this > additional round trip. Thank you for the dedicated review! > > In the help line, move -I just after -i, to put short options in > alphabetical and decreasing importance order. On this line, also add the > information about the default, something like: > > -i, --ini... .... > -I, --custom....=[...]+ (default "ctgvp") > ... Agreed. Attached latest patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> Very very minor comments that I should have noticed before, sorry for this >> additional round trip. > > Thank you for the dedicated review! I'm someone at times pigheaded, I think in the good sense if it is possible, and I like to finish what I start:-) Patch applies, compiles, works, everything is fine from my point of view. I switched it to "Ready for Committer". Again, if the pgbench tap test patch get through, it should be tap tested. -- Fabien.
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Thu, Sep 7, 2017 at 4:15 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > >>> Very very minor comments that I should have noticed before, sorry for >>> this >>> additional round trip. >> >> >> Thank you for the dedicated review! > > > I'm someone at times pigheaded, I think in the good sense if it is possible, > and I like to finish what I start:-) > > Patch applies, compiles, works, everything is fine from my point of view. > > I switched it to "Ready for Committer". Thanks. > > Again, if the pgbench tap test patch get through, it should be tap tested. > Thank you for the remainder, I'll add tap tests once the patch got committed. -- Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Fri, Sep 8, 2017 at 9:52 AM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Thu, Sep 7, 2017 at 4:15 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: >> >>>> Very very minor comments that I should have noticed before, sorry for >>>> this >>>> additional round trip. >>> >>> >>> Thank you for the dedicated review! >> >> >> I'm someone at times pigheaded, I think in the good sense if it is possible, >> and I like to finish what I start:-) >> >> Patch applies, compiles, works, everything is fine from my point of view. >> >> I switched it to "Ready for Committer". > > Thanks. > >> >> Again, if the pgbench tap test patch get through, it should be tap tested. >> > > Thank you for the remainder, I'll add tap tests once the patch got committed. > Attached the latest version patch incorporated the tap tests. Please review it. -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, > Attached the latest version patch incorporated the tap tests. > Please review it. Patch applies, compilation & make check ok. Tests are simple and provide good coverage of new functionalities. I would suggest to add '--unlogged-tables' so speedup the tests a little. Comment: "# Custom initialization option, including a space"... ISTM that there is no space. Space is tested in the next test because of the v's and the --no-vacuum which turned them into space, which is enough. Regex are just check for the whole output, so putting twice "qr{vacuum}" does not check that vacuum appears twice, it checks twice that vacuum appears once. I do not think that it is worth trying to check for the v repetition, so I suggest to remove one from the first test. Repetition of ' ' is checked with the second test. Maybe you could check that the data generation message is there. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Sep 19, 2017 at 12:41 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >> Attached the latest version patch incorporated the tap tests. >> Please review it. > > > Patch applies, compilation & make check ok. > > Tests are simple and provide good coverage of new functionalities. > > I would suggest to add '--unlogged-tables' so speedup the tests a little. > Good idea, added. > Comment: "# Custom initialization option, including a space"... ISTM that > there is no space. Space is tested in the next test because of the v's and > the --no-vacuum which turned them into space, which is enough. You're right, I removed it. > Regex are just check for the whole output, so putting twice "qr{vacuum}" > does not check that vacuum appears twice, it checks twice that vacuum > appears once. I do not think that it is worth trying to check for the v > repetition, so I suggest to remove one from the first test. Repetition of ' > ' is checked with the second test. Agreed. > Maybe you could check that the data generation message is there. Added the check. Attached the latest patch. Please review it. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, v14 applies, compiles and works. TAP tests provide good coverage. ISTM that you probably intended "\(.*\)" (actual parenthesis) instead of "(.*)" (memorization) in the data generation message check. Otherwise all is well for me. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Sep 20, 2017 at 3:26 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > > v14 applies, compiles and works. TAP tests provide good coverage. > > ISTM that you probably intended "\(.*\)" (actual parenthesis) instead of > "(.*)" (memorization) in the data generation message check. > Thank you, fixed it. > Otherwise all is well for me. > Attached the updated version patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, >> ISTM that you probably intended "\(.*\)" (actual parenthesis) instead of >> "(.*)" (memorization) in the data generation message check. > > Thank you, fixed it. > >> Otherwise all is well for me. >> > > Attached the updated version patch. Applies, compiles, make check & tap test ok, doc is fine. All is well for me: the feature is useful, code is simple and clean, it is easy to invoke, easy to extend as well, which I'm planning to do once it gets in. I switched the patch to "Ready for Committers". No doubt they will have their own opinions about it. Let's wait and see. Thanks, -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Thu, Sep 21, 2017 at 5:23 PM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >>> ISTM that you probably intended "\(.*\)" (actual parenthesis) instead of >>> "(.*)" (memorization) in the data generation message check. >> >> >> Thank you, fixed it. >> >>> Otherwise all is well for me. >>> >> >> Attached the updated version patch. > > > Applies, compiles, make check & tap test ok, doc is fine. > > All is well for me: the feature is useful, code is simple and clean, it is > easy to invoke, easy to extend as well, which I'm planning to do once it > gets in. > > I switched the patch to "Ready for Committers". No doubt they will have > their own opinions about it. Let's wait and see. Thank you for the reviewing this patch!! Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Masahiko-san, >> Attached the updated version patch. > > Applies, compiles, make check & tap test ok, doc is fine. > > All is well for me: the feature is useful, code is simple and clean, it is > easy to invoke, easy to extend as well, which I'm planning to do once it gets > in. > > I switched the patch to "Ready for Committers". No doubt they will have their > own opinions about it. Let's wait and see. The patch needs a rebase in the documentation because of the xml-ilization of the sgml doc. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Wed, Oct 18, 2017 at 5:32 AM, Fabien COELHO <coelho@cri.ensmp.fr> wrote: > > Hello Masahiko-san, > >>> Attached the updated version patch. >> >> >> Applies, compiles, make check & tap test ok, doc is fine. >> >> All is well for me: the feature is useful, code is simple and clean, it is >> easy to invoke, easy to extend as well, which I'm planning to do once it >> gets in. >> >> I switched the patch to "Ready for Committers". No doubt they will have >> their own opinions about it. Let's wait and see. > > > The patch needs a rebase in the documentation because of the xml-ilization > of the sgml doc. > Thank you for the notification! Attached rebased patch. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> The patch needs a rebase in the documentation because of the xml-ilization >> of the sgml doc. >> > > Thank you for the notification! Attached rebased patch. Ok. The new version works for me. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Masahiko Sawada <sawada.mshk@gmail.com> writes: > [ pgbench_custom_initialization_v16.patch ] I'm starting to review this patch, and I wonder how it is that you ended up with "c" as the command letter for dropping existing tables. Seems like "d" for DROP would be much less confusing. I see that at one point "d" meant the data load step, but since you've gone with "g" for "generate data" that conflict is gone. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
Hello Tom, > Masahiko Sawada <sawada.mshk@gmail.com> writes: >> [ pgbench_custom_initialization_v16.patch ] > > I'm starting to review this patch, and I wonder how it is that you > ended up with "c" as the command letter for dropping existing tables. > Seems like "d" for DROP would be much less confusing. I see that at > one point "d" meant the data load step, but since you've gone with > "g" for "generate data" that conflict is gone. Indeed, you are right. As a reviewer, I can recall that there were some hesitations, not sure we ended up with the best possible choice. Note that if "c" is freed by "d" (drop), then it may be worth considering that "t" (table) could be replaced by "c" (create). I'm fine with anything consistent and easy to memorize, really. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: >> Masahiko Sawada <sawada.mshk@gmail.com> writes: >>> [ pgbench_custom_initialization_v16.patch ] >> I'm starting to review this patch, and I wonder how it is that you >> ended up with "c" as the command letter for dropping existing tables. >> Seems like "d" for DROP would be much less confusing. I see that at >> one point "d" meant the data load step, but since you've gone with >> "g" for "generate data" that conflict is gone. > Indeed, you are right. As a reviewer, I can recall that there were some > hesitations, not sure we ended up with the best possible choice. OK, will make the appropriate changes. > Note that if "c" is freed by "d" (drop), then it may be worth considering > that "t" (table) could be replaced by "c" (create). I thought about that, but the argument that 'c' might mean different sorts of create steps (e.g. create index) seemed reasonable. I think we're best off leaving it as 't' in case of future expansion. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys afterinitialization
From
Fabien COELHO
Date:
>> Note that if "c" is freed by "d" (drop), then it may be worth considering >> that "t" (table) could be replaced by "c" (create). > > I thought about that, but the argument that 'c' might mean different > sorts of create steps (e.g. create index) seemed reasonable. I think > we're best off leaving it as 't' in case of future expansion. Ok. Fine with me. -- Fabien.
Fabien COELHO <coelho@cri.ensmp.fr> writes: >>> Note that if "c" is freed by "d" (drop), then it may be worth considering >>> that "t" (table) could be replaced by "c" (create). >> I thought about that, but the argument that 'c' might mean different >> sorts of create steps (e.g. create index) seemed reasonable. I think >> we're best off leaving it as 't' in case of future expansion. > Ok. Fine with me. Committed that way, with some additional mostly-cosmetic changes: * I did not much like the "--custom-initialize" option name, as that conveys no information; arguably any option makes the initialization "custom" in some way. Moreover it invites confusion with the use of "custom" elsewhere in pgbench's docs for custom benchmarking scripts. After some thought I chose "--init-steps", though if anyone has a better idea I'm happy to listen. I would have gone with just "--steps", except that the -s and -S short forms are both taken already. (Hm, maybe we should use --steps and not have a short form?) * I also tried to make the code use the "step" terminology consistently internally, since it's already using "command" to refer to commands in benchmarking scripts. * I fixed it so that conflicts between the step-selecting options are resolved after we've scanned all the options. The submitted coding would do the wrong thing with, eg, --foreign-keys before -I. * I changed the "drop" step to just drop all four tables in one command; that way avoids having to make any assumption about what foreign keys exist. (I suppose that constraints leading in different directions aren't all that likely, but if we're trying to cater to not-quite-default configurations, we might as well do this.) * Minor other cosmetic cleanup. regards, tom lane
Re: [HACKERS] pgbench: Skipping the creating primary keys after initialization
From
Masahiko Sawada
Date:
On Tue, Nov 14, 2017 at 6:50 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Fabien COELHO <coelho@cri.ensmp.fr> writes: >>>> Note that if "c" is freed by "d" (drop), then it may be worth considering >>>> that "t" (table) could be replaced by "c" (create). > >>> I thought about that, but the argument that 'c' might mean different >>> sorts of create steps (e.g. create index) seemed reasonable. I think >>> we're best off leaving it as 't' in case of future expansion. > >> Ok. Fine with me. > > Committed that way, with some additional mostly-cosmetic changes: > > * I did not much like the "--custom-initialize" option name, as that > conveys no information; arguably any option makes the initialization > "custom" in some way. Moreover it invites confusion with the use of > "custom" elsewhere in pgbench's docs for custom benchmarking scripts. > After some thought I chose "--init-steps", though if anyone has a better > idea I'm happy to listen. I would have gone with just "--steps", except > that the -s and -S short forms are both taken already. (Hm, maybe we > should use --steps and not have a short form?) > > * I also tried to make the code use the "step" terminology consistently > internally, since it's already using "command" to refer to commands in > benchmarking scripts. > > * I fixed it so that conflicts between the step-selecting options > are resolved after we've scanned all the options. The submitted > coding would do the wrong thing with, eg, --foreign-keys before -I. > > * I changed the "drop" step to just drop all four tables in one > command; that way avoids having to make any assumption about what > foreign keys exist. (I suppose that constraints leading in different > directions aren't all that likely, but if we're trying to cater to > not-quite-default configurations, we might as well do this.) > > * Minor other cosmetic cleanup. > Thank you for reviewing and committing the patch. I couldn't respond promptly because slept but I agree with the changes. Regards, -- Masahiko Sawada NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center