Thread: Refactoring pgbench.c
Main pgbench logic consists of single file pgbench.c which is 4036 lines of code as of today. This is not a small number and I think it would be nice if it is divided into smaller files because it will make it easier to maintain, add or change features of pgbench. I will come up with an idea how to split pgbench.c later. In the mean time I attached a call graph of pgbench.c generated by egypt, which we could get a basic idea how to split and modularize pgbench.c. BTW, while looking at pgbench.c I noticed subtle coding problems: 1) strtoint64() should be decalred as static 2) static void agg_vals_init(AggVals *aggs, instr_time start) "static" and "void" should be one line, rather than separate 2 lines according to our coding style. I will commit fix for these if there's no objection. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Hello Tatsuo, > Main pgbench logic consists of single file pgbench.c which is 4036 lines > of code as of today. This is not a small number and I think it would be > nice if it is divided into smaller files because it will make it easier > to maintain, add or change features of pgbench. I do not think that this large file is a so big a problem (good editors help navigation in the code), and I'm not sure that splitting it would achieve much: there are not that many functions, some of them are maybe long (main, threadRun, doCustom) but mostly for good reasons. I've submitted a patch to remove "fork-emulation", which I think would really help simplify the code (maybe -10% source in "pgbench.c", less #ifs, avoid double implementations or more-complex-than-necessary implementations or not-implemented features). -- Fabien.
Hi, On 06/28/2015 08:10 AM, Fabien COELHO wrote: > > Hello Tatsuo, > >> Main pgbench logic consists of single file pgbench.c which is 4036 >> lines of code as of today. This is not a small number and I think >> it would be nice if it is divided into smaller files because it >> will make it easier to maintain, add or change features of pgbench. > > I do not think that this large file is a so big a problem (good > editors help navigation in the code), and I'm not sure that splitting > it would achieve much: there are not that many functions, some of > them are maybe long (main, threadRun, doCustom) but mostly for good > reasons. My thoughts, exactly. I don't think just splitting the file into multiple pieces will achieve anything - the problem is that we've extended the original pgbench code in rather hackish way, without any major design changes, so IMHO what should be done is refactoring ... > I've submitted a patch to remove "fork-emulation", which I think > would really help simplify the code (maybe -10% source in > "pgbench.c", less #ifs, avoid double implementations or > more-complex-than-necessary implementations or not-implemented > features). ... and cleanup of dead code. -- Tomas Vondra http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> I do not think that this large file is a so big a problem (good >> editors help navigation in the code), and I'm not sure that splitting >> it would achieve much: there are not that many functions, some of >> them are maybe long (main, threadRun, doCustom) but mostly for good >> reasons. > > My thoughts, exactly. I don't think just splitting the file into > multiple pieces will achieve anything - the problem is that we've > extended the original pgbench code in rather hackish way, without any > major design changes, so IMHO what should be done is refactoring ... This is kind of surprising to me that two people are against refactoring proposal (I understand that Fabien has pending patches for pgbench and that could be a motivation for this though). Splitting single large file into smaller files in which functions contain strong relation will be itself a benefit since there will be more chance for people to be needed to look into smaller places than before. This is just a basic idea of modular programming and will be a long term benefit of PostgreSQL project, which was my thought. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
On 06/28/15 18:56, Tatsuo Ishii wrote: >>> I do not think that this large file is a so big a problem (good >>> editors help navigation in the code), and I'm not sure that splitting >>> it would achieve much: there are not that many functions, some of >>> them are maybe long (main, threadRun, doCustom) but mostly for good >>> reasons. >> >> My thoughts, exactly. I don't think just splitting the file into >> multiple pieces will achieve anything - the problem is that we've >> extended the original pgbench code in rather hackish way, without any >> major design changes, so IMHO what should be done is refactoring ... > > This is kind of surprising to me that two people are against > refactoring proposal (I understand that Fabien has pending patches > for pgbench and that could be a motivation for this though). I think that's a misunderstanding. I'm not against refactoring - not at all, and neither is Fabien I believe. However the first paragraph of your post seems to suggest that we get better code by merely splitting the a large file into several smaller ones. I don't think that alone helps - I think we first need to refactor the code first, and then possibly split the file into several modules. > Splitting single large file into smaller files in which functions > contain strong relation will be itself a benefit since there will be > more chance for people to be needed to look into smaller places than > before. This is just a basic idea of modular programming and will be > a long term benefit of PostgreSQL project, which was my thought. I respectfully disagree. I think that for this to be true, the files need to be somehow logically related (which I'd expect to be the output of the refactoring). But maybe you have some particular refactoring and split in mind? best regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> This is kind of surprising to me that two people are against >> refactoring proposal (I understand that Fabien has pending patches for >> pgbench and that could be a motivation for this though). > > I think that's a misunderstanding. I'm not against refactoring - not at all, > and neither is Fabien I believe. Indeed I'm not, I was writing about splitting "pgbench.c" into several files. Although, as Tatsuo-san noted, I'm probably against a refactoring before the patches I already submitted get considered:-) Really refactoring would mean breaking apart the long functions, but these are long for some reasons (say global goto to handle errors...), so how to do something very useful on that line is not clear to me. -- Fabien.
On Sat, Jun 27, 2015 at 5:10 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
Main pgbench logic consists of single file pgbench.c which is 4036
lines of code as of today. This is not a small number and I think it
would be nice if it is divided into smaller files because it will make
it easier to maintain, add or change features of pgbench. I will come
up with an idea how to split pgbench.c later. In the mean time I
attached a call graph of pgbench.c generated by egypt, which we could
get a basic idea how to split and modularize pgbench.c.
I've never found the raw line count of a file to be a problem, at least not at the 4000 line mark.
If some functions could be moved to existing libraries (like the functions strtoint64 or doConnect or read_line_from_file or the statistical distribution functions) that would be great, but I assume that would have been done already if there was already a logical place to hold them. I don't think inventing new libraries just to hold these would be an improvement.
If you provided a suggested dissection, maybe I would find it compelling. But without seeing a concrete proposal I think the process of refactoring is going to impede other improvements more than the result of refactoring it will promote them.
Cheers,
Jeff