Thread: pgbench - use pg logging capabilities
Bonjour Michaël, hello devs, As suggested in "cce64a51", this patch make pgbench use postgres logging capabilities. I tried to use fatal/error/warning/info/debug where appropriate. Some printing to stderr remain for some pgbench specific output. The patch fixes a inconsistent test case name that I noticed in passing. -- Fabien.
Attachment
On Tue, Dec 24, 2019 at 11:17:31AM +0100, Fabien COELHO wrote: > Some printing to stderr remain for some pgbench specific output. Hmm. Wouldn't it make sense to output the log generated as information from the test using pg_log_info() instead of using fprintf(stderr) (the logs of the initial data load, progress report)? It seems to me that this would be consistent with the other tools we have, and being able to make a difference with the level of logs is kind of a nice property of logging.c as you can grep easily for one problems instead of looking at multiple patterns matching an error in the logs. Note also an error in the scripts does not report an error. Another thing is that messages logged would need to be translated. I think that's nice, but perhaps others don't like that or may think that's not a good idea. Who knows.. > The patch fixes a inconsistent test case name that I noticed in passing. > > @@ -157,7 +157,7 @@ my @options = ( > qr{error while setting random seed from --random-seed option} > ] > ], > - [ 'bad partition type', '-i --partition-method=BAD', [qr{"range"}, qr{"hash"}, qr{"BAD"}] ], > + [ 'bad partition method', '-i --partition-method=BAD', [qr{"range"}, qr{"hash"}, qr{"BAD"}] ], > [ 'bad partition number', '-i --partitions -1', [ qr{invalid number of partitions: "-1"} ] ], No problem with this one from me, I'll fix it if there are no objections. -- Michael
Attachment
On 2019-12-24 11:17, Fabien COELHO wrote: > As suggested in "cce64a51", this patch make pgbench use postgres logging > capabilities. > > I tried to use fatal/error/warning/info/debug where appropriate. > > Some printing to stderr remain for some pgbench specific output. The patch seems pretty straightforward, but this +/* + * Convenient shorcuts + */ +#define fatal pg_log_fatal +#define error pg_log_error +#define warning pg_log_warning +#define info pg_log_info +#define debug pg_log_debug seems counterproductive. Let's just use the normal function names. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On December 31, 2019 8:10:13 PM GMT+09:00, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > seems counterproductive. Let's just use the normal function names. +1. -- Michael
Bonjour Michaël, et excellente année 2020 ! > Hmm. Wouldn't it make sense to output the log generated as > information from the test using pg_log_info() instead of using > fprintf(stderr) (the logs of the initial data load, progress report)? For the progress report, the reason I decided against is that the lines are already long enough with data (for the progress report: tps, latency, etc.), and prepending "pgbench info" or equivalent in front of every line does not look very useful and make it more likely that actually useful data could be pushed out of the terminal width. For data load, ISTM that people are used to it like that. Moreover, I do not think that the \r recently-added trick can work with the logging stuff, so I left it out as well altogether. > It seems to me that this would be consistent with the other tools we > have, and being able to make a difference with the level of logs is > kind of a nice property of logging.c as you can grep easily for one > problems instead of looking at multiple patterns matching an error in > the logs. Note also an error in the scripts does not report an > error. Another thing is that messages logged would need to be > translated. I think that's nice, but perhaps others don't like that > or may think that's not a good idea. Who knows.. Dunno about translation. ISTM that pgbench is mostly not translated, not sure why. -- Fabien.
Hello Peter, > The patch seems pretty straightforward, but this > > +/* > + * Convenient shorcuts > + */ > +#define fatal pg_log_fatal > +#define error pg_log_error > +#define warning pg_log_warning > +#define info pg_log_info > +#define debug pg_log_debug > > seems counterproductive. Let's just use the normal function names. I'm trying to keep the column width under control, but if you like it wider, here it is. Compared to v1 I have also made a few changes to be more consistent when using fatal/error/info. -- Fabien.
Attachment
On Wed, Jan 01, 2020 at 10:19:52PM +0100, Fabien COELHO wrote: > Bonjour Michaël, et excellente année 2020 ! Toi aussi! Bonne année. >> Hmm. Wouldn't it make sense to output the log generated as >> information from the test using pg_log_info() instead of using >> fprintf(stderr) (the logs of the initial data load, progress report)? > > For the progress report, the reason I decided against is that the lines are > already long enough with data (for the progress report: tps, latency, etc.), > and prepending "pgbench info" or equivalent in front of every line does not > look very useful and make it more likely that actually useful data could be > pushed out of the terminal width. Hm. Okay. That would limit the patch to only report errors in the first round of changes, which is fine by me. > For data load, ISTM that people are used to it like that. Moreover, I do not > think that the \r recently-added trick can work with the logging stuff, so I > left it out as well altogether. It could be possible to create new custom options for logging.c. We already have one as of PG_LOG_FLAG_TERSE to make the output of psql compatible with regression tests and such. These are just thoughts about the control of: - the progname is appended to the error string or not. - CR/LF as last character. > Dunno about translation. ISTM that pgbench is mostly not translated, not > sure why. Because as a benchmark tool that's not really worth it and its output is rather technical hence translating it would be more challenging? Perhaps others more used to translation work could chime in the discussion? -- Michael
Attachment
On 2020-01-01 22:55, Fabien COELHO wrote: > I'm trying to keep the column width under control, but if you like it > wider, here it is. > > Compared to v1 I have also made a few changes to be more consistent when > using fatal/error/info. The renaming of debug to debug_level seems unnecessary and unrelated. In runShellCommand(), you removed some but not all argv[0] from the output messages. I'm not sure what the intent was there. I would also recommend these changes: - pg_log_fatal("query failed: %s", sql); - pg_log_error("%s", PQerrorMessage(con)); + pg_log_fatal("query failed: %s", PQerrorMessage(con)); + pg_log_info("query was: %s", sql); This puts the most important information first. - pg_log_error("connection to database \"%s\" failed", dbName); - pg_log_error("%s", PQerrorMessage(conn)); + pg_log_error("connection to database \"%s\" failed: %s", + dbName, PQerrorMessage(conn)); Line break here is unnecessary. In both cases, pg_dump has similar messages that can serve as reference. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hello Peter, >> Compared to v1 I have also made a few changes to be more consistent when >> using fatal/error/info. > > The renaming of debug to debug_level seems unnecessary and unrelated. Indeed. It was when I used "debug" as a shorthand for "pg_log_debug". > In runShellCommand(), you removed some but not all argv[0] from the output > messages. I'm not sure what the intent was there. Without looking at the context I thought that argv[0] was the program name, which is not the case here. I put it back everywhere, including the DEBUG message. > I would also recommend these changes: > > - pg_log_fatal("query failed: %s", sql); > - pg_log_error("%s", PQerrorMessage(con)); > + pg_log_fatal("query failed: %s", PQerrorMessage(con)); > + pg_log_info("query was: %s", sql); > > This puts the most important information first. Ok. > - pg_log_error("connection to database \"%s\" failed", dbName); > - pg_log_error("%s", PQerrorMessage(conn)); > + pg_log_error("connection to database \"%s\" failed: %s", > + dbName, PQerrorMessage(conn)); > > Line break here is unnecessary. Ok. I homogeneised another similar message. Patch v3 attached hopefully fixes all of the above. -- Fabien.
Attachment
On Fri, Jan 03, 2020 at 01:01:18PM +0100, Fabien COELHO wrote: > Without looking at the context I thought that argv[0] was the program name, > which is not the case here. I put it back everywhere, including the DEBUG > message. The variable names in Command are confusing IMO... > Ok. I homogeneised another similar message. > > Patch v3 attached hopefully fixes all of the above. + pg_log_error("gaussian parameter must be at least " + "%f (not %f)", MIN_GAUSSIAN_PARAM, param); I would keep all the error message strings to be on the same line. That makes grepping for them easier on the same, and that's the usual convention even if these are larger than 72-80 characters. #ifdef DEBUG - printf("shell parameter name: \"%s\", value: \"%s\"\n", argv[1], res); + pg_log_debug("%s: shell parameter name: \"%s\", value: \"%s\"", argv[0], argv[1], res); #endif Worth removing this ifdef? - fprintf(stderr, "%s", PQerrorMessage(con)); + pg_log_fatal("unexpected copy in result"); + pg_log_error("%s", PQerrorMessage(con)); exit(1); [...] - fprintf(stderr, "%s", PQerrorMessage(con)); + pg_log_fatal("cannot count number of branches"); + pg_log_error("%s", PQerrorMessage(con)); These are inconsistent with the rest, why not combining both? I think that I would just remove the "debug" variable defined in pgbench.c all together, and switch the messages for the duration and the one in executeMetaCommand to use info-level logging.. -- Michael
Attachment
Bonjour Michaël, >> Without looking at the context I thought that argv[0] was the program name, >> which is not the case here. I put it back everywhere, including the DEBUG >> message. > > The variable names in Command are confusing IMO... Somehow, yes. Note that there is a logic, it will indeed be the argv of the called shell command… And I do not think it is the point of this patch to solve this possible confusion. > + pg_log_error("gaussian parameter must be at least " > + "%f (not %f)", MIN_GAUSSIAN_PARAM, param); > I would keep all the error message strings to be on the same line. > That makes grepping for them easier on the same, and that's the usual > convention even if these are larger than 72-80 characters. Ok. I also did other similar cases accordingly. > #ifdef DEBUG > Worth removing this ifdef? Yep, especially as it is the only instance. Done. > + pg_log_fatal("unexpected copy in result"); > + pg_log_error("%s", PQerrorMessage(con)); > + pg_log_fatal("cannot count number of branches"); > + pg_log_error("%s", PQerrorMessage(con)); > These are inconsistent with the rest, why not combining both? Ok, done. > I think that I would just remove the "debug" variable defined in > pgbench.c all together, and switch the messages for the duration and the > one in executeMetaCommand to use info-level logging.. Ok, done. Patch v4 attached addresses all these points. -- Fabien.
Attachment
On Mon, Jan 06, 2020 at 01:36:23PM +0100, Fabien COELHO wrote: >> I think that I would just remove the "debug" variable defined in >> pgbench.c all together, and switch the messages for the duration and the >> one in executeMetaCommand to use info-level logging.. > > Ok, done. Thanks. The output then gets kind of inconsistent when using --debug: pgbench: client 2 executing script "<builtin: TPC-B (sort of)>" client 2 executing \set aid client 2 executing \set bid client 2 executing \set tid client 2 executing \set delta My point was to just modify the code so as this uses pg_log_debug(), with a routine doing some reverse-engineering of the Command data to generate a string to show up in the logs. Sorry for the confusion.. And there is no need to use __pg_log_level either which should remain internal to logging.h IMO. We'd likely want a similar business in syntax_error() to be fully consistent with all other code paths dealing with an error showing up before exiting. No idea what others think here. I may be too much pedantic. -- Michael
Attachment
Bonjour Michaël, >>> I think that I would just remove the "debug" variable defined in >>> pgbench.c all together, and switch the messages for the duration and the >>> one in executeMetaCommand to use info-level logging.. >> >> Ok, done. > > Thanks. The output then gets kind of inconsistent when using --debug: > pgbench: client 2 executing script "<builtin: TPC-B (sort of)>" > client 2 executing \set aid > client 2 executing \set bid > client 2 executing \set tid > client 2 executing \set delta > > My point was to just modify the code so as this uses pg_log_debug(), > with a routine doing some reverse-engineering of the Command data to > generate a string to show up in the logs. Sorry for the confusion.. > And there is no need to use __pg_log_level either which should remain > internal to logging.h IMO. For the first case with the output you point out, there is a loop to generate the output. I do not think that we want to pay the cost of generating the string and then throw it away afterwards when not under debug, esp as string manipulation is not that cheap, so we need to enter the thing only when under debug. However, there is no easy way to do that without accessing __pg_log_level. It could be hidden in a macro to create, but that's it. For the second case I called pg_log_debug just once. > We'd likely want a similar business in syntax_error() to be fully > consistent with all other code paths dealing with an error showing up > before exiting. The syntax error is kind of complicated because there is the location information which is better left as is, IMHO. I moved remainder to a PQExpBuffer and pg_log_fatal. > No idea what others think here. I may be too much pedantic. Maybe a little:-) Note that I submitted another patch to use PQExpBuffer wherever possible in pgbench, especially to get rid of doubtful snprintf/strlen patterns. Patch v5 attached tries to follow your above suggestions. -- Fabien.
Attachment
> Patch v5 attached tries to follow your above suggestions. Patch v6 makes syntax error location code more compact and tests the case. -- Fabien.
Attachment
On 2020-01-08 13:07, Fabien COELHO wrote: > >> Patch v5 attached tries to follow your above suggestions. > > Patch v6 makes syntax error location code more compact and tests the case. Committed. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>> Patch v6 makes syntax error location code more compact and tests the case. > > Committed. Thanks! -- Fabien.
On Wed, Jan 08, 2020 at 02:27:46PM +0100, Peter Eisentraut wrote: > Committed. That was fast. - if (debug) + if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) { I am surprised that you kept this one, while syntax_error() has been changed in a more modular way. -- Michael
Attachment
On 2020-01-08 15:12, Michael Paquier wrote: > On Wed, Jan 08, 2020 at 02:27:46PM +0100, Peter Eisentraut wrote: >> Committed. > > That was fast. > > - if (debug) > + if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) > { > I am surprised that you kept this one, I'm not happy about it, but it seems OK for now. We can continue to improve here. > while syntax_error() has been > changed in a more modular way. I don't follow what you mean by that. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Wed, Jan 08, 2020 at 03:31:46PM +0100, Peter Eisentraut wrote: > On 2020-01-08 15:12, Michael Paquier wrote: >> while syntax_error() has been >> changed in a more modular way. > > I don't follow what you mean by that. The first versions of the patch did not change syntax_error(), and the version committed has switched to use PQExpBufferData there. I think that we should just do the same for the debug logs executing the meta commands. This way, we get an output consistent with what's printed out for sending or receiving stuff. Please see the attached. -- Michael
Attachment
Bonjour Michaël, >> I don't follow what you mean by that. > > The first versions of the patch did not change syntax_error(), and the > version committed has switched to use PQExpBufferData there. I think > that we should just do the same for the debug logs executing the meta > commands. This way, we get an output consistent with what's printed > out for sending or receiving stuff. Please see the attached. Yep, I thought of it, but I was not very keen on having a malloc/free cycle just for one debug message. However under debug this is probably not an issue. Your patch works for me. IT can avoid some level of format interpretation overheads by switching to Char/Str functions, see first attachement. The other point is the test on __pg_log_level, see second attached. -- Fabien.
Attachment
On Thu, Jan 09, 2020 at 10:28:21AM +0100, Fabien COELHO wrote: > Yep, I thought of it, but I was not very keen on having a malloc/free cycle > just for one debug message. However under debug this is probably not an > issue. Consistency is more important here IMO, so applied. > Your patch works for me. IT can avoid some level of format interpretation > overheads by switching to Char/Str functions, see first attachement. I kept both grouped to avoid any unnecessary churn with the manipulation of PQExpBufferData. > The other point is the test on __pg_log_level, see second attached. May be better to discuss that on a separate thread as that's not only related to pgbench. -- Michael
Attachment
On 2020-Jan-09, Fabien COELHO wrote: > - if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) > + if (pg_log_debug_level) > { Umm ... I find the original exceedingly ugly, but the new line is totally impenetrable. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, Jan 09, 2020 at 09:27:42PM -0300, Alvaro Herrera wrote: > On 2020-Jan-09, Fabien COELHO wrote: >> - if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) >> + if (pg_log_debug_level) >> { > > Umm ... I find the original exceedingly ugly, but the new line is > totally impenetrable. Maybe just a pg_logging_get_level() for consistency with the _set_level() one, and then compare the returned result with PG_LOG_DEBUG in pgbench? -- Michael
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2020-Jan-09, Fabien COELHO wrote: >> - if (unlikely(__pg_log_level <= PG_LOG_DEBUG)) >> + if (pg_log_debug_level) >> { > Umm ... I find the original exceedingly ugly, but the new line is > totally impenetrable. So, I had not been paying any attention to this thread, but that snippet is already enough to set off alarm bells. 1. (problem with already-committed code, evidently) The C standard is quite clear that -- All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. -- All identifiers that begin with an underscore are always reserved for use as identifiers with file scope in both the ordinary and tag name spaces. "Reserved" in this context appears to mean "reserved for use by system headers and/or compiler special behaviors". Declaring our own global variables with double-underscore prefixes is not just asking for trouble, it's waving a red flag in front of a bull. 2. (problem with proposed patch) I share Alvaro's allergy for replacing uses of a common variable with a bunch of macros, especially macros that don't look like macros. That's not reducing the reader's cognitive burden. I'd even say it's actively misleading the reader, because what the new code *looks* like it's doing is referencing several independent global variables. We don't need our code to qualify as an entry for the Obfuscated C Contest. The notational confusion could be solved perhaps by writing the macros with function-like parentheses, but it still doesn't seem like an improvement. In particular, the whole point here is to have a common idiom for logging, but I'm unconvinced that every frontend program should be using unlikely() in this particular way. Maybe it's unlikely for pgbench's usage that verbose logging would be turned on, but why should we build in an assumption that that's universally the case? TBH, my recommendation would be to drop *all* of these likely() and unlikely() calls. What evidence have you got that those are meaningfully improving the quality of the generated code? And if they're buried inside macros, they certainly aren't doing anything useful in terms of documenting the code. regards, tom lane
On Thu, Jan 09, 2020 at 08:09:29PM -0500, Tom Lane wrote: > TBH, my recommendation would be to drop *all* of these likely() > and unlikely() calls. What evidence have you got that those are > meaningfully improving the quality of the generated code? And if > they're buried inside macros, they certainly aren't doing anything > useful in terms of documenting the code. Yes. I am wondering if we should not rework this part of the logging with something like the attached. My 2c, thoughts welcome. -- Michael
Attachment
Bonjour Michaël, >> TBH, my recommendation would be to drop *all* of these likely() >> and unlikely() calls. What evidence have you got that those are >> meaningfully improving the quality of the generated code? And if >> they're buried inside macros, they certainly aren't doing anything >> useful in terms of documenting the code. > > Yes. I am wondering if we should not rework this part of the logging > with something like the attached. My 2c, thoughts welcome. ISTM that the intent is to minimise the performance impact of ignored pg_log calls, especially when under debug where it is most likely to be the case AND that they may be in critical places. Compared to dealing with the level inside the call, the use of the level variable avoids a call-test-return cycle in this case, and the unlikely should help the compiler reorder instructions so that no actual branch is taken under the common case. So I think that the current situation is a good thing at least for debug. For other levels, they are on by default AND would not be placed at critical performance points, so the whole effort of avoiding the call are moot. I agree with Tom that __pg_log_level variable name violates usages. ISTM that switching the variable to explicitely global solves the issues, and that possible the level test can be moved to inside the function for all but the debug level. See attached which reprises some of your idea, but keep the outside filtering for the debug level. -- Fabien.
Attachment
On Fri, Jan 10, 2020 at 08:52:17AM +0100, Fabien COELHO wrote: > Compared to dealing with the level inside the call, the use of the level > variable avoids a call-test-return cycle in this case, and the unlikely > should help the compiler reorder instructions so that no actual branch is > taken under the common case. > > So I think that the current situation is a good thing at least for debug. If you look at some of my messages on other threads, you would likely notice that my mood of the day is to not design things which try to outsmart a user's expectations :) So I would stand on the position to just remove those likely/unlikely parts if we want this logging to be generic. > For other levels, they are on by default AND would not be placed at critical > performance points, so the whole effort of avoiding the call are moot. > > I agree with Tom that __pg_log_level variable name violates usages. My own taste would be to still keep the variable local to logging.c, and use a "get"-like routine to be consistent with the "set" part. I don't have to be right, let's see where this discussion leads us. (I mentioned that upthread, but I don't think it is a good idea to discuss about a redesign of those routines on a thread about pgbench based on $subject. All the main players are here so it likely does not matter, but..) -- Michael
Attachment
Michaël, >> So I think that the current situation is a good thing at least for debug. > > If you look at some of my messages on other threads, you would likely > notice that my mood of the day is to not design things which try to > outsmart a user's expectations :) I'm not following you. ISTM that user expectations is that the message is printed when the level requires it, and that the performance impact is minimal otherwise. I'm not aiming at anything different. > So I would stand on the position to just remove those likely/unlikely > parts if we want this logging to be generic. It is unclear to me whether your point is about the whole "if", or only the compiler directive itself (i.e. "likely" and "unlikely"). I'll assume the former. I do not think we should "want" logging to be generic per se, but only if it makes sense from a performance and feature point of view. For the normal case (standard level, no debug), there is basically no difference because the message is going to be printed anyway: either it is check+call+work, or call+check+work. Anything is fine. The directive would help the compiler reorder instructions so that usual case does not inccur a jump. For debug messages, things are different: removing the external test & unlikely would have a detrimental effect on performance when not debugging, which is most of the time, because you would pay the cost of evaluating arguments and call/return cycle on each message anyway. That can be bad if a debug message is place in some critical place. So the right place of the the debug check is early. Once this is done, then why not doing that for all other level for consistency? This is the current situation. If the check is moved inside the call, then there is a performance benefit to repeat it for debug, which is a pain because then it would be there twice in that case, and it creates an exception. The fact that some macro are simplified is not very useful because this is not really user visible. So IMHO the current situation is fine, but the __variable name. So ISTM that the attached is the simplest and more reasonable option to fix this. >> For other levels, they are on by default AND would not be placed at critical >> performance points, so the whole effort of avoiding the call are moot. >> >> I agree with Tom that __pg_log_level variable name violates usages. > > My own taste would be to still keep the variable local to logging.c, > and use a "get"-like routine to be consistent with the "set" part. I > don't have to be right, let's see where this discussion leads us. This would defeat the point of avoiding a function call, if a function call is needed to check whether the other function call is needed:-) Hence the macro. > (I mentioned that upthread, but I don't think it is a good idea to > discuss about a redesign of those routines on a thread about pgbench > based on $subject. All the main players are here so it likely does > not matter, but..) Yep. I hesitated to be the one to do it, and ISTM that the problem is small enough so that it can be resolved without a new thread. I may be naïvely wrong:-) -- Fabien.
Attachment
On Fri, Jan 10, 2020 at 05:39:40PM +0100, Fabien COELHO wrote: > So IMHO the current situation is fine, but the __variable name. So ISTM that > the attached is the simplest and more reasonable option to fix this. I'd rather hear more from others at this point. Peter's opinion, as the main author behind logging.c/h, would be good to have here. -- Michael