Thread: too much pgbench init output
When initializing a large database, pgbench writes tons of "%d tuples done" lines. I propose to change this to a sort of progress counter that stays on the same line, as in the attached patch.
Attachment
On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut <peter_e@gmx.net> wrote: > When initializing a large database, pgbench writes tons of "%d tuples > done" lines. I propose to change this to a sort of progress counter > that stays on the same line, as in the attached patch. I'm not sure I like this - what if the output is being saved off to a file? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 1 Září 2012, 12:30, Robert Haas wrote: > On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> When initializing a large database, pgbench writes tons of "%d tuples >> done" lines. I propose to change this to a sort of progress counter >> that stays on the same line, as in the attached patch. > > I'm not sure I like this - what if the output is being saved off to a > file? What about using istty(stdout) to handle this situation? Although I find it usually confusing, because it prints one thing when executed directly and something else when the output is redirected to a file. I see two other options: (1) removing this output altogether (I can't imagine a situation when this really matters) and replace it with a simple "inserted 23% of rows, estimated remaining time 14:23 (863 sec)", updated each 1% (2) adding a switch (--verbose) that enables these lines, don't print them by default Another option might be updating the process title so that the "top" shows current progress more precisely than (1). Tomas
On 9/1/12 6:30 AM, Robert Haas wrote: > On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >> When initializing a large database, pgbench writes tons of "%d tuples >> done" lines. I propose to change this to a sort of progress counter >> that stays on the same line, as in the attached patch. > > I'm not sure I like this - what if the output is being saved off to a file? I suppose we could print \n instead of \r then.
Peter Eisentraut <peter_e@gmx.net> writes: > On 9/1/12 6:30 AM, Robert Haas wrote: >> On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >>> When initializing a large database, pgbench writes tons of "%d tuples >>> done" lines. I propose to change this to a sort of progress counter >>> that stays on the same line, as in the attached patch. >> I'm not sure I like this - what if the output is being saved off to a file? > I suppose we could print \n instead of \r then. Possibly off-the-wall idea: we could fix the "too much output" problem once and for all by going to a log scale. 10 tuples done100 tuples done1000 tuples done10000 tuples done100000 tuples done... regards, tom lane
On Tue, Sep 4, 2012 at 8:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter_e@gmx.net> writes: >> On 9/1/12 6:30 AM, Robert Haas wrote: >>> On Sat, Sep 1, 2012 at 12:00 AM, Peter Eisentraut <peter_e@gmx.net> wrote: >>>> When initializing a large database, pgbench writes tons of "%d tuples >>>> done" lines. I propose to change this to a sort of progress counter >>>> that stays on the same line, as in the attached patch. > >>> I'm not sure I like this - what if the output is being saved off to a file? > >> I suppose we could print \n instead of \r then. > > Possibly off-the-wall idea: we could fix the "too much output" problem > once and for all by going to a log scale. > > 10 tuples done > 100 tuples done > 1000 tuples done > 10000 tuples done > 100000 tuples done > ... I don't like that, because one of the things you can see by following the current output is where the checkpoint stalls are happening during the load. You'd lose the ability to notice any kind of slowdown after the first few tuples with this kind of format. Actually, this whole things seems like a solution in search of a problem to me. We just reduced the verbosity of pgbench -i tenfold in the very recent past - I would have thought that enough to address this problem. But maybe not. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote: > Actually, this whole things seems like a solution in search of a > problem to me. We just reduced the verbosity of pgbench -i tenfold in > the very recent past - I would have thought that enough to address > this problem. But maybe not. The problem is that a) It blasts out too much output and everything scrolls off the screen, and b) There is no indication of where the end is. These are independent problems, and I'd be happy to address them separately if there are such specific concerns attached to this. Speaking of tenfold, we could reduce the output frequency tenfold to once every 1000000, which would alleviate this problem for a while longer.
Peter Eisentraut <peter_e@gmx.net> writes: > On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote: >> Actually, this whole things seems like a solution in search of a >> problem to me. We just reduced the verbosity of pgbench -i tenfold in >> the very recent past - I would have thought that enough to address >> this problem. But maybe not. > The problem is that > a) It blasts out too much output and everything scrolls off the screen, > and Robert evidently thinks that the verbosity of the output is a feature not a bug. I'm not convinced that eyeballing pgbench output is a particularly useful way to measure checkpoint stalls, but ... > b) There is no indication of where the end is. Well, surely *that* can be fixed in a noncontroversial way: just print "M/N tuples done", where N is the target. regards, tom lane
On Tue, Sep 4, 2012 at 11:31 PM, Peter Eisentraut <peter_e@gmx.net> wrote: > On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote: >> Actually, this whole things seems like a solution in search of a >> problem to me. We just reduced the verbosity of pgbench -i tenfold in >> the very recent past - I would have thought that enough to address >> this problem. But maybe not. > > The problem is that > > a) It blasts out too much output and everything scrolls off the screen, > and > > b) There is no indication of where the end is. > > These are independent problems, and I'd be happy to address them > separately if there are such specific concerns attached to this. > > Speaking of tenfold, we could reduce the output frequency tenfold to > once every 1000000, which would alleviate this problem for a while > longer. Well, I wouldn't object to displaying a percentage on each output line. But I don't really like the idea of having them less frequent than they already are, because if you run into a situation that makes pgbench -i run slowly, as I occasionally do, it's marginal to tell the difference between "slow" and "completely hung" even with the current level of verbosity. However, we could add a -q flag to run more quietly, or something like that. Actually, I'd even be fine with making the default quieter, though we can't use -v for verbose since that's already taken. But I'd like to preserve the option of getting the current amount of output because sometimes I need that to troubleshoot problems. Actually it'd be nice to even get a bit more output: say, a timestamp on each line, and a completion percentage... but now I'm getting greedy. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Sep 5, 2012 at 9:47 AM, Robert Haas <robertmhaas@gmail.com> wrote:
Actually it'd be nice to even get a bit more output: say, a timestamp
on each line, and a completion percentage... but now I'm getting
greedy.
May be we need a verbosity level and print a lot less or a lot more information than what we do today. That will satisfy everyone. Hopefully.
Thanks,
Pavan
On Tue, 2012-09-04 at 23:44 -0400, Tom Lane wrote: > > b) There is no indication of where the end is. > > Well, surely *that* can be fixed in a noncontroversial way: just > print "M/N tuples done", where N is the target. I have made this change. I won't pursue using \r if others find it useful as is.
On 5.9.2012 06:17, Robert Haas wrote: > On Tue, Sep 4, 2012 at 11:31 PM, Peter Eisentraut <peter_e@gmx.net> wrote: >> On Tue, 2012-09-04 at 23:14 -0400, Robert Haas wrote: >>> Actually, this whole things seems like a solution in search of a >>> problem to me. We just reduced the verbosity of pgbench -i tenfold in >>> the very recent past - I would have thought that enough to address >>> this problem. But maybe not. >> >> The problem is that >> >> a) It blasts out too much output and everything scrolls off the screen, >> and >> >> b) There is no indication of where the end is. >> >> These are independent problems, and I'd be happy to address them >> separately if there are such specific concerns attached to this. >> >> Speaking of tenfold, we could reduce the output frequency tenfold to >> once every 1000000, which would alleviate this problem for a while >> longer. > > Well, I wouldn't object to displaying a percentage on each output > line. But I don't really like the idea of having them less frequent > than they already are, because if you run into a situation that makes > pgbench -i run slowly, as I occasionally do, it's marginal to tell the > difference between "slow" and "completely hung" even with the current > level of verbosity. > > However, we could add a -q flag to run more quietly, or something like > that. Actually, I'd even be fine with making the default quieter, > though we can't use -v for verbose since that's already taken. But > I'd like to preserve the option of getting the current amount of > output because sometimes I need that to troubleshoot problems. > Actually it'd be nice to even get a bit more output: say, a timestamp > on each line, and a completion percentage... but now I'm getting > greedy. Hi, I've been thinking about this a bit more, and do propose to use an option that determines "logging step" i.e. number of items (either directly or as a percentage) between log lines. The attached patch defines a new option "--logging-step" that accepts either integers or percents. For example if you want to print a line each 1000 lines, you can to this $ pgbench -i -s 1000 --logging-step 1000 testdb and if you want to print a line each 5%, you can do this $ pgbench -i -s 1000 --logging-step 5% testdb and that's it. Moreover the patch adds a record of elapsed an estimate of remaining time. So for example with 21% you may get this: creating tables... 21000 of 100000 tuples (21%) done (elapsed 1.56 s, remaining 5.85 s). 42000 of 100000 tuples (42%) done (elapsed 3.15 s, remaining 4.35 s). 63000 of 100000 tuples (63%) done (elapsed 4.73 s, remaining 2.78 s). 84000 of 100000 tuples (84%) done (elapsed 6.30 s, remaining 1.20 s). 100000 of 100000 tuples (100%) done (elapsed 8.17 s, remaining 0.00 s). vacuum... set primary keys... Now, I've had a hard time with the patch - no matter what I do, I do get "invalid option" error whenever I try to run that from command line for some reason. But when I run it from gdb, it works just fine. kind regards Tomas
Attachment
Tomas Vondra wrote: > I've been thinking about this a bit more, and do propose to use an > option that determines "logging step" i.e. number of items (either > directly or as a percentage) between log lines. > > The attached patch defines a new option "--logging-step" that accepts > either integers or percents. For example if you want to print a line > each 1000 lines, you can to this > > $ pgbench -i -s 1000 --logging-step 1000 testdb I find it hard to get excited about having to specify a command line argument to tweak this. Would it work to have it emit messages depending on elapsed time and log scale of tuples emitted? So for example emit the first message after 5 seconds or 100k tuples, then back off until (say) 15 seconds have lapsed and 1M tuples, etc? The idea is to make it verbose enough to keep a human satisfied with what he sees, but not flood the terminal with pointless updates. (I think printing the ETA might be nice as well, not sure). -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Oct 23, 2012 at 12:02 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Tomas Vondra wrote: > >> I've been thinking about this a bit more, and do propose to use an >> option that determines "logging step" i.e. number of items (either >> directly or as a percentage) between log lines. >> >> The attached patch defines a new option "--logging-step" that accepts >> either integers or percents. For example if you want to print a line >> each 1000 lines, you can to this >> >> $ pgbench -i -s 1000 --logging-step 1000 testdb > > I find it hard to get excited about having to specify a command line > argument to tweak this. Would it work to have it emit messages > depending on elapsed time and log scale of tuples emitted? So for > example emit the first message after 5 seconds or 100k tuples, then back > off until (say) 15 seconds have lapsed and 1M tuples, etc? The idea is > to make it verbose enough to keep a human satisfied with what he sees, > but not flood the terminal with pointless updates. (I think printing > the ETA might be nice as well, not sure). I like this idea. One of the times when the more verbose output is really useful is when you expect it to run fast but then it turns out that for some reason it runs really slow. If you make the output too terse, then you end up not really knowing what's going on. Having it give an update at least every 5 seconds would be a nice way to give the user a heads-up if things aren't going as planned, without cluttering the normal case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 23.10.2012 18:21, Robert Haas wrote: > On Tue, Oct 23, 2012 at 12:02 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: >> Tomas Vondra wrote: >> >>> I've been thinking about this a bit more, and do propose to use an >>> option that determines "logging step" i.e. number of items (either >>> directly or as a percentage) between log lines. >>> >>> The attached patch defines a new option "--logging-step" that accepts >>> either integers or percents. For example if you want to print a line >>> each 1000 lines, you can to this >>> >>> $ pgbench -i -s 1000 --logging-step 1000 testdb >> >> I find it hard to get excited about having to specify a command line >> argument to tweak this. Would it work to have it emit messages >> depending on elapsed time and log scale of tuples emitted? So for >> example emit the first message after 5 seconds or 100k tuples, then back >> off until (say) 15 seconds have lapsed and 1M tuples, etc? The idea is >> to make it verbose enough to keep a human satisfied with what he sees, >> but not flood the terminal with pointless updates. (I think printing >> the ETA might be nice as well, not sure). > > I like this idea. One of the times when the more verbose output is > really useful is when you expect it to run fast but then it turns out > that for some reason it runs really slow. If you make the output too > terse, then you end up not really knowing what's going on. Having it > give an update at least every 5 seconds would be a nice way to give > the user a heads-up if things aren't going as planned, without > cluttering the normal case. I've prepared a patch along these lines. The attached version used only elapsed time to print the log messages each 5 seconds, so now it prints a meessage each 5 seconds no matter what, along with an estimate of remaining time. I've removed the config option, although it might be useful to specify the interval? I'm not entirely sure how the 'log scale of tuples' should work - for example when the time 15 seconds limit is reached, should it be reset back to the previous step (5 seconds) to give a more detailed info, or should it be kept at 15 seconds? Tomas
Attachment
Hi,<br /><br />I gone through the discussion for this patch and here is my review:<br /><br />The main aim of this patchis to reduce the number of log lines. It is also suggested to use an options to provide the interval but few of us arenot much agree on it. So final discussion ended at keeping 5 sec interval between each log line.<br /><br />However, Isee, there are two types of users here:<br />1. Who likes these log lines, so that they can troubleshoot some slowness andall<br />2. Who do not like these log lines.<br /><br />So keeping these in mind, I rather go for an option which willcontrol this. People falling in category one can set this option to very low where as users falling under second categorycan keep it high.<br /><br />However, assuming we settled on 5 sec delay, here are few comments on that patch attached:<br/><br />Comments:<br />=========<br /><br />Patch gets applied cleanly with some whitespace errors. make andmake install too went smooth.<br /> make check was smooth. Rather it should be smooth since there are NO changes in otherpart of the code rather than just pgbench.c and we do not have any test-case as well.<br /><br />However, here are fewcomments on changes in pgbench.c<br /><br />1.<br />Since the final discussion ended at keeping a 5 seconds interval willbe good enough, Author used a global int variable for that.<br />Given that it's just a constant, #define would be abetter choice.<br /><br /> 2.<br /><span style="font-family:courier new,monospace">+ /* let's not call the timingfor each row, but only each 100 rows */</span><br />Why only 100 rows ? Have you done any testing to come up with number100 ? To me it seems very low. It will be good to test with 1K or even 10K.<br /> On my machine (2.4 GHz Intel core2 duo Macbook PRO, running Ubuntu in VM with 4GB RAM, 1067 DDR3), in 5 Sec, approx 1M rows were inserted. So checkingevery 100 rows looks overkill.<br /><br />3.<br />Please indent following block as per the indentation just abovethat<br /><br /><span style="font-family:courier new,monospace"> /* used to track elapsed time and estimate of theremaining time */<br /> instr_time start, diff;<br /> double elapsed_sec, remaining_sec;<br /> int log_interval= 1;<br /></span><br />4.<br /><span style="font-family:courier new,monospace">+ /* have ve reachedthe next interval? */</span><br />Do you mean "have WE reached..."<br /><br />5.<br />While applying a patch, I gotfew white-space errors. But I think every patch goes through pgindent which might take care of this.<br /><br />Thanks<br/><br /><div class="gmail_extra"><br /><br /><div class="gmail_quote">On Sun, Nov 11, 2012 at 11:02 PM, TomasVondra <span dir="ltr"><<a href="mailto:tv@fuzzy.cz" target="_blank">tv@fuzzy.cz</a>></span> wrote:<br /><blockquoteclass="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><divclass="h5">On 23.10.2012 18:21, Robert Haas wrote:<br /> > On Tue, Oct 23, 2012 at 12:02 PM, AlvaroHerrera<br /> > <<a href="mailto:alvherre@2ndquadrant.com">alvherre@2ndquadrant.com</a>> wrote:<br /> >>Tomas Vondra wrote:<br /> >><br /> >>> I've been thinking about this a bit more, and do propose touse an<br /> >>> option that determines "logging step" i.e. number of items (either<br /> >>> directlyor as a percentage) between log lines.<br /> >>><br /> >>> The attached patch defines a new option"--logging-step" that accepts<br /> >>> either integers or percents. For example if you want to print a line<br/> >>> each 1000 lines, you can to this<br /> >>><br /> >>> $ pgbench -i -s 1000 --logging-step1000 testdb<br /> >><br /> >> I find it hard to get excited about having to specify a command line<br/> >> argument to tweak this. Would it work to have it emit messages<br /> >> depending on elapsed timeand log scale of tuples emitted? So for<br /> >> example emit the first message after 5 seconds or 100k tuples,then back<br /> >> off until (say) 15 seconds have lapsed and 1M tuples, etc? The idea is<br /> >> tomake it verbose enough to keep a human satisfied with what he sees,<br /> >> but not flood the terminal with pointlessupdates. (I think printing<br /> >> the ETA might be nice as well, not sure).<br /> ><br /> > I likethis idea. One of the times when the more verbose output is<br /> > really useful is when you expect it to run fastbut then it turns out<br /> > that for some reason it runs really slow. If you make the output too<br /> > terse,then you end up not really knowing what's going on. Having it<br /> > give an update at least every 5 seconds wouldbe a nice way to give<br /> > the user a heads-up if things aren't going as planned, without<br /> > clutteringthe normal case.<br /><br /></div></div>I've prepared a patch along these lines. The attached version used only<br/> elapsed time to print the log messages each 5 seconds, so now it prints<br /> a meessage each 5 seconds no matterwhat, along with an estimate of<br /> remaining time.<br /><br /> I've removed the config option, although it mightbe useful to specify<br /> the interval?<br /><br /> I'm not entirely sure how the 'log scale of tuples' should work- for<br /> example when the time 15 seconds limit is reached, should it be reset<br /> back to the previous step (5seconds) to give a more detailed info, or<br /> should it be kept at 15 seconds?<br /><span class="HOEnZb"><font color="#888888"><br/> Tomas<br /><br /></font></span><br /><br /> --<br /> Sent via pgsql-hackers mailing list (<a href="mailto:pgsql-hackers@postgresql.org">pgsql-hackers@postgresql.org</a>)<br/> To make changes to your subscription:<br/><a href="http://www.postgresql.org/mailpref/pgsql-hackers" target="_blank">http://www.postgresql.org/mailpref/pgsql-hackers</a><br/><br /></blockquote></div><br /><br clear="all" /><br/>-- <br />Jeevan B Chalke<br />Senior Software Engineer, R&D<br />EnterpriseDB Corporation<br />The EnterprisePostgreSQL Company<br /><br />Phone: +91 20 30589500<br /><br />Website: <a href="http://www.enterprisedb.com"target="_blank">www.enterprisedb.com</a><br /> EnterpriseDB Blog: <a href="http://blogs.enterprisedb.com/"target="_blank">http://blogs.enterprisedb.com/</a><br />Follow us on Twitter: <a href="http://www.twitter.com/enterprisedb"target="_blank">http://www.twitter.com/enterprisedb</a><br /><br />This e-mailmessage (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This messagecontains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosureunder applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient,any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited.If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete thismessage.<br /></div>
On 19.11.2012 11:59, Jeevan Chalke wrote: > Hi, > > I gone through the discussion for this patch and here is my review: > > The main aim of this patch is to reduce the number of log lines. It is > also suggested to use an options to provide the interval but few of us > are not much agree on it. So final discussion ended at keeping 5 sec > interval between each log line. > > However, I see, there are two types of users here: > 1. Who likes these log lines, so that they can troubleshoot some > slowness and all > 2. Who do not like these log lines. Who likes these lines / needs them for something useful? > So keeping these in mind, I rather go for an option which will control > this. People falling in category one can set this option to very low > where as users falling under second category can keep it high. So what option(s) would you expect? Something that tunes the interval length or something else? A switch that'd choose between the old and new behavior might be a good idea, but I'd strongly vote against "automagic" heuristics. It makes the behavior very difficult to predict and I really don't want to force the users to wonder whether the long delay is due to general slowness of the machine or some "clever" logic that causes long delays between log messages. That's why I choose a very simple approach with constant time interval. It does what I was aiming for (less logs) and it's easy to predict. Sure, we could choose different interval (or make it an option). > However, assuming we settled on 5 sec delay, here are few comments on > that patch attached: > > Comments: > ========= > > Patch gets applied cleanly with some whitespace errors. make and make > install too went smooth. > make check was smooth. Rather it should be smooth since there are NO > changes in other part of the code rather than just pgbench.c and we do > not have any test-case as well. > > However, here are few comments on changes in pgbench.c > > 1. > Since the final discussion ended at keeping a 5 seconds interval will be > good enough, Author used a global int variable for that. > Given that it's just a constant, #define would be a better choice. Good point. Although if we add an option to supply different values, a variable is a better match. > 2. > + /* let's not call the timing for each row, but only each 100 > rows */ > Why only 100 rows ? Have you done any testing to come up with number 100 > ? To me it seems very low. It will be good to test with 1K or even 10K. > On my machine (2.4 GHz Intel core 2 duo Macbook PRO, running Ubuntu in > VM with 4GB RAM, 1067 DDR3), in 5 Sec, approx 1M rows were inserted. So > checking every 100 rows looks overkill. Well, the 100 is clearly a magical constant. The goal was to choose a value large enough to amortize the getlocaltime() cost, but small enough to print info even in cases when the performance sucks for some reason. I've seen issues where the speed suddenly dropped to a fraction of the expected value, e.g. 100 rows/second, and in those cases you'd have to wait for a very long time to actually get the next log message (with the suggested 10k step). So 100 seems like a good compromise to me ... > 3. > Please indent following block as per the indentation just above that > > /* used to track elapsed time and estimate of the remaining time */ > instr_time start, diff; > double elapsed_sec, remaining_sec; > int log_interval = 1; OK > 4. > + /* have ve reached the next interval? */ > Do you mean "have WE reached..." OK > > 5. > While applying a patch, I got few white-space errors. But I think every > patch goes through pgindent which might take care of this. OK > > Thanks Thanks for the review. I'll wait for a bit more discussion about the choices before submitting another version of the patch. Tomas
Hi,
Thanks
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
No idea. I fall in second category.
But from the discussion, I believe some people may need detailed (or lot more) output.
Interval length.
I am preferring an option for choosing an interval, say from 1 second to 10 seconds.
BTW, what if, we put one log message every 10% (or 5%) with time taken (time taken for last 10% (or 5%) and cumulative) over 5 seconds ?
This will have only 10 (or 20) lines per pgbench initialisation.
And since we are showing time taken for each block, if any slowness happens, one can easily find a block by looking at the timings and troubleshoot it.
Though 10% or 5% is again a debatable number, but keeping it constant will eliminate the requirement of an option.
Exactly, if we ended up with an option then it looks good. But in your current patch it was constant, so #define should be preferred.
Hmm... inserting just 100 rows / seconds is really slow.
Since you already seen such behaviour, I have no objection keeping it 100.
Sure
On 19.11.2012 11:59, Jeevan Chalke wrote:Who likes these lines / needs them for something useful?
> Hi,
>
> I gone through the discussion for this patch and here is my review:
>
> The main aim of this patch is to reduce the number of log lines. It is
> also suggested to use an options to provide the interval but few of us
> are not much agree on it. So final discussion ended at keeping 5 sec
> interval between each log line.
>
> However, I see, there are two types of users here:
> 1. Who likes these log lines, so that they can troubleshoot some
> slowness and all
> 2. Who do not like these log lines.
No idea. I fall in second category.
But from the discussion, I believe some people may need detailed (or lot more) output.
So what option(s) would you expect? Something that tunes the interval
> So keeping these in mind, I rather go for an option which will control
> this. People falling in category one can set this option to very low
> where as users falling under second category can keep it high.
length or something else?
Interval length.
A switch that'd choose between the old and new behavior might be a good
idea, but I'd strongly vote against "automagic" heuristics. It makes the
behavior very difficult to predict and I really don't want to force the
users to wonder whether the long delay is due to general slowness of the
machine or some "clever" logic that causes long delays between log messages.
That's why I choose a very simple approach with constant time interval.
It does what I was aiming for (less logs) and it's easy to predict.
Sure, we could choose different interval (or make it an option).
I am preferring an option for choosing an interval, say from 1 second to 10 seconds.
BTW, what if, we put one log message every 10% (or 5%) with time taken (time taken for last 10% (or 5%) and cumulative) over 5 seconds ?
This will have only 10 (or 20) lines per pgbench initialisation.
And since we are showing time taken for each block, if any slowness happens, one can easily find a block by looking at the timings and troubleshoot it.
Though 10% or 5% is again a debatable number, but keeping it constant will eliminate the requirement of an option.
Good point. Although if we add an option to supply different values, a
> However, assuming we settled on 5 sec delay, here are few comments on
> that patch attached:
>
> Comments:
> =========
>
> Patch gets applied cleanly with some whitespace errors. make and make
> install too went smooth.
> make check was smooth. Rather it should be smooth since there are NO
> changes in other part of the code rather than just pgbench.c and we do
> not have any test-case as well.
>
> However, here are few comments on changes in pgbench.c
>
> 1.
> Since the final discussion ended at keeping a 5 seconds interval will be
> good enough, Author used a global int variable for that.
> Given that it's just a constant, #define would be a better choice.
variable is a better match.
Exactly, if we ended up with an option then it looks good. But in your current patch it was constant, so #define should be preferred.
Well, the 100 is clearly a magical constant. The goal was to choose a
> 2.
> + /* let's not call the timing for each row, but only each 100
> rows */
> Why only 100 rows ? Have you done any testing to come up with number 100
> ? To me it seems very low. It will be good to test with 1K or even 10K.
> On my machine (2.4 GHz Intel core 2 duo Macbook PRO, running Ubuntu in
> VM with 4GB RAM, 1067 DDR3), in 5 Sec, approx 1M rows were inserted. So
> checking every 100 rows looks overkill.
value large enough to amortize the getlocaltime() cost, but small enough
to print info even in cases when the performance sucks for some reason.
I've seen issues where the speed suddenly dropped to a fraction of the
expected value, e.g. 100 rows/second, and in those cases you'd have to
wait for a very long time to actually get the next log message (with the
suggested 10k step).
So 100 seems like a good compromise to me ...
Hmm... inserting just 100 rows / seconds is really slow.
Since you already seen such behaviour, I have no objection keeping it 100.
OK
> 3.
> Please indent following block as per the indentation just above that
>
> /* used to track elapsed time and estimate of the remaining time */
> instr_time start, diff;
> double elapsed_sec, remaining_sec;
> int log_interval = 1;OK
> 4.
> + /* have ve reached the next interval? */
> Do you mean "have WE reached..."OK
>
> 5.
> While applying a patch, I got few white-space errors. But I think every
> patch goes through pgindent which might take care of this.
>
> Thanks
Thanks for the review. I'll wait for a bit more discussion about the
choices before submitting another version of the patch.
Sure
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Thanks
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
On 20.11.2012 08:22, Jeevan Chalke wrote: > Hi, > > > On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz > <mailto:tv@fuzzy.cz>> wrote: > > On 19.11.2012 11:59, Jeevan Chalke wrote: > > Hi, > > > > I gone through the discussion for this patch and here is my review: > > > > The main aim of this patch is to reduce the number of log lines. It is > > also suggested to use an options to provide the interval but few of us > > are not much agree on it. So final discussion ended at keeping 5 sec > > interval between each log line. > > > > However, I see, there are two types of users here: > > 1. Who likes these log lines, so that they can troubleshoot some > > slowness and all > > 2. Who do not like these log lines. > > Who likes these lines / needs them for something useful? > > > No idea. I fall in second category. > > But from the discussion, I believe some people may need detailed (or lot > more) output. I've read the thread again and my impression is that no one really needs or likes those lines, but (1) it's rather pointless to print a message every 100k rows, as it usually fills the console with garbabe (2) it's handy to have regular updates of the progress I don't think there're people (in the thread) that require to keep the current amount of log messages. But there might be users who actually use the current logs to do something (although I can't imagine what). If we want to do this in a backwards compatible way, we should probably use a new option (e.g. "-q") to enable the new (less verbose) logging. Do we want to allow both types of logging, or shall we keep only the new one? If both, which one should be the default one? > > So keeping these in mind, I rather go for an option which will control > > this. People falling in category one can set this option to very low > > where as users falling under second category can keep it high. > > So what option(s) would you expect? Something that tunes the interval > length or something else? > > > Interval length. Well, I can surely imagine something like "--interval N". > A switch that'd choose between the old and new behavior might be a good > idea, but I'd strongly vote against "automagic" heuristics. It makes the > behavior very difficult to predict and I really don't want to force the > users to wonder whether the long delay is due to general slowness of the > machine or some "clever" logic that causes long delays between log > messages. > > That's why I choose a very simple approach with constant time interval. > It does what I was aiming for (less logs) and it's easy to predict. > Sure, we could choose different interval (or make it an option). > > > I am preferring an option for choosing an interval, say from 1 second to > 10 seconds. Ummmm, why not to allow arbitrary integer? Why saying 1 to 10 seconds? > BTW, what if, we put one log message every 10% (or 5%) with time taken > (time taken for last 10% (or 5%) and cumulative) over 5 seconds ? > This will have only 10 (or 20) lines per pgbench initialisation. > And since we are showing time taken for each block, if any slowness > happens, one can easily find a block by looking at the timings and > troubleshoot it. > Though 10% or 5% is again a debatable number, but keeping it constant > will eliminate the requirement of an option. That's what I originally proposed in September (see the messages from 17/9), and Alvaro was not relly excited about this. Attached is a patch with fixed whitespace / indentation errors etc. Otherwise it's the same as the previous version. Tomas
Attachment
On Sun, Dec 9, 2012 at 8:11 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
Both the options are fine with me, but the default should be the current behaviour.
+1
Hmm.. actually, I have no issues with any number there. Just put 1..10 as we hard-coded it 5. No particular reason as such.
OK. Looks good now.
Any other views / suggestions are welcome.
Thanks
On 20.11.2012 08:22, Jeevan Chalke wrote:
> Hi,
>
>
> On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz> <mailto:tv@fuzzy.cz>> wrote:I've read the thread again and my impression is that no one really needs
>
> On 19.11.2012 11:59, Jeevan Chalke wrote:
> > Hi,
> >
> > I gone through the discussion for this patch and here is my review:
> >
> > The main aim of this patch is to reduce the number of log lines. It is
> > also suggested to use an options to provide the interval but few of us
> > are not much agree on it. So final discussion ended at keeping 5 sec
> > interval between each log line.
> >
> > However, I see, there are two types of users here:
> > 1. Who likes these log lines, so that they can troubleshoot some
> > slowness and all
> > 2. Who do not like these log lines.
>
> Who likes these lines / needs them for something useful?
>
>
> No idea. I fall in second category.
>
> But from the discussion, I believe some people may need detailed (or lot
> more) output.
or likes those lines, but
(1) it's rather pointless to print a message every 100k rows, as it
usually fills the console with garbabe
(2) it's handy to have regular updates of the progress
I don't think there're people (in the thread) that require to keep the
current amount of log messages.
But there might be users who actually use the current logs to do
something (although I can't imagine what). If we want to do this in a
backwards compatible way, we should probably use a new option (e.g.
"-q") to enable the new (less verbose) logging.
Do we want to allow both types of logging, or shall we keep only the new
one? If both, which one should be the default one?
Both the options are fine with me, but the default should be the current behaviour.
Well, I can surely imagine something like "--interval N".
> > So keeping these in mind, I rather go for an option which will control
> > this. People falling in category one can set this option to very low
> > where as users falling under second category can keep it high.
>
> So what option(s) would you expect? Something that tunes the interval
> length or something else?
>
>
> Interval length.
+1
Ummmm, why not to allow arbitrary integer? Why saying 1 to 10 seconds?
> A switch that'd choose between the old and new behavior might be a good
> idea, but I'd strongly vote against "automagic" heuristics. It makes the
> behavior very difficult to predict and I really don't want to force the
> users to wonder whether the long delay is due to general slowness of the
> machine or some "clever" logic that causes long delays between log
> messages.
>
> That's why I choose a very simple approach with constant time interval.
> It does what I was aiming for (less logs) and it's easy to predict.
> Sure, we could choose different interval (or make it an option).
>
>
> I am preferring an option for choosing an interval, say from 1 second to
> 10 seconds.
Hmm.. actually, I have no issues with any number there. Just put 1..10 as we hard-coded it 5. No particular reason as such.
That's what I originally proposed in September (see the messages from
> BTW, what if, we put one log message every 10% (or 5%) with time taken
> (time taken for last 10% (or 5%) and cumulative) over 5 seconds ?
> This will have only 10 (or 20) lines per pgbench initialisation.
> And since we are showing time taken for each block, if any slowness
> happens, one can easily find a block by looking at the timings and
> troubleshoot it.
> Though 10% or 5% is again a debatable number, but keeping it constant
> will eliminate the requirement of an option.
17/9), and Alvaro was not relly excited about this.
Attached is a patch with fixed whitespace / indentation errors etc.
Otherwise it's the same as the previous version.
OK. Looks good now.
Any other views / suggestions are welcome.
Thanks
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
Hi, attached is a new version of the patch that (a) converts the 'log_step_seconds' variable to a constant (and does not allow changing it using a command-line option etc.) (b) keeps the current logging as a default (c) adds a "-q" switch that enables the new logging with a 5-second interval I'm still not convinced there should be yet another know for tuning the log interval - opinions? Tomas On 11.12.2012 10:23, Jeevan Chalke wrote: > > > > On Sun, Dec 9, 2012 at 8:11 AM, Tomas Vondra <tv@fuzzy.cz > <mailto:tv@fuzzy.cz>> wrote: > > On 20.11.2012 08:22, Jeevan Chalke wrote: > > Hi, > > > > > > On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz > <mailto:tv@fuzzy.cz> > > <mailto:tv@fuzzy.cz <mailto:tv@fuzzy.cz>>> wrote: > > > > On 19.11.2012 11:59, Jeevan Chalke wrote: > > > Hi, > > > > > > I gone through the discussion for this patch and here is my > review: > > > > > > The main aim of this patch is to reduce the number of log > lines. It is > > > also suggested to use an options to provide the interval but > few of us > > > are not much agree on it. So final discussion ended at > keeping 5 sec > > > interval between each log line. > > > > > > However, I see, there are two types of users here: > > > 1. Who likes these log lines, so that they can troubleshoot some > > > slowness and all > > > 2. Who do not like these log lines. > > > > Who likes these lines / needs them for something useful? > > > > > > No idea. I fall in second category. > > > > But from the discussion, I believe some people may need detailed > (or lot > > more) output. > > I've read the thread again and my impression is that no one really needs > or likes those lines, but > > (1) it's rather pointless to print a message every 100k rows, as it > usually fills the console with garbabe > > (2) it's handy to have regular updates of the progress > > I don't think there're people (in the thread) that require to keep the > current amount of log messages. > > But there might be users who actually use the current logs to do > something (although I can't imagine what). If we want to do this in a > backwards compatible way, we should probably use a new option (e.g. > "-q") to enable the new (less verbose) logging. > > Do we want to allow both types of logging, or shall we keep only the new > one? If both, which one should be the default one? > > > Both the options are fine with me, but the default should be the current > behaviour. > > > > > So keeping these in mind, I rather go for an option which > will control > > > this. People falling in category one can set this option to > very low > > > where as users falling under second category can keep it high. > > > > So what option(s) would you expect? Something that tunes the > interval > > length or something else? > > > > > > Interval length. > > Well, I can surely imagine something like "--interval N". > > > +1 > > > > > A switch that'd choose between the old and new behavior might > be a good > > idea, but I'd strongly vote against "automagic" heuristics. It > makes the > > behavior very difficult to predict and I really don't want to > force the > > users to wonder whether the long delay is due to general > slowness of the > > machine or some "clever" logic that causes long delays between log > > messages. > > > > That's why I choose a very simple approach with constant time > interval. > > It does what I was aiming for (less logs) and it's easy to > predict. > > Sure, we could choose different interval (or make it an option). > > > > > > I am preferring an option for choosing an interval, say from 1 > second to > > 10 seconds. > > Ummmm, why not to allow arbitrary integer? Why saying 1 to 10 seconds? > > > Hmm.. actually, I have no issues with any number there. Just put 1..10 > as we hard-coded it 5. No particular reason as such. > > > > > BTW, what if, we put one log message every 10% (or 5%) with time taken > > (time taken for last 10% (or 5%) and cumulative) over 5 seconds ? > > This will have only 10 (or 20) lines per pgbench initialisation. > > And since we are showing time taken for each block, if any slowness > > happens, one can easily find a block by looking at the timings and > > troubleshoot it. > > Though 10% or 5% is again a debatable number, but keeping it constant > > will eliminate the requirement of an option. > > That's what I originally proposed in September (see the messages from > 17/9), and Alvaro was not relly excited about this. > > Attached is a patch with fixed whitespace / indentation errors etc. > Otherwise it's the same as the previous version. > > > OK. Looks good now. > > Any other views / suggestions are welcome. > > Thanks > > > Tomas > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org > <mailto:pgsql-hackers@postgresql.org>) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > > > > > -- > Jeevan B Chalke > Senior Software Engineer, R&D > EnterpriseDB Corporation > The Enterprise PostgreSQL Company > > Phone: +91 20 30589500 > > Website: www.enterprisedb.com <http://www.enterprisedb.com> > EnterpriseDB Blog: http://blogs.enterprisedb.com/ > Follow us on Twitter: http://www.twitter.com/enterprisedb > > This e-mail message (and any attachment) is intended for the use of the > individual or entity to whom it is addressed. This message contains > information from EnterpriseDB Corporation that may be privileged, > confidential, or exempt from disclosure under applicable law. If you are > not the intended recipient or authorized to receive this for the > intended recipient, any use, dissemination, distribution, retention, > archiving, or copying of this communication is strictly prohibited. If > you have received this e-mail in error, please notify the sender > immediately by reply e-mail and delete this message.
Attachment
On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
Hi,
attached is a new version of the patch that
(a) converts the 'log_step_seconds' variable to a constant (and does
not allow changing it using a command-line option etc.)
(b) keeps the current logging as a default
(c) adds a "-q" switch that enables the new logging with a 5-second
interval
I'm still not convinced there should be yet another know for tuning the
log interval - opinions?
It seems that you have generated a patch over your earlier version and due to that it is not cleanly applying on fresh sources.
Please generate patch on fresh sources.
However, I absolutely no issues with the design. Also code review is already done and looks good to me.
I think to move forward on this we need someone from core-team. So I am planning to change its status to "ready-for-committor".
Before that please provide updated patch for final code review.
Thanks
Tomas
On 11.12.2012 10:23, Jeevan Chalke wrote:
>
>
>
> On Sun, Dec 9, 2012 at 8:11 AM, Tomas Vondra <tv@fuzzy.cz> <mailto:tv@fuzzy.cz>> wrote:
>
> On 20.11.2012 08:22, Jeevan Chalke wrote:
> > Hi,
> >
> >
> > On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz
> <mailto:tv@fuzzy.cz>> <mailto:pgsql-hackers@postgresql.org>)> > <mailto:tv@fuzzy.cz <mailto:tv@fuzzy.cz>>> wrote:
> >
> > On 19.11.2012 11:59, Jeevan Chalke wrote:
> > > Hi,
> > >
> > > I gone through the discussion for this patch and here is my
> review:
> > >
> > > The main aim of this patch is to reduce the number of log
> lines. It is
> > > also suggested to use an options to provide the interval but
> few of us
> > > are not much agree on it. So final discussion ended at
> keeping 5 sec
> > > interval between each log line.
> > >
> > > However, I see, there are two types of users here:
> > > 1. Who likes these log lines, so that they can troubleshoot some
> > > slowness and all
> > > 2. Who do not like these log lines.
> >
> > Who likes these lines / needs them for something useful?
> >
> >
> > No idea. I fall in second category.
> >
> > But from the discussion, I believe some people may need detailed
> (or lot
> > more) output.
>
> I've read the thread again and my impression is that no one really needs
> or likes those lines, but
>
> (1) it's rather pointless to print a message every 100k rows, as it
> usually fills the console with garbabe
>
> (2) it's handy to have regular updates of the progress
>
> I don't think there're people (in the thread) that require to keep the
> current amount of log messages.
>
> But there might be users who actually use the current logs to do
> something (although I can't imagine what). If we want to do this in a
> backwards compatible way, we should probably use a new option (e.g.
> "-q") to enable the new (less verbose) logging.
>
> Do we want to allow both types of logging, or shall we keep only the new
> one? If both, which one should be the default one?
>
>
> Both the options are fine with me, but the default should be the current
> behaviour.
>
>
> > > So keeping these in mind, I rather go for an option which
> will control
> > > this. People falling in category one can set this option to
> very low
> > > where as users falling under second category can keep it high.
> >
> > So what option(s) would you expect? Something that tunes the
> interval
> > length or something else?
> >
> >
> > Interval length.
>
> Well, I can surely imagine something like "--interval N".
>
>
> +1
>
>
>
> > A switch that'd choose between the old and new behavior might
> be a good
> > idea, but I'd strongly vote against "automagic" heuristics. It
> makes the
> > behavior very difficult to predict and I really don't want to
> force the
> > users to wonder whether the long delay is due to general
> slowness of the
> > machine or some "clever" logic that causes long delays between log
> > messages.
> >
> > That's why I choose a very simple approach with constant time
> interval.
> > It does what I was aiming for (less logs) and it's easy to
> predict.
> > Sure, we could choose different interval (or make it an option).
> >
> >
> > I am preferring an option for choosing an interval, say from 1
> second to
> > 10 seconds.
>
> Ummmm, why not to allow arbitrary integer? Why saying 1 to 10 seconds?
>
>
> Hmm.. actually, I have no issues with any number there. Just put 1..10
> as we hard-coded it 5. No particular reason as such.
>
>
>
> > BTW, what if, we put one log message every 10% (or 5%) with time taken
> > (time taken for last 10% (or 5%) and cumulative) over 5 seconds ?
> > This will have only 10 (or 20) lines per pgbench initialisation.
> > And since we are showing time taken for each block, if any slowness
> > happens, one can easily find a block by looking at the timings and
> > troubleshoot it.
> > Though 10% or 5% is again a debatable number, but keeping it constant
> > will eliminate the requirement of an option.
>
> That's what I originally proposed in September (see the messages from
> 17/9), and Alvaro was not relly excited about this.
>
> Attached is a patch with fixed whitespace / indentation errors etc.
> Otherwise it's the same as the previous version.
>
>
> OK. Looks good now.
>
> Any other views / suggestions are welcome.
>
> Thanks
>
>
> Tomas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org> To make changes to your subscription:> Website: www.enterprisedb.com <http://www.enterprisedb.com>
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
>
>
> --
> Jeevan B Chalke
> Senior Software Engineer, R&D
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>
> Phone: +91 20 30589500
>> EnterpriseDB Blog: http://blogs.enterprisedb.com/
> Follow us on Twitter: http://www.twitter.com/enterprisedb
>
> This e-mail message (and any attachment) is intended for the use of the
> individual or entity to whom it is addressed. This message contains
> information from EnterpriseDB Corporation that may be privileged,
> confidential, or exempt from disclosure under applicable law. If you are
> not the intended recipient or authorized to receive this for the
> intended recipient, any use, dissemination, distribution, retention,
> archiving, or copying of this communication is strictly prohibited. If
> you have received this e-mail in error, please notify the sender
> immediately by reply e-mail and delete this message.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
On 19.12.2012 06:30, Jeevan Chalke wrote: > > > > On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra <tv@fuzzy.cz > <mailto:tv@fuzzy.cz>> wrote: > > Hi, > > attached is a new version of the patch that > > (a) converts the 'log_step_seconds' variable to a constant (and does > not allow changing it using a command-line option etc.) > > (b) keeps the current logging as a default > > (c) adds a "-q" switch that enables the new logging with a 5-second > interval > > I'm still not convinced there should be yet another know for tuning the > log interval - opinions? > > > It seems that you have generated a patch over your earlier version and > due to that it is not cleanly applying on fresh sources. > Please generate patch on fresh sources. Seems you're right - I've attached the proper patch against current master. > However, I absolutely no issues with the design. Also code review is > already done and looks good to me. > I think to move forward on this we need someone from core-team. So I am > planning to change its status to "ready-for-committor". > > Before that please provide updated patch for final code review. thanks Tomas
Attachment
Looks good to me.
Will mark "Ready for Committer"On Thu, Dec 20, 2012 at 2:30 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 19.12.2012 06:30, Jeevan Chalke wrote:
>
>
>
> On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra <tv@fuzzy.cz> <mailto:tv@fuzzy.cz>> wrote:Seems you're right - I've attached the proper patch against current master.
>
> Hi,
>
> attached is a new version of the patch that
>
> (a) converts the 'log_step_seconds' variable to a constant (and does
> not allow changing it using a command-line option etc.)
>
> (b) keeps the current logging as a default
>
> (c) adds a "-q" switch that enables the new logging with a 5-second
> interval
>
> I'm still not convinced there should be yet another know for tuning the
> log interval - opinions?
>
>
> It seems that you have generated a patch over your earlier version and
> due to that it is not cleanly applying on fresh sources.
> Please generate patch on fresh sources.thanks
> However, I absolutely no issues with the design. Also code review is
> already done and looks good to me.
> I think to move forward on this we need someone from core-team. So I am
> planning to change its status to "ready-for-committor".
>
> Before that please provide updated patch for final code review.
Tomas
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
--
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
Phone: +91 20 30589500
Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb
This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
> On 19.12.2012 06:30, Jeevan Chalke wrote: >> >> >> >> On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra <tv@fuzzy.cz >> <mailto:tv@fuzzy.cz>> wrote: >> >> Hi, >> >> attached is a new version of the patch that >> >> (a) converts the 'log_step_seconds' variable to a constant (and does >> not allow changing it using a command-line option etc.) >> >> (b) keeps the current logging as a default >> >> (c) adds a "-q" switch that enables the new logging with a 5-second >> interval >> >> I'm still not convinced there should be yet another know for tuning the >> log interval - opinions? >> >> >> It seems that you have generated a patch over your earlier version and >> due to that it is not cleanly applying on fresh sources. >> Please generate patch on fresh sources. > > Seems you're right - I've attached the proper patch against current master. > >> However, I absolutely no issues with the design. Also code review is >> already done and looks good to me. >> I think to move forward on this we need someone from core-team. So I am >> planning to change its status to "ready-for-committor". >> >> Before that please provide updated patch for final code review. As a committer, I have looked into the patch. I noticed two things: 1) In the help you put '-q' option into "Common options" section. I think this should be moved to "Initialization options" section because the option is only applied while initializing. 2) Shouldn't a long option for '-q' be provided? Something like '--quiet-progress-logging'? 3) No patches for docs found (doc/src/sgml/pgbench.sgml) -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 6.1.2013 03:03, Tatsuo Ishii wrote: > As a committer, I have looked into the patch. I noticed two things: > > 1) In the help you put '-q' option into "Common options" section. I > think this should be moved to "Initialization options" section because > the option is only applied while initializing. Good point, moved. > 2) Shouldn't a long option for '-q' be provided? Something like > '--quiet-progress-logging'? I don't think so. Currently pgbench has either short or long option, not both (for the same thing). I believe we should stick to this and either choose "-q" or "--quiet-logging" but not both. > 3) No patches for docs found (doc/src/sgml/pgbench.sgml) I've added a brief description of the "-q" option into the docs. IMHO it's enough but feel free to add some more details. There's one more thing I've just noticed - the original version of the patch simply removed the old logging, but this one keeps both old and quiet logging. But the old logging still uses this: fprintf(stderr, "%d of %d tuples (%d%%) done.\n", .... while the new logging does this fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n", i.e. it prints additional info about elapsed/estimated time. Do we want to keep it this way (i.e. not to mess with the old logging) or do we want to add these new fields to the old logging too? I suggest to add it to the old logging, to keep the log messages the same, the only difference being the logging frequency. Tomas
Attachment
> On 6.1.2013 03:03, Tatsuo Ishii wrote: >> As a committer, I have looked into the patch. I noticed two things: >> >> 1) In the help you put '-q' option into "Common options" section. I >> think this should be moved to "Initialization options" section because >> the option is only applied while initializing. > > Good point, moved. In addition to this, I'd suggest to add checking -q is only possible with -i option since without -i, -q is meaningless. >> 2) Shouldn't a long option for '-q' be provided? Something like >> '--quiet-progress-logging'? > > I don't think so. Currently pgbench has either short or long option, not > both (for the same thing). I believe we should stick to this and either > choose "-q" or "--quiet-logging" but not both. Ok. >> 3) No patches for docs found (doc/src/sgml/pgbench.sgml) > > I've added a brief description of the "-q" option into the docs. IMHO > it's enough but feel free to add some more details. Good. > There's one more thing I've just noticed - the original version of the > patch simply removed the old logging, but this one keeps both old and > quiet logging. But the old logging still uses this: > > fprintf(stderr, "%d of %d tuples (%d%%) done.\n", .... > > while the new logging does this > > fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s, > remaining %.2f s).\n", > > i.e. it prints additional info about elapsed/estimated time. Do we want > to keep it this way (i.e. not to mess with the old logging) or do we > want to add these new fields to the old logging too? > > I suggest to add it to the old logging, to keep the log messages the > same, the only difference being the logging frequency. If we do so, probably '-q' is not appropeate option name any more, since the only difference between old logging and new one is, the former is printed every 10000 lines while the latter is every 5 seconds, which is not really "quiet". What do you think? -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 6.1.2013 05:07, Tatsuo Ishii wrote: >> On 6.1.2013 03:03, Tatsuo Ishii wrote: >>> As a committer, I have looked into the patch. I noticed two things: >>> >>> 1) In the help you put '-q' option into "Common options" section. I >>> think this should be moved to "Initialization options" section because >>> the option is only applied while initializing. >> >> Good point, moved. > > In addition to this, I'd suggest to add checking -q is only possible > with -i option since without -i, -q is meaningless. Done. >> There's one more thing I've just noticed - the original version of the >> patch simply removed the old logging, but this one keeps both old and >> quiet logging. But the old logging still uses this: >> >> fprintf(stderr, "%d of %d tuples (%d%%) done.\n", .... >> >> while the new logging does this >> >> fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s, >> remaining %.2f s).\n", >> >> i.e. it prints additional info about elapsed/estimated time. Do we want >> to keep it this way (i.e. not to mess with the old logging) or do we >> want to add these new fields to the old logging too? >> >> I suggest to add it to the old logging, to keep the log messages the >> same, the only difference being the logging frequency. > > If we do so, probably '-q' is not appropeate option name any more, > since the only difference between old logging and new one is, the > former is printed every 10000 lines while the latter is every 5 > seconds, which is not really "quiet". What do you think? AFAIK the "5 second" logging is much quieter in most cases (and a bit more verbose when the initialization gets very slower), so I think the "quiet" logging is not a bad match although maybe there's a better name. This change (adding the elapsed/remaining fields to the original loggin) would be consistent with this name, because considering a single line, the "-q" is more verbose right now. So I'd stick with the "-q" option and added the fields to the original logging. But I'm not opposing a different name, I just can't think of a better one. Tomas
Attachment
>> If we do so, probably '-q' is not appropeate option name any more, >> since the only difference between old logging and new one is, the >> former is printed every 10000 lines while the latter is every 5 >> seconds, which is not really "quiet". What do you think? > > AFAIK the "5 second" logging is much quieter in most cases (and a bit > more verbose when the initialization gets very slower), so I think the > "quiet" logging is not a bad match although maybe there's a better name. > > This change (adding the elapsed/remaining fields to the original loggin) > would be consistent with this name, because considering a single line, > the "-q" is more verbose right now. > > So I'd stick with the "-q" option and added the fields to the original > logging. But I'm not opposing a different name, I just can't think of a > better one. Ok, I'm with you ("-q" and along with adding the elapsed/remaining fields to the original logging). -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp
On 6.1.2013 10:35, Tatsuo Ishii wrote: >>> If we do so, probably '-q' is not appropeate option name any more, >>> since the only difference between old logging and new one is, the >>> former is printed every 10000 lines while the latter is every 5 >>> seconds, which is not really "quiet". What do you think? >> >> AFAIK the "5 second" logging is much quieter in most cases (and a bit >> more verbose when the initialization gets very slower), so I think the >> "quiet" logging is not a bad match although maybe there's a better name. >> >> This change (adding the elapsed/remaining fields to the original loggin) >> would be consistent with this name, because considering a single line, >> the "-q" is more verbose right now. >> >> So I'd stick with the "-q" option and added the fields to the original >> logging. But I'm not opposing a different name, I just can't think of a >> better one. > > Ok, I'm with you ("-q" and along with adding the elapsed/remaining > fields to the original logging). Great, attached is a patch that does that. Tomas
Attachment
>>> AFAIK the "5 second" logging is much quieter in most cases (and a bit >>> more verbose when the initialization gets very slower), so I think the >>> "quiet" logging is not a bad match although maybe there's a better name. >>> >>> This change (adding the elapsed/remaining fields to the original loggin) >>> would be consistent with this name, because considering a single line, >>> the "-q" is more verbose right now. >>> >>> So I'd stick with the "-q" option and added the fields to the original >>> logging. But I'm not opposing a different name, I just can't think of a >>> better one. >> >> Ok, I'm with you ("-q" and along with adding the elapsed/remaining >> fields to the original logging). > > Great, attached is a patch that does that. Committed. Thanks. -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp