Re: too much pgbench init output - Mailing list pgsql-hackers

From	Jeevan Chalke
Subject	Re: too much pgbench init output
Date	November 19, 2012 14:00:17
Msg-id	CAM2+6=U5PSV3ge2i1qqH65b35afV4Bq7WXN05_PYSyAiUJ7w7A@mail.gmail.com
In response to	Re: too much pgbench init output (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: too much pgbench init output (Tomas Vondra <tv@fuzzy.cz>)
List	pgsql-hackers
Hi,

I have gone through the discussion for this patch and here is my review:

The main aim of this patch is to reduce the number of log lines. It was
also suggested to use an option to provide the interval, but a few of us
do not much agree with that. So the final discussion ended at keeping a
5 sec interval between each log line.

However, I see there are two types of users here:
1. Those who like these log lines, so that they can troubleshoot some
   slowness and all
2. Those who do not like these log lines.

So keeping these in mind, I would rather go for an option which will
control this. People falling in category one can set this option very
low, whereas users falling under the second category can keep it high.

However, assuming we settled on a 5 sec delay, here are a few comments
on the attached patch:

Comments:
=========

The patch applies cleanly, with some whitespace errors. make and make
install went smoothly too. make check was smooth as well. Rather, it
should be smooth, since there are NO changes in any part of the code
other than pgbench.c, and we do not have any test case for it either.

However, here are a few comments on the changes in pgbench.c:

1.
Since the final discussion ended at "keeping a 5 second interval will be
good enough", the author used a global int variable for that.
Given that it's just a constant, a #define would be a better choice.

2.
+        /* let's not call the timing for each row, but only each 100 rows */
Why only 100 rows? Have you done any testing to come up with the number
100? To me it seems very low. It would be good to test with 1K or even
10K.
On my machine (2.4 GHz Intel Core 2 Duo MacBook Pro, running Ubuntu in a
VM with 4GB RAM, 1067 DDR3), approx 1M rows were inserted in 5 sec. So
checking every 100 rows looks like overkill.

3.
Please indent the following block as per the indentation just above it:

    /* used to track elapsed time and estimate of the remaining time */
    instr_time    start, diff;
    double elapsed_sec, remaining_sec;
    int log_interval = 1;

4.
+            /* have ve reached the next interval? */
Do you mean "have WE reached..."?

5.
While applying the patch, I got a few whitespace errors. But I think
every patch goes through pgindent, which might take care of this.

Thanks

On Sun, Nov 11, 2012 at 11:02 PM, Tomas Vondra <tv@fuzzy.cz> wrote:
> On 23.10.2012 18:21, Robert Haas wrote:
> > On Tue, Oct 23, 2012 at 12:02 PM, Alvaro Herrera
> > <alvherre@2ndquadrant.com> wrote:
> >> Tomas Vondra wrote:
> >>
> >>> I've been thinking about this a bit more, and do propose to use an
> >>> option that determines "logging step" i.e. number of items (either
> >>> directly or as a percentage) between log lines.
> >>>
> >>> The attached patch defines a new option "--logging-step" that accepts
> >>> either integers or percents. For example if you want to print a line
> >>> each 1000 lines, you can do this
> >>>
> >>>   $ pgbench -i -s 1000 --logging-step 1000 testdb
> >>
> >> I find it hard to get excited about having to specify a command line
> >> argument to tweak this.  Would it work to have it emit messages
> >> depending on elapsed time and log scale of tuples emitted?  So for
> >> example emit the first message after 5 seconds or 100k tuples, then back
> >> off until (say) 15 seconds have lapsed and 1M tuples, etc?  The idea is
> >> to make it verbose enough to keep a human satisfied with what he sees,
> >> but not flood the terminal with pointless updates.  (I think printing
> >> the ETA might be nice as well, not sure).
> >
> > I like this idea.  One of the times when the more verbose output is
> > really useful is when you expect it to run fast but then it turns out
> > that for some reason it runs really slow.  If you make the output too
> > terse, then you end up not really knowing what's going on.  Having it
> > give an update at least every 5 seconds would be a nice way to give
> > the user a heads-up if things aren't going as planned, without
> > cluttering the normal case.
>
> I've prepared a patch along these lines. The attached version used only
> elapsed time to print the log messages each 5 seconds, so now it prints
> a message each 5 seconds no matter what, along with an estimate of
> remaining time.
>
> I've removed the config option, although it might be useful to specify
> the interval?
>
> I'm not entirely sure how the 'log scale of tuples' should work - for
> example when the time 15 seconds limit is reached, should it be reset
> back to the previous step (5 seconds) to give a more detailed info, or
> should it be kept at 15 seconds?
>
> Tomas

-- 
Jeevan B Chalke
Senior Software Engineer, R&D
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: +91 20 30589500
Website: www.enterprisedb.com
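Review comments 1 and 2 in the message above can be illustrated with a
minimal sketch in C. This is not the code from the patch: the names
LOG_INTERVAL_SEC, TIME_CHECK_ROWS, need_time_check, interval_reached and
estimate_remaining_sec are hypothetical, and the value 10000 is only one
of the candidates suggested in the review, not a measured optimum. The
sketch assumes the ETA is a simple linear extrapolation from the rows
loaded so far.

```c
#include <stdio.h>

/* Comment 1: a fixed interval can be a compile-time constant. */
#define LOG_INTERVAL_SEC 5      /* seconds between progress lines */

/* Comment 2: hypothetical value; read the clock only every N rows. */
#define TIME_CHECK_ROWS  10000

/* Returns 1 if the clock should be read after this many rows. */
static int
need_time_check(long rows_done)
{
    return rows_done % TIME_CHECK_ROWS == 0;
}

/*
 * Returns 1 if elapsed_sec has reached *next_log_sec, and then advances
 * *next_log_sec to the next multiple of LOG_INTERVAL_SEC beyond
 * elapsed_sec, so that a slow batch does not trigger a burst of lines.
 */
static int
interval_reached(double elapsed_sec, int *next_log_sec)
{
    if (elapsed_sec < (double) *next_log_sec)
        return 0;
    *next_log_sec = ((int) elapsed_sec / LOG_INTERVAL_SEC + 1) * LOG_INTERVAL_SEC;
    return 1;
}

/* Linear estimate of the remaining time; 0 if nothing is loaded yet. */
static double
estimate_remaining_sec(long rows_done, long rows_total, double elapsed_sec)
{
    if (rows_done <= 0)
        return 0.0;
    return elapsed_sec * (double) (rows_total - rows_done) / (double) rows_done;
}
```

In a loading loop, the caller would read the clock only when
need_time_check() is true, and print a progress line (with the value of
estimate_remaining_sec()) only when interval_reached() also returns 1.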