Re: too much pgbench init output - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: too much pgbench init output
Date
Msg-id 50C3FA3C.2020105@fuzzy.cz
In response to Re: too much pgbench init output  (Jeevan Chalke <jeevan.chalke@enterprisedb.com>)
Responses Re: too much pgbench init output
List pgsql-hackers
On 20.11.2012 08:22, Jeevan Chalke wrote:
> Hi,
>
>
> On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>
>     On 19.11.2012 11:59, Jeevan Chalke wrote:
>     > Hi,
>     >
>     > I have gone through the discussion for this patch and here is my review:
>     >
>     > The main aim of this patch is to reduce the number of log lines. It was
>     > also suggested to use an option to provide the interval, but a few of us
>     > did not agree on it. So the discussion ended at keeping a 5 sec
>     > interval between log lines.
>     >
>     > However, I see, there are two types of users here:
>     > 1. Those who like these log lines, so that they can troubleshoot
>     > slowness and the like
>     > 2. Those who do not like these log lines.
>
>     Who likes these lines / needs them for something useful?
>
>
> No idea. I fall in the second category.
>
> But from the discussion, I believe some people may need detailed (or a lot
> more) output.

I've read the thread again and my impression is that no one really needs
or likes those lines, but

  (1) it's rather pointless to print a message every 100k rows, as it
      usually fills the console with garbage

  (2) it's handy to have regular updates of the progress

I don't think anyone (in the thread) requires keeping the current
amount of log messages.

But there might be users who actually use the current logs to do
something (although I can't imagine what). If we want to do this in a
backwards compatible way, we should probably use a new option (e.g.
"-q") to enable the new (less verbose) logging.

Do we want to allow both types of logging, or shall we keep only the new
one? If both, which one should be the default one?

>     > So keeping these in mind, I would rather go for an option which will control
>     > this. People falling in category one can set this option very low,
>     > whereas users falling under the second category can keep it high.
>
>     So what option(s) would you expect? Something that tunes the interval
>     length or something else?
>
>
> Interval length.

Well, I can surely imagine something like "--interval N".
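
To make the idea concrete, here is a minimal sketch of how a fixed-interval
check could look. It is not the attached patch: the ProgressLog struct and the
progress_init / progress_due helpers are hypothetical names, and the caller is
assumed to pass in the elapsed time in whole seconds from whatever clock it
uses. The interval would come from the (so far only imagined) "--interval N"
option.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state for interval-based progress logging. */
typedef struct ProgressLog
{
    int64_t interval_sec;   /* seconds between messages (e.g. N from "--interval N") */
    int64_t next_report;    /* elapsed second at which the next message is due */
} ProgressLog;

static void
progress_init(ProgressLog *p, int64_t interval_sec)
{
    assert(interval_sec > 0);
    p->interval_sec = interval_sec;
    p->next_report = interval_sec;
}

/*
 * Returns true when a progress message is due at elapsed_sec (seconds since
 * the start of the data load). If the load stalled for several intervals,
 * the missed ones are skipped, so only a single message is printed.
 */
static bool
progress_due(ProgressLog *p, int64_t elapsed_sec)
{
    if (elapsed_sec < p->next_report)
        return false;

    /* schedule the next message at the first interval boundary after now */
    p->next_report = (elapsed_sec / p->interval_sec + 1) * p->interval_sec;
    return true;
}
```

The row loop would call progress_due() once per row (or once per batch of rows)
and print a line only when it returns true, so the amount of output depends on
elapsed time and not on the number of rows.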

>     A switch that'd choose between the old and new behavior might be a good
>     idea, but I'd strongly vote against "automagic" heuristics. It makes the
>     behavior very difficult to predict and I really don't want to force the
>     users to wonder whether the long delay is due to general slowness of the
>     machine or some "clever" logic that causes long delays between log
>     messages.
>
>     That's why I chose a very simple approach with a constant time interval.
>     It does what I was aiming for (fewer logs) and it's easy to predict.
>     Sure, we could choose a different interval (or make it an option).
>
>
> I would prefer an option for choosing an interval, say from 1 second to
> 10 seconds.

Ummmm, why not allow an arbitrary integer? Why restrict it to 1 to 10 seconds?

> BTW, what if we put one log message every 10% (or 5%) with time taken
> (time taken for the last 10% (or 5%) and cumulative) over 5 seconds?
> This will have only 10 (or 20) lines per pgbench initialisation.
> And since we are showing time taken for each block, if any slowness
> happens, one can easily find a block by looking at the timings and
> troubleshoot it.
> Though 10% or 5% is again a debatable number, keeping it constant
> will eliminate the requirement of an option.

That's what I originally proposed in September (see the messages from
17/9), and Alvaro was not really excited about this.
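
For comparison, the percentage-based variant discussed above could be sketched
roughly like this. Again this is only an illustration, not code from any
version of the patch; percent_boundary is a hypothetical helper and the 1-based
row counter is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when row number "done" (1-based, out of "total") is the first
 * row that reaches a new multiple of step_pct percent. With step_pct = 10
 * and a sufficiently large table this fires 10 times per load, with
 * step_pct = 5 it fires 20 times.
 */
static bool
percent_boundary(int64_t done, int64_t total, int step_pct)
{
    assert(total > 0 && done >= 1 && done <= total);
    assert(step_pct > 0 && step_pct <= 100);

    /* number of completed steps after this row and after the previous one */
    int64_t cur = (done * 100 / total) / step_pct;
    int64_t prev = ((done - 1) * 100 / total) / step_pct;

    return cur > prev;
}
```

The number of lines is bounded regardless of the scale, but the delay between
two lines then depends entirely on the speed of the machine, which is the
trade-off against the constant time interval.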

Attached is a patch with fixed whitespace / indentation errors etc.
Otherwise it's the same as the previous version.

Tomas

Attachment
