Re: [HACKERS] pg_basebackup --progress output for batch execution - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] pg_basebackup --progress output for batch execution
Date
Msg-id 8bdb8be9-d6c0-ee5f-3641-be165169cdee@2ndquadrant.com
Whole thread Raw
In response to Re: [HACKERS] pg_basebackup --progress output for batch execution  (Martín Marqués <martin@2ndquadrant.com>)
List pgsql-hackers

On 11/10/2017 02:32 PM, Martín Marqués wrote:
> Hi,
> 
> Thanks for having a look at this patch.
> 
> 2017-11-09 20:55 GMT-03:00 Jeff Janes <jeff.janes@gmail.com>:
>> On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques
>> <martin.marques@2ndquadrant.com> wrote:
>>>
>>> Hi,
>>>
>>> Some time ago I had to work on a system where I was cloning a standby
>>> using pg_basebackup, that didn't have screen or tmux. For that reason I
>>> redirected the output to a file and ran it with nohup.
>>>
>>> I normally (always actually ;) ) run pg_basebackup with --progress and
>>> --verbose so I can follow how much has been done. When done on a tty you
>>> get a nice progress bar with the percentage that has been cloned.
>>>
>>> The problem came with the execution and redirection of the output, as
>>> the --progress option will write a *very* long line!
>>>
>>> Back then I thought of writing a patch (actually someone suggested I do
>>> so) to add a --batch-mode option which would change the behavior
>>> pg_basebackup has when printing the output messages.
>>
>>
>>
>> While separate lines in the output file is better than one very long line,
>> it still doesn't seem so useful.  If you aren't watching it in real time,
>> then you really need to have a timestamp on each line so that you can
>> interpret the output.  The lines are about one second apart, but I don't
>> know robust that timing is under all conditions.
> 
> I kind of disagree with your view here.
> 
> If the cloning process takes many hours to complete (in my case, it
> was around 12 hours IIRC) you might want to peak at the log every now
> and then with tail.
> 
> I do agree on adding a timestamp prefix to each line, as it's not
> clear from the code how often progress_report() is called.
> 
>> I think I agree with Arthur that I'd rather have the decision made by
>> inspecting whether output is going to a tty, rather than by adding another
>> command line option.  But maybe that is not detected robustly enough across
>> all platforms and circumstances?
> 
> In this case, Arthur's idea is good, but would make the patch less 
> generic/flexible for the end user.
> 

I'm not quite sure about that. We're not getting the flexibility for
free, but for additional complexity (additional command-line option).
And what valuable capability would we (or the users) loose, really, by
relying on the isatty call?

So what use case is possible with --batch-mode but not with isatty (e.g.
when combined with tee)?

>
> That's why I tried to reproduce what top does when executed with -b 
> (Batch mode operation). There, it's the end user who decides how the 
> output is formatted (well, saying it decides on formatting a bit of
> an overstatement, but you get it ;) )
> 

In 'top' a batch mode also disables input, and runs for a certain number
of interactions (as determined by '-n' option). That's not the case in
pg_basebackup, where it only adds the extra newline.

>
> An example where using isatty() might fail is if you run 
> pg_basebackup from a tty but redirect the output to a file, I
> believe that in that case isatty() will return true, but it's very
> likely that the user might want batch mode output.
> 

IMHO that should work correctly, as already explained by Arthur.

>
> But maybe we should also add Arthurs idea anyway (when not in batch 
> mode), as it seems pretty lame to output progress in one line if you 
> are not in a tty.
> 

I'd say to just use isatty, unless we can come up with a compelling use
case that is not satisfied by it.

And if we end up using --batch-mode, perhaps we should only allow it
when --progress is used. It's the only thing it affects, and it makes no
difference when used without it.

Furthermore, I propose to reword this:
 <para>  Runs <command>pg_basebackup</command> in batch mode. This is useful  if the output is to be pipped so the
otherend of the pipe reads each  line. </para> <para>  Using this option with <option>--progress</option> will result
in printing each progress output with a newline at the end, instead of a  carrige return. </para>
 

like this:
 <para>  Runs <command>pg_basebackup</command> in batch mode, in which  <option>--progress</option> terminates the
lineswith a regular  newline instead of carriage return. </para> <para>  This is useful if the output is redirected to
afile or a pipe,  instead of a plain console. </para>
 

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [HACKERS] new function for tsquery creartion
Next
From: David Fetter
Date:
Subject: Re: percentile value check can be slow