Thread: [HACKERS] pg_basebackup --progress output for batch execution

[HACKERS] pg_basebackup --progress output for batch execution

From
Martin Marques
Date:
Hi,

Some time ago I had to work on a system where I was cloning a standby
using pg_basebackup, that didn't have screen or tmux. For that reason I
redirected the output to a file and ran it with nohup.

I normally (always actually ;) ) run pg_basebackup with --progress and
--verbose so I can follow how much has been done. When done on a tty you
get a nice progress bar with the percentage that has been cloned.

The problem came with the execution and redirection of the output, as
the --progress option will write a *very* long line!

Back then I thought of writing a patch (actually someone suggested I do
so) to add a --batch-mode option which would change the behavior
pg_basebackup has when printing the output messages.

Attach is the patch. I'll be submitting it to the CF.

P.D.: I'm aware that there's a documentation patch missing. :)

Kind regards,

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martin Marques
Date:
Updated patch with documentation of the new option.


-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Arthur Zakirov
Date:
Hello,

On Sun, Oct 01, 2017 at 04:49:17PM -0300, Martin Marques wrote:
> Updated patch with documentation of the new option.
> 

I have checked the patch.
The patch is applied and compiled correctly without any errors. Tests passed.
The documentation doesn't have errors too.


I have a little suggestion. Maybe insert new line without any additional parameters? We can check that stderr is not
terminalusing isatty().
 

The code could become:

if (!isatty(fileno(stderr)))fprintf(stderr, "\n");
elsefprintf(stderr, "\r");

Also it could be good to not insert new line after progress:

if (showprogress)
{progress_report(PQntuples(res), NULL, true);/* if (!batchmode) *//* or */if (isatty(fileno(stderr)))
fprintf(stderr,"\n");    /* Need to move to next line */
 
}

Otherwise we have an extra empty line:

pg_basebackup: initiating base backup, waiting for checkpoint to complete
pg_basebackup: checkpoint completed
pg_basebackup: write-ahead log start point: 0/1C000028 on timeline 1
pg_basebackup: starting background WAL receiver
pg_basebackup: created temporary replication slot "pg_basebackup_19635"    0/183345 kB (0%), 0/1 tablespace
(data_repl/backup_label            )
 
183355/183355 kB (100%), 0/1 tablespace (data_repl/global/pg_control        )
183355/183355 kB (100%), 1/1 tablespace

pg_basebackup: write-ahead log end point: 0/1C0000F8
pg_basebackup: waiting for background process to finish streaming ...
pg_basebackup: base backup completed

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Jeff Janes
Date:
On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques <martin.marques@2ndquadrant.com> wrote:
Hi,

Some time ago I had to work on a system where I was cloning a standby
using pg_basebackup, that didn't have screen or tmux. For that reason I
redirected the output to a file and ran it with nohup.

I normally (always actually ;) ) run pg_basebackup with --progress and
--verbose so I can follow how much has been done. When done on a tty you
get a nice progress bar with the percentage that has been cloned.

The problem came with the execution and redirection of the output, as
the --progress option will write a *very* long line!

Back then I thought of writing a patch (actually someone suggested I do
so) to add a --batch-mode option which would change the behavior
pg_basebackup has when printing the output messages.


While separate lines in the output file is better than one very long line, it still doesn't seem so useful.  If you aren't watching it in real time, then you really need to have a timestamp on each line so that you can interpret the output.  The lines are about one second apart, but I don't know robust that timing is under all conditions.

I think I agree with Arthur that I'd rather have the decision made by inspecting whether output is going to a tty, rather than by adding another command line option.  But maybe that is not detected robustly enough across all platforms and circumstances?

Cheers,

Jeff

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Arthur Zakirov
Date:
On Thu, Nov 09, 2017 at 03:55:36PM -0800, Jeff Janes wrote:
> 
> I think I agree with Arthur that I'd rather have the decision made by
> inspecting whether output is going to a tty, rather than by adding another
> command line option.  But maybe that is not detected robustly enough across
> all platforms and circumstances?
> 

isatty() is used within Postgres code already (for example, pg_upgrade/util.c).
However, it seems that on Windows isatty() is deprecated and it is recommended to use _isatty(). Moreover, on Windows
itcan give false positive result [1], if I'm not mistaken:
 

> The _isatty function determines whether fd is associated with a character device (a terminal, console, printer, or
serialport).
 

1 - https://msdn.microsoft.com/en-us/library/f4s0ddew(v=vs.140).aspx

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martín Marqués
Date:
Hi,

Thanks for having a look at this patch.

2017-11-09 20:55 GMT-03:00 Jeff Janes <jeff.janes@gmail.com>:
> On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques
> <martin.marques@2ndquadrant.com> wrote:
>>
>> Hi,
>>
>> Some time ago I had to work on a system where I was cloning a standby
>> using pg_basebackup, that didn't have screen or tmux. For that reason I
>> redirected the output to a file and ran it with nohup.
>>
>> I normally (always actually ;) ) run pg_basebackup with --progress and
>> --verbose so I can follow how much has been done. When done on a tty you
>> get a nice progress bar with the percentage that has been cloned.
>>
>> The problem came with the execution and redirection of the output, as
>> the --progress option will write a *very* long line!
>>
>> Back then I thought of writing a patch (actually someone suggested I do
>> so) to add a --batch-mode option which would change the behavior
>> pg_basebackup has when printing the output messages.
>
>
>
> While separate lines in the output file is better than one very long line,
> it still doesn't seem so useful.  If you aren't watching it in real time,
> then you really need to have a timestamp on each line so that you can
> interpret the output.  The lines are about one second apart, but I don't
> know robust that timing is under all conditions.

I kind of disagree with your view here.

If the cloning process takes many hours to complete (in my case, it
was around 12 hours IIRC) you might want to peak at the log every now
and then with tail.

I do agree on adding a timestamp prefix to each line, as it's not
clear from the code how often progress_report() is called.

> I think I agree with Arthur that I'd rather have the decision made by
> inspecting whether output is going to a tty, rather than by adding another
> command line option.  But maybe that is not detected robustly enough across
> all platforms and circumstances?

In this case, Arthur's idea is good, but would make the patch less
generic/flexible for the end user.

That's why I tried to reproduce what top does when executed with -b
(Batch mode operation). There, it's the end user who decides how the
output is formatted (well, saying it decides on formatting a bit of an
overstatement, but you get it ;) )

An example where using isatty() might fail is if you run pg_basebackup
from a tty but redirect the output to a file, I believe that in that
case isatty() will return true, but it's very likely that the user
might want batch mode output.

But maybe we should also add Arthurs idea anyway (when not in batch
mode), as it seems pretty lame to output progress in one line if you
are not in a tty.

Thoughts?

--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Arthur Zakirov
Date:
On Fri, Nov 10, 2017 at 10:32:23AM -0300, Martín Marqués wrote:
> An example where using isatty() might fail is if you run pg_basebackup
> from a tty but redirect the output to a file, I believe that in that
> case isatty() will return true, but it's very likely that the user
> might want batch mode output.

Sorry if I misunderstood you. I think this can happen if you redirect only standard output (stdout) to a file.
But pg_basebackup writes messages to stderr. So you need redirect stderr too:

pg_basebackup -D data -X stream -R --progress --verbose &> backup

or

pg_basebackup -D data_repl -X stream -R --progress --verbose > backup 2>&1

If you redirect stderr too then isatty() will know that message output is not tty.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Tomas Vondra
Date:

On 11/10/2017 02:32 PM, Martín Marqués wrote:
> Hi,
> 
> Thanks for having a look at this patch.
> 
> 2017-11-09 20:55 GMT-03:00 Jeff Janes <jeff.janes@gmail.com>:
>> On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques
>> <martin.marques@2ndquadrant.com> wrote:
>>>
>>> Hi,
>>>
>>> Some time ago I had to work on a system where I was cloning a standby
>>> using pg_basebackup, that didn't have screen or tmux. For that reason I
>>> redirected the output to a file and ran it with nohup.
>>>
>>> I normally (always actually ;) ) run pg_basebackup with --progress and
>>> --verbose so I can follow how much has been done. When done on a tty you
>>> get a nice progress bar with the percentage that has been cloned.
>>>
>>> The problem came with the execution and redirection of the output, as
>>> the --progress option will write a *very* long line!
>>>
>>> Back then I thought of writing a patch (actually someone suggested I do
>>> so) to add a --batch-mode option which would change the behavior
>>> pg_basebackup has when printing the output messages.
>>
>>
>>
>> While separate lines in the output file is better than one very long line,
>> it still doesn't seem so useful.  If you aren't watching it in real time,
>> then you really need to have a timestamp on each line so that you can
>> interpret the output.  The lines are about one second apart, but I don't
>> know robust that timing is under all conditions.
> 
> I kind of disagree with your view here.
> 
> If the cloning process takes many hours to complete (in my case, it
> was around 12 hours IIRC) you might want to peak at the log every now
> and then with tail.
> 
> I do agree on adding a timestamp prefix to each line, as it's not
> clear from the code how often progress_report() is called.
> 
>> I think I agree with Arthur that I'd rather have the decision made by
>> inspecting whether output is going to a tty, rather than by adding another
>> command line option.  But maybe that is not detected robustly enough across
>> all platforms and circumstances?
> 
> In this case, Arthur's idea is good, but would make the patch less 
> generic/flexible for the end user.
> 

I'm not quite sure about that. We're not getting the flexibility for
free, but for additional complexity (additional command-line option).
And what valuable capability would we (or the users) loose, really, by
relying on the isatty call?

So what use case is possible with --batch-mode but not with isatty (e.g.
when combined with tee)?

>
> That's why I tried to reproduce what top does when executed with -b 
> (Batch mode operation). There, it's the end user who decides how the 
> output is formatted (well, saying it decides on formatting a bit of
> an overstatement, but you get it ;) )
> 

In 'top' a batch mode also disables input, and runs for a certain number
of interactions (as determined by '-n' option). That's not the case in
pg_basebackup, where it only adds the extra newline.

>
> An example where using isatty() might fail is if you run 
> pg_basebackup from a tty but redirect the output to a file, I
> believe that in that case isatty() will return true, but it's very
> likely that the user might want batch mode output.
> 

IMHO that should work correctly, as already explained by Arthur.

>
> But maybe we should also add Arthurs idea anyway (when not in batch 
> mode), as it seems pretty lame to output progress in one line if you 
> are not in a tty.
> 

I'd say to just use isatty, unless we can come up with a compelling use
case that is not satisfied by it.

And if we end up using --batch-mode, perhaps we should only allow it
when --progress is used. It's the only thing it affects, and it makes no
difference when used without it.

Furthermore, I propose to reword this:
 <para>  Runs <command>pg_basebackup</command> in batch mode. This is useful  if the output is to be pipped so the
otherend of the pipe reads each  line. </para> <para>  Using this option with <option>--progress</option> will result
in printing each progress output with a newline at the end, instead of a  carrige return. </para>
 

like this:
 <para>  Runs <command>pg_basebackup</command> in batch mode, in which  <option>--progress</option> terminates the
lineswith a regular  newline instead of carriage return. </para> <para>  This is useful if the output is redirected to
afile or a pipe,  instead of a plain console. </para>
 

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martín Marqués
Date:
El 14/11/17 a las 06:25, Arthur Zakirov escribió:
> On Fri, Nov 10, 2017 at 10:32:23AM -0300, Martín Marqués wrote:
>> An example where using isatty() might fail is if you run pg_basebackup
>> from a tty but redirect the output to a file, I believe that in that
>> case isatty() will return true, but it's very likely that the user
>> might want batch mode output.
> 
> Sorry if I misunderstood you. I think this can happen if you redirect only standard output (stdout) to a file.

Don't be sorry. It was me who misunderstood how isatty() worked. Shame
on me. :(

I will iterate again with a new patch which has your ideas from before.

P.D.: But I'm going to first open a new thread on WIN32 compatibility of
isatty()

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martín Marqués
Date:
El 09/11/17 a las 09:29, Arthur Zakirov escribió:
> Hello,
> 
> On Sun, Oct 01, 2017 at 04:49:17PM -0300, Martin Marques wrote:
>> Updated patch with documentation of the new option.
>>
> 
> I have checked the patch.
> The patch is applied and compiled correctly without any errors. Tests passed.
> The documentation doesn't have errors too.
> 
> 
> I have a little suggestion. Maybe insert new line without any additional parameters? We can check that stderr is not
terminalusing isatty().
 
> 
> The code could become:
> 
> if (!isatty(fileno(stderr)))
>     fprintf(stderr, "\n");
> else
>     fprintf(stderr, "\r");
> 
> Also it could be good to not insert new line after progress:
> 
> if (showprogress)
> {
>     progress_report(PQntuples(res), NULL, true);
>     /* if (!batchmode) */
>     /* or */
>     if (isatty(fileno(stderr)))
>         fprintf(stderr, "\n");    /* Need to move to next line */
> }


New version of patch, without the --batch-mode option and using isatty()

I send in a separate thread a proposal to make isatty() be defined as
_isatty() in windows.


-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Arthur Zakirov
Date:
On Mon, Nov 20, 2017 at 04:45:48PM -0300, Martín Marqués wrote:
> New version of patch, without the --batch-mode option and using isatty()
> 

Great!

> +        fprintf(stderr, "waiting for checkpoint");
> +        if (isatty(fileno(stderr)))
> +            fprintf(stderr, "\n");
> +        else
> +            fprintf(stderr, "\r");

Here the condition should be inverted I think. The next condition should be used:

> if (!isatty(fileno(stderr)))
> ...

Otherwise pg_basebackup will insert '\n' in terminal instead '\r'.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martín Marqués
Date:
El 21/11/17 a las 04:56, Arthur Zakirov escribió:
> On Mon, Nov 20, 2017 at 04:45:48PM -0300, Martín Marqués wrote:
>> New version of patch, without the --batch-mode option and using isatty()
>>
> 
> Great!
> 
>> +        fprintf(stderr, "waiting for checkpoint");
>> +        if (isatty(fileno(stderr)))
>> +            fprintf(stderr, "\n");
>> +        else
>> +            fprintf(stderr, "\r");
> 
> Here the condition should be inverted I think. The next condition should be used:
> 
>> if (!isatty(fileno(stderr)))
>> ...
> 
> Otherwise pg_basebackup will insert '\n' in terminal instead '\r'.

Ups! Attached the corrected version.:)

Nice catch. I completely missed that. Thanks.

-- 
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment

Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Arthur Zakirov
Date:
On Tue, Nov 21, 2017 at 07:16:46AM -0300, Martín Marqués wrote:
> 
> Ups! Attached the corrected version.:)
> 

Thank you for the new version.

The patch applies via 'patch' command without warnings and errors, tests passed. Marked the patch as "Ready for
Commiter".

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Martín Marqués
Date:
Arthur,

Thank you very much for reviewing the patch and for your valuable
input (you made me read the Microsoft Visual C specs ;))

Regards,

2017-11-21 8:11 GMT-03:00 Arthur Zakirov <a.zakirov@postgrespro.ru>:
> On Tue, Nov 21, 2017 at 07:16:46AM -0300, Martín Marqués wrote:
>>
>> Ups! Attached the corrected version.:)
>>
>
> Thank you for the new version.
>
> The patch applies via 'patch' command without warnings and errors, tests passed. Marked the patch as "Ready for
Commiter".
>
> --
> Arthur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company



--
Martín Marqués                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_basebackup --progress output for batch execution

From
Peter Eisentraut
Date:
On 11/21/17 05:16, Martín Marqués wrote:
> El 21/11/17 a las 04:56, Arthur Zakirov escribió:
>> On Mon, Nov 20, 2017 at 04:45:48PM -0300, Martín Marqués wrote:
>>> New version of patch, without the --batch-mode option and using isatty()
>>>
>>
>> Great!
>>
>>> +        fprintf(stderr, "waiting for checkpoint");
>>> +        if (isatty(fileno(stderr)))
>>> +            fprintf(stderr, "\n");
>>> +        else
>>> +            fprintf(stderr, "\r");
>>
>> Here the condition should be inverted I think. The next condition should be used:
>>
>>> if (!isatty(fileno(stderr)))
>>> ...
>>
>> Otherwise pg_basebackup will insert '\n' in terminal instead '\r'.
> 
> Ups! Attached the corrected version.:)
> 
> Nice catch. I completely missed that. Thanks.

Committed.

I switched the if logic around even more, so that it is

!       if (isatty(fileno(stderr)))
!               fprintf(stderr, "\r");
!       else
!               fprintf(stderr, "\n");

instead of

!       if (!isatty(fileno(stderr)))
!               fprintf(stderr, "\n");
!       else
!               fprintf(stderr, "\r");


-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services