Thread: Progress reporting for pg_verify_checksums

Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

my colleague Bernd Helmle recently added progress reporting to our
pg_checksums application[1]. I have now forward ported this to
pg_verify_checksums for the September commitfest, please see the
attached patch.

Here's the description:

This optionally prints the progress of pg_verify_checksums via read
kilobytes to the terminal with the new command line argument -P.

If -P was forgotten and pg_verify_checksums operates on a large cluster,
the caller can send SIGUSR1 to pg_verify_checksums to turn progress
status reporting on during runtime.


Michael

[1] https://github.com/credativ/pg_checksums/commit/2b691
-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

> my colleague Bernd Helmle recently added progress reporting to our
> pg_checksums application[1]. I have now forward ported this to
> pg_verify_checksums for the September commitfest, please see the
> attached patch.

It seems that the patch does not apply cleanly on master, neither with 
"git apply" nor "patch". Could you rebase it?

> Here's the description:
>
> This optionally prints the progress of pg_verify_checksums via read
> kilobytes

MB? GB?

> to the terminal with the new command line argument -P.
>
> If -P was forgotten and pg_verify_checksums operates on a large cluster,
> the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> status reporting on during runtime.

Hmmm. I cannot say I like the signal feature much. Would it make sense for 
the progress to be on by default, and to have a quiet option instead?

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Alvaro Herrera
Date:
On 2018-Sep-01, Fabien COELHO wrote:

> > If -P was forgotten and pg_verify_checksums operates on a large cluster,
> > the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> > status reporting on during runtime.
> 
> Hmmm. I cannot say I like the signal feature much. Would it make sense for
> the progress to be on by default, and to have a quiet option instead?

Hmm, I recall this technique being used elsewhere and is sometimes
useful.  Can't remember where though -- by manpages, it's not rsync nor
pv ...

How about making it a toggle?  Default off, enable-able by option,
toggleable by signal.  (If you enable it via the signal, what's the rate
to report at?)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

On Mon, Sep 03, 2018 at 11:21:32AM -0300, Alvaro Herrera wrote:
> On 2018-Sep-01, Fabien COELHO wrote:
> > > If -P was forgotten and pg_verify_checksums operates on a large cluster,
> > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> > > status reporting on during runtime.
> > 
> > Hmmm. I cannot say I like the signal feature much. Would it make sense for
> > the progress to be on by default, and to have a quiet option instead?
> 
> Hmm, I recall this technique being used elsewhere and is sometimes
> useful.  Can't remember where though -- by manpages, it's not rsync nor
> pv ...
 
e2fsck does it. And I think it would be useful for pg_basebackup as
well, but that's for another time.

In any case, pg_basebackup has the same option and it is off by default
as well.  I guess the rationale for not having it on by default for
pg_basebackup is the fact is does not have any other output normally.

pg_verify_checksums writes a report at the end however, but maybe that
one could be demoted to verbose mode now that we have it?

> How about making it a toggle?  Default off, enable-able by option,
> toggleable by signal.  

That sounds like a good idea.

> (If you enable it via the signal, what's the rate to report at?)

You mean how often it reports progress (that seems independent of the
signal)? From testing, it's roughly every second or so.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Freitag, den 31.08.2018, 14:50 +0200 schrieb Michael Banck:
> my colleague Bernd Helmle recently added progress reporting to our
> pg_checksums application[1]. I have now forward ported this to
> pg_verify_checksums for the September commitfest, please see the
> attached patch.
> 
> Here's the description:
> 
> This optionally prints the progress of pg_verify_checksums via read
> kilobytes to the terminal with the new command line argument -P.
> 
> If -P was forgotten and pg_verify_checksums operates on a large cluster,
> the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> status reporting on during runtime.

Version 2 of the patch is attached. This is rebased to master as of
422952ee and changes the signal handling to be a toggle as suggested by
Alvaro.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

>> This optionally prints the progress of pg_verify_checksums via read
>> kilobytes to the terminal with the new command line argument -P.
>>
>> If -P was forgotten and pg_verify_checksums operates on a large cluster,
>> the caller can send SIGUSR1 to pg_verify_checksums to turn progress
>> status reporting on during runtime.
>
> Version 2 of the patch is attached. This is rebased to master as of
> 422952ee and changes the signal handling to be a toggle as suggested by
> Alvaro.

Patch applies cleanly and compiles.

About tests: "make check" is okay, but ITSM that the command is not 
started once, ever, in any test:-( It is unclear whether any test triggers 
checksums anyway...

The time() granularity to the second makes the result awkward on small 
tests:

  8/1554552 kB (0%, 8 kB/s)
  1040856/1554552 kB (66%, 1040856 kB/s)
  1554552/1554552 kB (100%, 1554552 kB/s)

I'd suggest to reuse the "portability/instr_time.h" stuff which allows a 
much better precision.

The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying 
1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are 
about storage which is usually measured in powers of 1,000, so I'd suggest 
to use thousands.

Another reserve I have is that on any hardware it is likely that there 
will be a lot of kilos, so maybe megas (MB) should be used instead.

I'm wondering whether the bandwidth display should be local (from the last 
display) or global (from the start of the command), but for the last 
forced one. This is an open question.

Maybe it would be nice to show elapsed time and expected completion time 
based on the current speed.

I would be in favor or having this turned on by default, and silenced with 
some option. I'm not sure other people would agree, though, so it is an 
open question as well.

About the code:

  +               if (show_progress)
  +                       show_progress = false;
  +               else
  +                       show_progress = true;

Why not a simple:

     /* invert current state */
     show_progress = !show_progress;

I do not see much advantage in using intermediate string buffers for 
values:

  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
  +          total_size / 1024);
  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
  +          current_size / 1024);
  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
  +          (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
  +          currentstr, totalstr, total_percent, currspeedstr);

ISTM that just one simple fprintf would be fine:

   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
           formulas for each slot...);

ISTM that this line overwriting trick deserves a comment:

  + if (isatty(fileno(stderr)))
  +   fprintf(stderr, "\r");
  + else
  +   fprintf(stderr, "\n");

And it runs a little amok with "-v".

  + memset(&act, '\0', sizeof(act));

pg uses 0 instead of '\0' everywhere else.

   +   /* Make sure we just report at least once a second */
   +   if ((now == last_progress_update) && !force) return;

The comments seems contradictory, I understand the code makes sure that it 
is "at most" once a second. Pgbench uses "-P XXX" to strigger a progress 
report every XXX second. I'm unsure whether it would be useful to allow 
the user to change the 1 second display interval.

I'm not sure why you removed one unrelated line:

   #include "storage/checksum.h"
   #include "storage/checksum_impl.h"

  -
   static int64 files = 0;
   static int64 blocks = 0;
   static int64 badblocks = 0;


There is a problem in the scan_file code: The added sizeonly parameter is 
not used. It should be removed.

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

thanks for the review!

On Wed, Sep 19, 2018 at 05:17:05PM +0200, Fabien COELHO wrote:
> >>This optionally prints the progress of pg_verify_checksums via read
> >>kilobytes to the terminal with the new command line argument -P.
> >>
> >>If -P was forgotten and pg_verify_checksums operates on a large cluster,
> >>the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> >>status reporting on during runtime.
> >
> >Version 2 of the patch is attached. This is rebased to master as of
> >422952ee and changes the signal handling to be a toggle as suggested by
> >Alvaro.
> 
> Patch applies cleanly and compiles.
> 
> About tests: "make check" is okay, but ITSM that the command is not started
> once, ever, in any test:-( It is unclear whether any test triggers checksums
> anyway...

Yeah, this was discussed on another thread and some basic tap tests for
pg_verify_checksums were proposed (also by me), but this hasn't been
further addressed.

Personally, I think this would be a good thing to add to v11 even.

In any case, this particular feature might not be very easy to tap test,
but I am open to suggestions, of course.

> The time() granularity to the second makes the result awkward on small
> tests:
> 
>  8/1554552 kB (0%, 8 kB/s)
>  1040856/1554552 kB (66%, 1040856 kB/s)
>  1554552/1554552 kB (100%, 1554552 kB/s)
> 
> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
> much better precision.
> 
> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
> about storage which is usually measured in powers of 1,000, so I'd suggest
> to use thousands.
> 
> Another reserve I have is that on any hardware it is likely that there will
> be a lot of kilos, so maybe megas (MB) should be used instead.

The display is exactly the same as in pg_basebackup (except the
bandwith is shown as well), so right now I think it is more useful to be
consistent here.  So if we change that, pg_basebackup should be changed
as well I think.

Maybe the code could be factored out to some common file in the future.
 
> I'm wondering whether the bandwidth display should be local (from the last
> display) or global (from the start of the command), but for the last forced
> one. This is an open question.


 
> Maybe it would be nice to show elapsed time and expected completion time
> based on the current speed.

Maybe; this could be added to the factored out common code I mentioned
above.
 
> I would be in favor or having this turned on by default, and silenced with
> some option. I'm not sure other people would agree, though, so it is an open
> question as well.

If this runs in a script, it would be pretty annoying, so everybody
would have to add --no-progress so I am not convinced. Also, both
pg_basebackup and pgbench don't show progress by default.
 
> About the code:
> 
>  +               if (show_progress)
>  +                       show_progress = false;
>  +               else
>  +                       show_progress = true;
> 
> Why not a simple:
> 
>     /* invert current state */
>     show_progress = !show_progress;

Fair enough.
 
> I do not see much advantage in using intermediate string buffers for values:
> 
>  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>  +          total_size / 1024);
>  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>  +          current_size / 1024);
>  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>  +          (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
>  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>  +          currentstr, totalstr, total_percent, currspeedstr);
> 
> ISTM that just one simple fprintf would be fine:
> 
>   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>           formulas for each slot...);

That code is adopted from pg_basebackup.c and the comment there says:

|     * Separate step to keep platform-dependent format code out of
|     * translatable strings.  And we only test for INT64_FORMAT
|     * availability in snprintf, not fprintf.

So that sounds legit to me.

> ISTM that this line overwriting trick deserves a comment:
> 
>  + if (isatty(fileno(stderr)))
>  +   fprintf(stderr, "\r");
>  + else
>  +   fprintf(stderr, "\n");

I agree a comment should be in there. But that code is also taken
verbatim from pg_basebackup.c (but this time, there's no comment there,
either).

How about this:

|     * If we are reporting to a terminal, send a carriage return so that
|     * we stay on the same line.  If not, send a newline. 

> And it runs a little amok with "-v".

Do you suggest we should make those mutually exlusive?  I agree that in
general, -P is not very useful if -v is on, but if you have a really big
table, it should still be, no?

>  + memset(&act, '\0', sizeof(act));
> 
> pg uses 0 instead of '\0' everywhere else.

Ok.
 
>   +   /* Make sure we just report at least once a second */
>   +   if ((now == last_progress_update) && !force) return;
> 
> The comments seems contradictory, I understand the code makes sure that it
> is "at most" once a second. 

I guess you're right as the identical code in pg_basebackup.c has a
comment saying "Max once per second".

> Pgbench uses "-P XXX" to strigger a progress
> report every XXX second. I'm unsure whether it would be useful to allow the
> user to change the 1 second display interval.

I think pgbench writes a new line for each report? In that case, having
it every second for longer runs might be annoying and I can see the
point in having in configurable.

In the case of pg_basebackup/pg_verify_checksums, I guess 1 second is
fine.
 
> I'm not sure why you removed one unrelated line:
> 
>   #include "storage/checksum.h"
>   #include "storage/checksum_impl.h"
> 
>  -
>   static int64 files = 0;
>   static int64 blocks = 0;
>   static int64 badblocks = 0;

Merge error on my side I guess, will fix.
 
> There is a problem in the scan_file code: The added sizeonly parameter is
> not used. It should be removed.

Right, this might have been a leftover from an earlier version of the
code, I'll check back with Bernd to make sure that was not a
porting/merge error on my side.

I've attached V3 of the patch, addressing some of your comments.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Attachment

Re: Progress reporting for pg_verify_checksums

From
Thomas Munro
Date:
On Tue, Sep 4, 2018 at 2:21 AM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> On 2018-Sep-01, Fabien COELHO wrote:
> > > If -P was forgotten and pg_verify_checksums operates on a large cluster,
> > > the caller can send SIGUSR1 to pg_verify_checksums to turn progress
> > > status reporting on during runtime.
> >
> > Hmmm. I cannot say I like the signal feature much. Would it make sense for
> > the progress to be on by default, and to have a quiet option instead?
>
> Hmm, I recall this technique being used elsewhere and is sometimes
> useful.  Can't remember where though -- by manpages, it's not rsync nor
> pv ...

On BSD-derived systems including macOS, you can hit ^T to deliver
SIGINFO to the foreground process (much like ^C delivers SIGINT).
Many well behaved programs including rsync (at least on FreeBSD) then
show you some kind of progress or status report:

munro@asterix $ sleep 60
load: 0.11  cmd: sleep 62234 [nanslp] 1.22r 0.00u 0.00s 0% 2032k
sleep: about 58 second(s) left out of the original 60

I've contemplated various crackpot ideas about teaching psql to show
information about what my backend is doing when I hit ^T (perhaps via
a second connection, similar to the way it cancels queries?)...

There was an interesting thread[1] about dumping various things from
backends on (multiplexed) SIGUSR1.  Also if memory serves, JVMs also
dump internal stuff when you signal them, so you can confirm that they
really did eat all 64GB of your RAM.  But none of these things are
examples of enabling/disabling an on-going status reporting mode,
which is perhaps what you meant, they're more like one-off pokes to
dump information.

[1] https://www.postgresql.org/message-id/flat/20140623101501.GN16260%40awork2.anarazel.de

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

About patch v3: applies cleanly and compiles.

The xml documentation should be updated! (It is kind of hard to notice 
what is not there:-)

See "doc/src/sgml/ref/pg_verify_checksums.sgml".

>> The time() granularity to the second makes the result awkward on small
>> tests:
>>
>>  8/1554552 kB (0%, 8 kB/s)
>>  1040856/1554552 kB (66%, 1040856 kB/s)
>>  1554552/1554552 kB (100%, 1554552 kB/s)
>>
>> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
>> much better precision.

I still think it would make sense to use that instead of low-precision 
time().

>> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
>> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
>> about storage which is usually measured in powers of 1,000, so I'd suggest
>> to use thousands.
>>
>> Another reserve I have is that on any hardware it is likely that there will
>> be a lot of kilos, so maybe megas (MB) should be used instead.
>
> The display is exactly the same as in pg_basebackup (except the
> bandwith is shown as well), so right now I think it is more useful to be
> consistent here.

Hmmm... units are units, and the display is wrong. The fact that other 
commands are wrong as well does not look like a good argument to persist 
in an error.

> So if we change that, pg_basebackup should be changed as well I think.

I'm okay with fixing pg_basebackup as well! I'm unsure about the best 
place to put such a function, though. Probably under "src/common/" where 
there is already some front-end specific code ("fe_memutils.c").

> Maybe the code could be factored out to some common file in the future.

I would be okay with a progress printing function shared between some 
commands. It just needs some place. pg_rewind also has a --rewind option.

>> Maybe it would be nice to show elapsed time and expected completion time
>> based on the current speed.
>
> Maybe; this could be added to the factored out common code I mentioned
> above.

Yep.

>> I would be in favor or having this turned on by default, and silenced with
>> some option. I'm not sure other people would agree, though, so it is an open
>> question as well.
>
> If this runs in a script, it would be pretty annoying, so everybody
> would have to add --no-progress so I am not convinced. Also, both
> pg_basebackup and pgbench don't show progress by default.
>

Ok.

>> I do not see much advantage in using intermediate string buffers for values:
>>
>>  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>>  +          total_size / 1024);
>>  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>>  +          current_size / 1024);
>>  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>>  +          (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
>>  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>>  +          currentstr, totalstr, total_percent, currspeedstr);
>>
>> ISTM that just one simple fprintf would be fine:
>>
>>   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>>           formulas for each slot...);
>
> That code is adopted from pg_basebackup.c and the comment there says:
>
> |     * Separate step to keep platform-dependent format code out of
> |     * translatable strings.  And we only test for INT64_FORMAT
> |     * availability in snprintf, not fprintf.
>
> So that sounds legit to me.

Hmmm. Translation. Not sure I like much the idea of translating units,
but why not.

> |     * If we are reporting to a terminal, send a carriage return so that
> |     * we stay on the same line.  If not, send a newline.
>
>> And it runs a little amok with "-v".
>
> Do you suggest we should make those mutually exlusive?  I agree that in
> general, -P is not very useful if -v is on, but if you have a really big
> table, it should still be, no?

Yep. I was just mentionning that they interfere on a terminal, but I agree
that there is no obvious fix.

>>  + memset(&act, '\0', sizeof(act));
>>
>> pg uses 0 instead of '\0' everywhere else.
>
> Ok.

Not '0', plain 0 (the integer of value zero).

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

On Wed, Sep 26, 2018 at 10:46:06AM +0200, Fabien COELHO wrote:
> The xml documentation should be updated! (It is kind of hard to notice what
> is not there:-)
> 
> See "doc/src/sgml/ref/pg_verify_checksums.sgml".

Right, I've added a paragraph.
 
> >>The time() granularity to the second makes the result awkward on small
> >>tests:
> >>
> >> 8/1554552 kB (0%, 8 kB/s)
> >> 1040856/1554552 kB (66%, 1040856 kB/s)
> >> 1554552/1554552 kB (100%, 1554552 kB/s)
> >>
> >>I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
> >>much better precision.
> 
> I still think it would make sense to use that instead of low-precision
> time().

As the use-case of this is not for small tests, I don't think it is
useful to make the code more complicated for this.
 
> >>The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
> >>1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
> >>about storage which is usually measured in powers of 1,000, so I'd suggest
> >>to use thousands.
> >>
> >>Another reserve I have is that on any hardware it is likely that there will
> >>be a lot of kilos, so maybe megas (MB) should be used instead.
> >
> >The display is exactly the same as in pg_basebackup (except the
> >bandwith is shown as well), so right now I think it is more useful to be
> >consistent here.
> 
> Hmmm... units are units, and the display is wrong. The fact that other
> commands are wrong as well does not look like a good argument to persist in
> an error.

I've had a look around, and "kB" seems to be a common unit for 1024
bytes, e.g. in pg_test_fsync etc., unless I am mistaken?
 
> >So if we change that, pg_basebackup should be changed as well I think.
> 
> I'm okay with fixing pg_basebackup as well! I'm unsure about the best place
> to put such a function, though. Probably under "src/common/" where there is
> already some front-end specific code ("fe_memutils.c").
> 
> >Maybe the code could be factored out to some common file in the future.
> 
> I would be okay with a progress printing function shared between some
> commands. It just needs some place. pg_rewind also has a --rewind option.

I guess you mean pg_rewind also has a --progress option.

I do agree it makes sense to refactor that, but I don't think this
should be part of this patch.
 
> >> + memset(&act, '\0', sizeof(act));
> >>
> >>pg uses 0 instead of '\0' everywhere else.
> >
> >Ok.
> 
> Not '0', plain 0 (the integer of value zero).

Oops, thanks for spotting that.

I've attached v4 of the patch.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
>>>> The time() granularity to the second makes the result awkward on small
>>>> tests:
>>>>
>>>> 8/1554552 kB (0%, 8 kB/s)
>>>> 1040856/1554552 kB (66%, 1040856 kB/s)
>>>> 1554552/1554552 kB (100%, 1554552 kB/s)
>>>>
>>>> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
>>>> much better precision.
>>
>> I still think it would make sense to use that instead of low-precision
>> time().
>
> As the use-case of this is not for small tests,

Even for a long run, the induce error by the 1 second imprecision on both 
points is significant at the beginning of the scan.

> I don't think it is useful to make the code more complicated for this.

I do not think that it would be much more complicated to use the 
portability macros to get a precise time.

>>> The display is exactly the same as in pg_basebackup (except the
>>> bandwith is shown as well), so right now I think it is more useful to be
>>> consistent here.
>>
>> Hmmm... units are units, and the display is wrong. The fact that other
>> commands are wrong as well does not look like a good argument to persist in
>> an error.
>
> I've had a look around, and "kB" seems to be a common unit for 1024
> bytes, e.g. in pg_test_fsync etc., unless I am mistaken?

I can only repeat my above comment: the fact that postgres is wrong 
elsewhere is not a good reason to persist in reproducing an error.


See the mother of all knowledge https://en.wikipedia.org/wiki/Kilobyte

  - SI (decimal, 1000): kB, MB, GB, ...
  - IEC (binary 1024): KiB, MiB, GiB, ...
  - JEDEC (binary 1024, used for memory): KB, MB, GB.

Summary:
  - 1 kB = 1000 bytes
  - 1 KB = 1 KiB = 1024 bytes

Decimal is used for storage (HDD, SSD), binary for memory. That is life, 
and IMHO Postgres code is not the place to invent new units.

Moreover, I still think that the progress should use MB defined as 1000000 
bytes for showing the progress.

>> I would be okay with a progress printing function shared between some
>> commands. It just needs some place. pg_rewind also has a --rewind option.
>
> I guess you mean pg_rewind also has a --progress option.

Indeed.

> I do agree it makes sense to refactor that, but I don't think this
> should be part of this patch.

That's a point. I'd suggest to put the new progress function directly in 
the common part, and other patches could take advantage of it for other 
commands when someone feels like it.

Other comments:

function toggle_progress_report has empty lines after { and before },
but this style is not used elsewhere, I think that they should be removed.

The report_progress API should be ready to be used by another client 
application, which suggest that global variables should be avoided.

Maybe:

   void report_progress(current, total, force)

and the scan start and last times could be a static variable inside the 
function initialized on the first call, which would suggest to call the 
function at the beginning of the scan, probably with current == 0.

Or maybe:

   time_type report_progress(current, total, start, last, force)

Which would return the last time, and the caller has responsability for 
initializing a start time.

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Thomas Munro
Date:
On Sat, Sep 29, 2018 at 1:07 AM Michael Banck <michael.banck@credativ.de> wrote:
> I've attached v4 of the patch.

Hi Michael,

Windows doesn't like sigaction:

https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15189

I'm not sure if we classify this as a "frontend" program.  Should it
be using pqsignal() from src/port/pqsignal.c?  Or perhaps just
sigaction as you have it (pqsignal.c says that we require sigaction on
all Unices), but #ifndef WIN32 around that stuff, since SIGUSR1 is
never going to work anyway.

-- 
Thomas Munro
http://www.enterprisedb.com


Re: Progress reporting for pg_verify_checksums

From
Dmitry Dolgov
Date:
> On Wed, Oct 3, 2018 at 12:51 AM Thomas Munro <thomas.munro@enterprisedb.com> wrote:
>
> On Sat, Sep 29, 2018 at 1:07 AM Michael Banck <michael.banck@credativ.de> wrote:
> > I've attached v4 of the patch.
>
> Hi Michael,
>
> Windows doesn't like sigaction:
>
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15189
>
> I'm not sure if we classify this as a "frontend" program.  Should it
> be using pqsignal() from src/port/pqsignal.c?  Or perhaps just
> sigaction as you have it (pqsignal.c says that we require sigaction on
> all Unices), but #ifndef WIN32 around that stuff, since SIGUSR1 is
> never going to work anyway.

Unfortunately, patch also has some conflicts with the current master. I'll move
it to the next CF.


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

On Wed, Oct 03, 2018 at 11:50:36AM +1300, Thomas Munro wrote:
> On Sat, Sep 29, 2018 at 1:07 AM Michael Banck <michael.banck@credativ.de> wrote:
> Windows doesn't like sigaction:
> 
> https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.15189

Thanks for the report and sorry for taking so long to reply.
 
> I'm not sure if we classify this as a "frontend" program.  Should it
> be using pqsignal() from src/port/pqsignal.c?  Or perhaps just
> sigaction as you have it (pqsignal.c says that we require sigaction on
> all Unices), but #ifndef WIN32 around that stuff, since SIGUSR1 is
> never going to work anyway.

I've used pqsignal now, and disabled that feature on Windows.

I've also updated one comment and added an additional regression test.

V5 attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

> V5 attached.

Patch applies cleanly, compiles, global & local make check are ok.

Given that the specific output is not checked, I do not think that the -P 
check deserves a test on its own, I think that the -P option could simply 
be added to any of the existing tests.

I'm still unhappy that "kB" is used for 1024 bytes in the output, contrary 
to all existing standards (1 kB = 1000 bytes -- SI, 1 KB = 1 KiB = 1024 
bytes -- IEC & JEDEC). The fact that this is also used wrongly elsewhere 
in pg is not relevant, these are bugs to be fixed, not to be replicated.

Given the speed of verifying checksums and its storage-oriented status, I 
also still think that a (possibly fractional) MB (1,000,000 bytes), or 
even GB, is the right unit to use for reporting this progress. On my 
laptop (SSD), verifying runs at least at 1.26 GB/s (on one small test), 
there is no point in displaying kilobytes progress.

I still think that using a more precise time than time(), eg with existing 
macros from "instr_time.h", would not cost anything more and result in a 
better precision output. It would also allow to remove the check used to 
avoid a division-by-zero by switching to double.

If the check is performed while online (other patch in queue), then the 
size may change thus it may not reach or go beyond 100%. No big deal.

I'd consider inverting the sizeonly boolean, so that true does the check 
and false does only the size collection. It seems more logical to me if it 
performs more with true than with false.

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
> Given the speed of verifying checksums and its storage-oriented status, I 
> also still think that a (possibly fractional) MB (1,000,000 bytes), or even 
> GB, is the right unit to use for reporting this progress. On my laptop (SSD), 
> verifying runs at least at 1.26 GB/s (on one small test), there is no point 
> in displaying kilobytes progress.

Obviously the file is cached by the system at such speed, but still most 
disks should provides dozens of MB per second of read bandwidth. If GB is 
used, it should use fractional display (eg 1.25 GB) though.

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Dienstag, den 25.12.2018, 12:12 +0100 schrieb Fabien COELHO:
> > Given the speed of verifying checksums and its storage-oriented status, I 
> > also still think that a (possibly fractional) MB (1,000,000 bytes), or even 
> > GB, is the right unit to use for reporting this progress. On my laptop (SSD), 
> > verifying runs at least at 1.26 GB/s (on one small test), there is no point 
> > in displaying kilobytes progress.
> 
> Obviously the file is cached by the system at such speed, but still most 
> disks should provides dozens of MB per second of read bandwidth. If GB is 
> used, it should use fractional display (eg 1.25 GB) though.

I think MB indeed makes more sense than kB, so I have changed that now
in V7, per attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
> I think MB indeed makes more sense than kB, so I have changed that now 
> in V7, per attached.

You use 1024² bytes. What about 1000000 bytes per MB, as the unit is about 
stored files?

Also, you did not answer to my other points:
  - use "instr_time.h" for better precision
  - invert sizeonly
  - reuse a test

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Tue, Dec 25, 2018 at 07:05:30PM +0100, Fabien COELHO wrote:
> You use 1024² bytes. What about 1000000 bytes per MB, as the unit is about
> stored files?
>
> Also, you did not answer to my other points:
>  - use "instr_time.h" for better precision
>  - invert sizeonly
>  - reuse a test

It seems to me that the SIGUSR1 business is not necessary for the
basic layer of this feature, so I would rather rip that off now.  If
necessary we could always discuss that later on.  My take about the
option is that --progress should not be the default, but that reports
should only be provided if the caller wants them.

--quiet may have some value by itself, but that seems like a separate
discussion to me.

+   /*
+    * If we are reporting to a terminal, send a carriage return so that we
+    * stay on the same line.  If not, send a newline.
+    */
+   if (isatty(fileno(stderr)))
+       fprintf(stderr, "\r");
+   else
+       fprintf(stderr, "\n");
This bit is not really elegant, why not just '\r'?

+   /* The same for total size */
+   if (current_size > total_size)
+       total_size = current_size / 1048576;
Let's use that in a variable and not hardcode the number.

What's introduced here is very similar to what progress_report() does
in pg_rewind/logging.c.  The report depends on the context so this
cannot be a common routine logic but perhaps we could keep a
consistent output for both tools?
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Alvaro Herrera
Date:
On 2018-Dec-26, Michael Paquier wrote:

> +   /*
> +    * If we are reporting to a terminal, send a carriage return so that we
> +    * stay on the same line.  If not, send a newline.
> +    */
> +   if (isatty(fileno(stderr)))
> +       fprintf(stderr, "\r");
> +   else
> +       fprintf(stderr, "\n");
> This bit is not really elegant, why not just '\r'?

Umm, this is established coding pattern in pg_basebackup.c.
Stylistically I'd change all those cases to "fprintf(stderr,
isatty(fileno(stderr)) ? "\r" : "\n")" but leave the string alone, since
AFAIR it took some time to figure out what to do.  (I'd also make the
comment one line instead of four, say "Stay on the same line if
reporting to a terminal".  That makes the whole stanza two lines rather
than eight, which is the appropriate amount of space for it).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Tue, Dec 25, 2018 at 11:45:09PM -0300, Alvaro Herrera wrote:
> Umm, this is established coding pattern in pg_basebackup.c.
> Stylistically I'd change all those cases to "fprintf(stderr,
> isatty(fileno(stderr)) ? "\r" : "\n")" but leave the string alone, since
> AFAIR it took some time to figure out what to do.  (I'd also make the
> comment one line instead of four, say "Stay on the same line if
> reporting to a terminal".  That makes the whole stanza two lines rather
> than eight, which is the appropriate amount of space for it).

There has been no input from the author for a couple of weeks now, so
I have marked the patch as returned with feedback.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Mittwoch, den 26.12.2018, 11:14 +0900 schrieb Michael Paquier:
> On Tue, Dec 25, 2018 at 07:05:30PM +0100, Fabien COELHO wrote:
> > You use 1024² bytes. What about 1000000 bytes per MB, as the
> > unit is about stored files?

I haven't changed that as Ihave not been pointed to a clear project
guideline that 1 MB should be 1000000 and not 1024x1024.
 
> > Also, you did not answer to my other points:
> >  - use "instr_time.h" for better precision

Both pg_rewind and pg_basebackup are using time() as well, so I think
there's precedence for this.

> >  - invert sizeonly

The current way is consistent with src/backend/replication/basebackup.c
so I'd prefer to keep it that way.

> >  - reuse a test

I've removed the test. 

> It seems to me that the SIGUSR1 business is not necessary for the
> basic layer of this feature, so I would rather rip that off now.  

But is it gaining all so much if it is ripped out? The patch will be a
dozen lines shorter, but those are all isolated.

And contrary to pg_basebackup, pg_verify_checksums might be run in the
foreground much more often - right now it has to be done while the
cluster is offline, so if you have large instance and you forgot to pass
-P then you're glad you can signal it so you roughly know when you can
expect to startup the instance again.

> If necessary we could always discuss that later on.  

If you insist we can rip it out, of course.

> My take about the option is that --progress should not be the default, 
> but that reports should only be provided if the caller wants them.
>
> --quiet may have some value by itself, but that seems like a separate
> discussion to me.

Right now --progress isn't default so I think that's how you like it?
    
> +   /*
> +    * If we are reporting to a terminal, send a carriage return so that we
> +    * stay on the same line.  If not, send a newline.
> +    */
> +   if (isatty(fileno(stderr)))
> +       fprintf(stderr, "\r");
> +   else
> +       fprintf(stderr, "\n");
> This bit is not really elegant, why not just '\r'?

I've changed it as Alvaro suggested.

> +   /* The same for total size */
> +   if (current_size > total_size)
> +       total_size = current_size / 1048576;
> Let's use that in a variable and not hardcode the number.

Done so, I called it MEGABYTES.

> What's introduced here is very similar to what progress_report() does
> in pg_rewind/logging.c.  The report depends on the context so this
> cannot be a common routine logic but perhaps we could keep a
> consistent output for both tools?

Well, pg_rewind is also using kB, so should I switch it back to that? 

It looks like the progress reporting is otherwise quite similar, so what
exactly did you have in mind?

New patch attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Sun, Feb 17, 2019 at 09:00:29PM +0100, Michael Banck wrote:
> Well, pg_rewind is also using kB, so should I switch it back to that?

I am not sure which one is better, sorry :)

There is an argument for switching pg_rewind to use MB as well.  I
don't expect pg_rewind to transfer gigs of data, but it could make the
maths harder for a newcomer if it transfers a lot of data.

> It looks like the progress reporting is otherwise quite similar, so what
> exactly did you have in mind?

I was thinking about a routine in src/common/ or such at this time.
The tools have separate goals though, so it looks sensible as well to
keep two routines.  I am fine to let that up to you.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Montag, den 18.02.2019, 12:24 +0900 schrieb Michael Paquier:
> On Sun, Feb 17, 2019 at 09:00:29PM +0100, Michael Banck wrote:
> > Well, pg_rewind is also using kB, so should I switch it back to that?
> 
> I am not sure which one is better, sorry :)
> 
> There is an argument for switching pg_rewind to use MB as well.  I
> don't expect pg_rewind to transfer gigs of data, but it could make the
> maths harder for a newcomer if it transfers a lot of data.

Right, I think it makes more sense for pg_basebackup and
pg_verify_checksums as they go over the whole cluster than for
pg_rewind, but being consistent is also worth something.

> > It looks like the progress reporting is otherwise quite similar, so what
> > exactly did you have in mind?
> 
> I was thinking about a routine in src/common/ or such at this time.
> The tools have separate goals though, so it looks sensible as well to
> keep two routines.  I am fine to let that up to you.

I'll leave it as-is for now, but will see whether this can be improved
in a follow-up patch.


Michael
    
-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz


Re: Progress reporting for pg_verify_checksums

From
Alvaro Herrera
Date:
On 2019-Feb-17, Michael Banck wrote:

> +    /*
> +     * As progress status information may be requested, we need to scan the
> +     * directory tree(s) twice, once to get the idea how much data we need to
> +     * scan and finally to do the real legwork.
> +     */
> +    total_size = scan_directory(DataDir, "global", true);
> +    total_size += scan_directory(DataDir, "base", true);
> +    total_size += scan_directory(DataDir, "pg_tblspc", true);

Surely we know at that point whether this first scan is needed, and we
can skip it if not?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Montag, den 18.02.2019, 11:18 -0300 schrieb Alvaro Herrera:
> On 2019-Feb-17, Michael Banck wrote:
> > +    /*
> > +     * As progress status information may be requested, we need to scan the
> > +     * directory tree(s) twice, once to get the idea how much data we need to
> > +     * scan and finally to do the real legwork.
> > +     */
> > +    total_size = scan_directory(DataDir, "global", true);
> > +    total_size += scan_directory(DataDir, "base", true);
> > +    total_size += scan_directory(DataDir, "pg_tblspc", true);
> 
> Surely we know at that point whether this first scan is needed, and we
> can skip it if not?

Yeah - new patch attached.


Michael 

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Bernd Helmle
Date:
Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck:
> > Surely we know at that point whether this first scan is needed, and
> we
> > can skip it if not?
>
> Yeah - new patch attached.

Maybe i'm wrong, but my thought is that this breaks the SIGUSR1
business, since there seems no code path which calculates total_size in
this case?


    Bernd


Attachment

Re: Progress reporting for pg_verify_checksums

From
Alvaro Herrera
Date:
On 2019-Feb-18, Bernd Helmle wrote:

> Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck:
> > > Surely we know at that point whether this first scan is needed, and
> > we
> > > can skip it if not?
> > 
> > Yeah - new patch attached.
> 
> Maybe i'm wrong, but my thought is that this breaks the SIGUSR1
> business, since there seems no code path which calculates total_size in
> this case?

Oh, yeah, it does.  In that case, a comment explaining that is needed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Montag, den 18.02.2019, 13:42 -0300 schrieb Alvaro Herrera:
> On 2019-Feb-18, Bernd Helmle wrote:
> 
> > Am Montag, den 18.02.2019, 16:52 +0100 schrieb Michael Banck:
> > > > Surely we know at that point whether this first scan is needed, and
> > > > we can skip it if not?
> > > 
> > > Yeah - new patch attached.
> > 
> > Maybe i'm wrong, but my thought is that this breaks the SIGUSR1
> > business, since there seems no code path which calculates total_size in
> > this case?
> 
> Oh, yeah, it does.  In that case, a comment explaining that is needed.

New patch attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

> New patch attached.

Patch applies cleanly. Compiles, "make check" ok. doc build is also ok.

There are no tests, which is probably fine for such an interactive 
feature.

Docs look okay to me. Clear and to the point.

About :

  total_percent = total_size ? (int64) ((current_size / MEGABYTES) * 100 / (total_size / MEGABYTES)) : 0;

MEGABYTES can be simplified and will enhance precision. ISTM that the 
percent could be a double:

   total_percent = total_size ? 100.0 * current_size / total_size : 0.0 ;

and then just let fprintf do the rounding.

I would bother rounding down < 100% to 100, because then you would get

   1560/1492 MB (100\%, X MB/s)

which is kind of silly. If it goes over it just mean that the initial 
estimate was wrong, which might happen if the db is running (other patch 
in the queue). I'm fine with over 100 if it is consistent with the MB 
ratio next to it.

fprintf could be called once instead of twice by merging the eol and the 
previous message.

Maybe I'd consider inverting the sizeonly boolean so that more is done 
when true.

I'd still would use more precise than one second precision time so that 
the speed displayed at the beginning is more realistic. eg I got on a 
short test with probably in system cache files:

   188/188 MB (100%, 188 MB/s)

but the reality is that it took 0.126 seconds, and the speed was really 
1492 MB/s.

-- 
Fabien.


Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Dienstag, den 19.02.2019, 16:37 +0100 schrieb Fabien COELHO
> 
> About :
> 
>   total_percent = total_size ? (int64) ((current_size / MEGABYTES) * 100 / (total_size / MEGABYTES)) : 0;
> 
> MEGABYTES can be simplified and will enhance precision. ISTM that the 
> percent could be a double:
> 
>    total_percent = total_size ? 100.0 * current_size / total_size : 0.0 ;
> 
> and then just let fprintf do the rounding.

Ok, done so.

> I would bother rounding down < 100% to 100, because then you would get
> 
>    1560/1492 MB (100\%, X MB/s)
> 
> which is kind of silly. 

No, we cap the total_size to current_size so you won't see that (but
total_size will potentially gradually increase). pg_basebackup has the
same behaviour.

> I'd still would use more precise than one second precision time so that 
> the speed displayed at the beginning is more realistic. eg I got on a 
> short test with probably in system cache files:
> 
>    188/188 MB (100%, 188 MB/s)
> 
> but the reality is that it took 0.126 seconds, and the speed was really 
> 1492 MB/s.

Because I implemented I/O throttling for pg_checksums and needed it
there anyway, I've switched it to the instr_time API now. It now updates
the progress 10 times every second.

So now I get

|20/20 MB (100%, 1382 MB/s)

for an empty cluster vs

|20/20 MB (100%, 20 MB/s)

with the previous version of the patch.

New patch attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

>> I would bother rounding down < 100% to 100, because then you would get
>>
>>    1560/1492 MB (100\%, X MB/s)
>>
>> which is kind of silly.
>
> No, we cap the total_size to current_size so you won't see that (but
> total_size will potentially gradually increase). pg_basebackup has the
> same behaviour.

Ok.

> Because I implemented I/O throttling for pg_checksums

Great!

> New patch attached.

Does not apply because of the renaming committed by Michaël.

Could you rebase?

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote:
> Does not apply because of the renaming committed by Michaël.
>
> Could you rebase?

This stuff touches pg_checksums.c, so you may want to wait one day or
two to avoid extra work...  I think that I'll be able to finish the
addition of the --enable and --disable switches by then.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Kyotaro HORIGUCHI
Date:
Hello.

At Wed, 13 Mar 2019 16:25:15 +0900, Michael Paquier <michael@paquier.xyz> wrote in <20190313072515.GB2988@paquier.xyz>
> On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote:
> > Does not apply because of the renaming committed by Michaël.
> >
> > Could you rebase?
>
> This stuff touches pg_checksums.c, so you may want to wait one day or
> two to avoid extra work...  I think that I'll be able to finish the
> addition of the --enable and --disable switches by then.

> +    /* Make sure we report at most once every tenth of a second */
> +    if ((INSTR_TIME_GET_MILLISEC(now) -
> +         INSTR_TIME_GET_MILLISEC(last_progress_update) < 100) && !force)

I'm not a skilled gamer and 10Hz update is a bit hard for my
eyes:p The second hand of old clocks ticks at 4Hz. I think it is
enough frequently.


> 841/1516 MB (55%, 700 MB/s)

Differently from other prgress reporting project, estimating ETC
(estimated time to completion), which is in the most important,
is quite easy. (pgbench does that.) And I'd like to see a
progress bar there but it might be overdoing. I'm not sure let
the current size occupy a part of screen width is needed.  I
don't think total size is needed to be fixed in MB and I think at
most four or five digits are enough. (That is the same for
speed.)

If the all of aboves are involved, the line would look as the
follows.

[=======================             ] ( 63% of 12.53 GB, 179 MB/s, ETC 26s)

# Note that this is just an opinion.

(pg_checksum runs fast at the beginning so ETC behaves somewhat
strange in the meanwhile.)



> +#define MEGABYTES 1048576

 1024 * 1024 is preferable than that many digits.


> +    /* we handle SIGUSR1 only, and toggle the value of show_progress */
> +    if (signum == SIGUSR1)
> +        show_progress = !show_progress;

SIGUSR1 *toggles* progress. A garbage line is left alone after
turning it off. It would be erasable. I'm not sure which is
better, though.


>
> @@ -167,7 +255,7 @@ scan_directory(const char *basedir, const char *subdir)
>          if (strncmp(de->d_name,
>                      PG_TEMP_FILES_DIR,
>                      strlen(PG_TEMP_FILES_DIR)) == 0)
> -            return;
> +            continue;

Why this patch changes the behavior for temprary directories? It
seems like a bug fix of pg_checksums.


regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Thu, Mar 14, 2019 at 11:54:17AM +0900, Kyotaro HORIGUCHI wrote:
> Why this patch changes the behavior for temprary directories? It
> seems like a bug fix of pg_checksums.

Oops, that's a thinko from 5c995139, so fixed.  Note this has no
actual consequence though as PG_TEMP_FILE_PREFIX and PG_TEMP_FILES_DIR
have the same value.  However a bug is a bug.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

thanks for the additional review!

Am Donnerstag, den 14.03.2019, 11:54 +0900 schrieb Kyotaro HORIGUCHI:
> At Wed, 13 Mar 2019 16:25:15 +0900, Michael Paquier <michael@paquier.xyz> wrote in
<20190313072515.GB2988@paquier.xyz>
> > On Wed, Mar 13, 2019 at 07:22:28AM +0100, Fabien COELHO wrote:
> > > Does not apply because of the renaming committed by Michaël.
> > > 
> > > Could you rebase?
> > 
> > This stuff touches pg_checksums.c, so you may want to wait one day or
> > two to avoid extra work...  I think that I'll be able to finish the
> > addition of the --enable and --disable switches by then.

I have rebased it now.

> > +     /* Make sure we report at most once every tenth of a second */
> > +     if ((INSTR_TIME_GET_MILLISEC(now) -
> > +              INSTR_TIME_GET_MILLISEC(last_progress_update) < 100) && !force)
> 
> I'm not a skilled gamer and 10Hz update is a bit hard for my
> eyes:p The second hand of old clocks ticks at 4Hz. I think it is
> enough frequently.

Ok.

> > 841/1516 MB (55%, 700 MB/s)
> 
> Differently from other prgress reporting project, estimating ETC
> (estimated time to completion), which is in the most important,
> is quite easy. (pgbench does that.) And I'd like to see a
> progress bar there but it might be overdoing. I'm not sure let
> the current size occupy a part of screen width is needed.  I
> don't think total size is needed to be fixed in MB and I think at
> most four or five digits are enough. (That is the same for
> speed.)
> 
> If the all of aboves are involved, the line would look as the
> follows.
> 
> [=======================             ] ( 63% of 12.53 GB, 179 MB/s, ETC 26s)
> 
> # Note that this is just an opinion.
> 
> (pg_checksum runs fast at the beginning so ETC behaves somewhat
> strange in the meanwhile.)

I haven't changed that for now as it seems to be a bit more involved.
I'd like to hear other opinions on whether that is worthwhile?

> > +#define MEGABYTES 1048576
> 
>  1024 * 1024 is preferable than that many digits.

Good point, changed that way.

> > +     /* we handle SIGUSR1 only, and toggle the value of show_progress */
> > +     if (signum == SIGUSR1)
> > +             show_progress = !show_progress;
> 
> SIGUSR1 *toggles* progress. 

Not sure what you mean here, are you saying it should be called
toggle_progress and not show_progress? I think it was like that
originally but got changed due to review comments.

> A garbage line is left alone after turning it off. It would be
> erasable. I'm not sure which is better, though.

How about just printing a "\n" in the signal handler if we are in a
terminal?  I think erasing the line might get too clever and will be a
portability hazard? I have done that now in the attached patch, let me
know what you think.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
> I have rebased it now.

Thanks. Will look at it.

>> If the all of aboves are involved, the line would look as the
>> follows.
>>
>> [=======================             ] ( 63% of 12.53 GB, 179 MB/s, ETC 26s)
>>
>> # Note that this is just an opinion.
>>
>> (pg_checksum runs fast at the beginning so ETC behaves somewhat
>> strange in the meanwhile.)
>
> I haven't changed that for now as it seems to be a bit more involved.
> I'd like to hear other opinions on whether that is worthwhile?

I think that the bar is overkill, but ETC is easy and nice.

>>> +     /* we handle SIGUSR1 only, and toggle the value of show_progress */
>>> +     if (signum == SIGUSR1)
>>> +             show_progress = !show_progress;
>>
>> SIGUSR1 *toggles* progress.
>
> Not sure what you mean here,

Probably it is meant to simplify the comment?

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Kyotaro HORIGUCHI
Date:
Hello.

At Mon, 18 Mar 2019 23:14:01 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.21.1903182311130.23282@lancre>
>
> > I have rebased it now.
>
> Thanks. Will look at it.
>
> >> If the all of aboves are involved, the line would look as the
> >> follows.
> >>
> >> [=======================             ] ( 63% of 12.53 GB, 179 MB/s,
> >> ETC 26s)
> >>
> >> # Note that this is just an opinion.
> >>
> >> (pg_checksum runs fast at the beginning so ETC behaves somewhat
> >> strange in the meanwhile.)
> >
> > I haven't changed that for now as it seems to be a bit more involved.
> > I'd like to hear other opinions on whether that is worthwhile?
> I think that the bar is overkill, but ETC is easy and nice.

I'm fine with that.

> >>> +     /* we handle SIGUSR1 only, and toggle the value of show_progress
> >>> */
> >>> +     if (signum == SIGUSR1)
> >>> +             show_progress = !show_progress;
> >>
> >> SIGUSR1 *toggles* progress.
> >
> > Not sure what you mean here,
>
> Probably it is meant to simplify the comment?

Sorry. I meant that "it can be turned off and a perhaps-garbage
line is left alone after turning off. Don't we need to erase it?".

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Dienstag, den 19.03.2019, 09:04 +0900 schrieb Kyotaro HORIGUCHI:
> At Mon, 18 Mar 2019 23:14:01 +0100 (CET), Fabien COELHO <coelho@cri.ensmp.fr> wrote in
<alpine.DEB.2.21.1903182311130.23282@lancre>
> > >>> +     /* we handle SIGUSR1 only, and toggle the value of show_progress
> > >>> */
> > >>> +     if (signum == SIGUSR1)
> > >>> +             show_progress = !show_progress;
> > >>
> > >> SIGUSR1 *toggles* progress.
> > >
> > > Not sure what you mean here,
> > 
> > Probably it is meant to simplify the comment?
> 
> Sorry. I meant that "it can be turned off and a perhaps-garbage
> line is left alone after turning off. Don't we need to erase it?".

The current version prints a newline when it progress reporting is
toggled off. Do you mean there is a hazard that this happens right when
we are printing the progress, so end up with a partly garbage line? I
don't think that'd be so bad to warrant further code complexitiy, after
all, the user explicitly wanted the progress to stop.

But maybe I am misunderstanding?


Michael
  
-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz


Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Fri, Mar 22, 2019 at 02:23:17PM +0100, Michael Banck wrote:
> The current version prints a newline when it progress reporting is
> toggled off. Do you mean there is a hazard that this happens right when
> we are printing the progress, so end up with a partly garbage line? I
> don't think that'd be so bad to warrant further code complexitiy, after
> all, the user explicitly wanted the progress to stop.
>
> But maybe I am misunderstanding?

The latest patch does not apply because of mainly ed308d7.  Could you
send a rebase?

FWIW, I would remove the signal toggling stuff from the patch and keep
the logic simple at this stage.  Not having a solution which works on
Windows is perhaps not a strong reason to not include it, but it's a
sign that we could perhaps design something better, and that's
annoying.  Personally I think that I would just use --progress all the
time and not use the signaling part at all, my 2c.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Samstag, den 23.03.2019, 10:49 +0900 schrieb Michael Paquier:
> On Fri, Mar 22, 2019 at 02:23:17PM +0100, Michael Banck wrote:
> > The current version prints a newline when it progress reporting is
> > toggled off. Do you mean there is a hazard that this happens right when
> > we are printing the progress, so end up with a partly garbage line? I
> > don't think that'd be so bad to warrant further code complexitiy, after
> > all, the user explicitly wanted the progress to stop.
> > 
> > But maybe I am misunderstanding?
> 
> The latest patch does not apply because of mainly ed308d7.  Could you
> send a rebase?

I've rebased it now, see attached.

> FWIW, I would remove the signal toggling stuff from the patch and keep
> the logic simple at this stage.  Not having a solution which works on
> Windows is perhaps not a strong reason to not include it, but it's a
> sign that we could perhaps design something better, and that's
> annoying.  Personally I think that I would just use --progress all the
> time and not use the signaling part at all, my 2c.

We had this discussion already, and back then I was against dropping it
as well.

Do you see --progress being the default? I don't think this is in-line
with other project binaries, so I guess not?

If not, I guess your main point is that we might come up with a better
implementation/UI further down the road.  Which is fair enough, but I
don't think this is such an in-your-face feature that changing the way
it works in future releases would be a terrible breakage of backwards
compatibility. Even more so as it is solely an interactive feature and
cannot be used in scripts etc.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

About patch v12:

Patch applies cleanly, compiles. make check ok, but feature is not tested. 
Doc build ok.

Although I'm in favor of it, I'm not sure that the signal think will make 
it for 12. Maybe it is worth compromising, doing a simple version for now, 
and resubmitting the signal feature later?

ISTM that someone suggested 4 Hz was the good eye speed, but you wait for 
400 ms, which is 2.5 Hz. What about it?

I seems that current_size and total_size are not in the same unit:

  +   if (current_size > total_size)
  +      total_size = current_size / MEGABYTES;

But they should both really be bytes, always.

I think that the computations should be inverted:
  - first adjust total_size to current_size if needed
  - then compute percent
  - remove the percent adjustement as it is <= 100
    since current_size <= total_size

I still think that the speed should compute a double to avoid integer 
rounding errors within the computation. ISTM that rounding should only be 
done for display in the end.

I'm okay with calling the report on each file even if this means every few 
GB...

Someone suggested ETA, and it seems rather simple to do. What about 
adding it?

-- 
Fabien.



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

thanks again for your review!

Am Mittwoch, den 27.03.2019, 15:34 +0100 schrieb Fabien COELHO:
> Hallo Michael,
> 
> About patch v12:
> 
> Patch applies cleanly, compiles. make check ok, but feature is not tested. 
> Doc build ok.
> 
> Although I'm in favor of it, I'm not sure that the signal think will make 
> it for 12. Maybe it is worth compromising, doing a simple version for now, 
> and resubmitting the signal feature later?

Let's wait one more round. I think that feature is quite useful and
isolated enough that it could stay, but I'll remove it for the next
patch version if needed.

> ISTM that someone suggested 4 Hz was the good eye speed, but you wait for 
> 400 ms, which is 2.5 Hz. What about it?

Uh right, I messed that up, fixed.

> I seems that current_size and total_size are not in the same unit:
> 
>   +   if (current_size > total_size)
>   +      total_size = current_size / MEGABYTES;
> 
> But they should both really be bytes, always.

Yeah, they are both bytes and the "/ MEGABYTES" is wrong, removed.

> I think that the computations should be inverted:
>   - first adjust total_size to current_size if needed
>   - then compute percent
>   - remove the percent adjustement as it is <= 100
>     since current_size <= total_size

Ok, done so.

> I still think that the speed should compute a double to avoid integer 
> rounding errors within the computation. ISTM that rounding should only be 
> done for display in the end.

Ok, also done this way. I decided to print only full MB and not any
further digits as those don't seem very relevant.

> I'm okay with calling the report on each file even if this means every few 
> GB...

For my I/O throttling branch I've backed off to only call the report
every 100 blocks (and at the end of the file). I've added that logic to
this patch as well.

> Someone suggested ETA, and it seems rather simple to do. What about 
> adding it?

I don't know, just computing the ETA is easy of course. But you'd have
to fiddle with seconds/minutes/hours conversions cause the runtime could
be hours. Then, you don't want to have it bumping around weirdly when
your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
simpler than the signal trigger we might get rid of.

I have attached a POC which just prints the remaining seconds. If
somebody wants to contribute a full implementation as a co-author, I'm
happy to include it. Otherwise, I might take a stab at it over the next
days, but it is not on the top of my TODO list.

New patch attached. I've also changed the reporting line so that it
prints a couple of blanks at the end cause I noticed that if the
previously reported line was longer than the current line, then the end
of it is still visible.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:

>> I still think that the speed should compute a double to avoid integer
>> rounding errors within the computation. ISTM that rounding should only be
>> done for display in the end.
>
> Ok, also done this way. I decided to print only full MB and not any
> further digits as those don't seem very relevant.

For the computation to be in double, you must convert to double somewhere 
in the formula, otherwise it is computed as ints and only cast as a double 
afterwards. Maybe:

   current_speed = (1000.0 / MEGABYTES) * current_size / elapsed;

Or anything which converts to double early.

Instead of casting percent to int, maybe use "%.0f" as well, just like 
current_speed?

  +  if ((blockno % 100 == 0) && show_progress)

I'd invert the condition to avoid a modulo when not needed.

I'm not very found of global variables, but it does not matter here and 
doing it differently would involve more code. It would matter though if 
the progress report would be shared between commands.

>> I'm okay with calling the report on each file even if this means every few
>> GB...
>
> For my I/O throttling branch I've backed off to only call the report
> every 100 blocks (and at the end of the file). I've added that logic to
> this patch as well.

The 100 blocks = 800 KiB looks arbitrary. What is the rational? ISTM that 
postgres will tend to store a power of two number of pages on full 
segments, so maybe there could be a rational for 1024. Not sure.

>> Someone suggested ETA, and it seems rather simple to do. What about
>> adding it?
>
> I don't know, just computing the ETA is easy of course. But you'd have
> to fiddle with seconds/minutes/hours conversions cause the runtime could
> be hours. Then, you don't want to have it bumping around weirdly when
> your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
> simpler than the signal trigger we might get rid of.

Ok, it is more complicated that it looks for large sizes if second is not 
the right display unit.

-- 
Fabien.



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Mittwoch, den 27.03.2019, 21:31 +0100 schrieb Fabien COELHO:
> > > I still think that the speed should compute a double to avoid integer
> > > rounding errors within the computation. ISTM that rounding should only be
> > > done for display in the end.
> > 
> > Ok, also done this way. I decided to print only full MB and not any
> > further digits as those don't seem very relevant.
> 
> For the computation to be in double, you must convert to double somewhere 
> in the formula, otherwise it is computed as ints and only cast as a double 
> afterwards. Maybe:
> 
>    current_speed = (1000.0 / MEGABYTES) * current_size / elapsed;
> 
> Or anything which converts to double early.

Are you sure, seeing elapsed is a double already? If I print out two
additional fractional digits for current_speed, I get two non-zero
numbers, are those garbage?

Anyway, I've done that now as it doesn't hurt.

> Instead of casting percent to int, maybe use "%.0f" as well, just like 
> current_speed?

Ok.

>   +  if ((blockno % 100 == 0) && show_progress)
> 
> I'd invert the condition to avoid a modulo when not needed.

Ok.

> > > I'm okay with calling the report on each file even if this means every few
> > > GB...
> > 
> > For my I/O throttling branch I've backed off to only call the report
> > every 100 blocks (and at the end of the file). I've added that logic to
> > this patch as well.
> 
> The 100 blocks = 800 KiB looks arbitrary. What is the rational? ISTM that 
> postgres will tend to store a power of two number of pages on full 
> segments, so maybe there could be a rational for 1024. Not sure.

It was arbitrary. I've changed it to 1024 now which looks a bit better.

> > > Someone suggested ETA, and it seems rather simple to do. What about
> > > adding it?
> > 
> > I don't know, just computing the ETA is easy of course. But you'd have
> > to fiddle with seconds/minutes/hours conversions cause the runtime could
> > be hours. Then, you don't want to have it bumping around weirdly when
> > your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
> > simpler than the signal trigger we might get rid of.
> 
> Ok, it is more complicated that it looks for large sizes if second is not 
> the right display unit.

Right, new version attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

>> Or anything which converts to double early.

> Are you sure, seeing elapsed is a double already?


Argh, I missed that. You are right that a double elapsed is enough for the 
second part. However, with

  +  current_speed = (current_size / MEGABYTES) / (elapsed / 1000.0);

the first division is an integer one because both operands are ints, so 
megabytes conversion is rounded down. I'd suggest:

  +  current_speed = ((double) current_size / MEGABYTES) / (elapsed / 1000);

> If I print out two additional fractional digits for current_speed, I get 
> two non-zero numbers, are those garbage?

Nope, they come from the other divisions, but the fist division will have 
wipped out 0.5 MiB on average.

>> Ok, it is more complicated that it looks for large sizes if second is not
>> the right display unit.
>
> Right, new version attached.

Applies, compiles, global & local "make check" (although not tested) ok, 
doc build ok, manual tests ok.

Otherwise a very minor comment: I'd invert !force and the computations in 
the return condition to avoid the computations when not needed.

-- 
Fabien.



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Donnerstag, den 28.03.2019, 09:41 +0100 schrieb Fabien COELHO:
> Hallo Michael,
> 
> > > Or anything which converts to double early.
> > Are you sure, seeing elapsed is a double already?
> 
> 
> Argh, I missed that. You are right that a double elapsed is enough for the 
> second part. However, with
> 
>   +  current_speed = (current_size / MEGABYTES) / (elapsed / 1000.0);
> 
> the first division is an integer one because both operands are ints, so 
> megabytes conversion is rounded down. I'd suggest:
> 
>   +  current_speed = ((double) current_size / MEGABYTES) / (elapsed / 1000);

Ok.

> > > Ok, it is more complicated that it looks for large sizes if second is not
> > > the right display unit.
> > 
> > Right, new version attached.
> 
> Applies, compiles, global & local "make check" (although not tested) ok, 
> doc build ok, manual tests ok.

Thanks.

> Otherwise a very minor comment: I'd invert !force and the computations in 
> the return condition to avoid the computations when not needed.

The force is only ever true right at the end of the program so it would
not save anything really and detract from the main gist of that
statement, so I left it as-is.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
>> Otherwise a very minor comment: I'd invert !force and the computations in
>> the return condition to avoid the computations when not needed.
>
> The force is only ever true right at the end of the program so it would
> not save anything really and detract from the main gist of that
> statement, so I left it as-is.

Ok.

I marked it as ready, but I'd advise that you split it in (1) progress and 
(2) signal toggling so that the first part is more likely to make it 
before 12 freeze.

-- 
Fabien.



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Donnerstag, den 28.03.2019, 10:08 +0100 schrieb Fabien COELHO:
> I marked it as ready, 

Thanks!

> but I'd advise that you split it in (1) progress and (2) signal
> toggling so that the first part is more likely to make it before 12
> freeze.

Ok, done so in the attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Hallo Michael,

>> but I'd advise that you split it in (1) progress and (2) signal
>> toggling so that the first part is more likely to make it before 12
>> freeze.
>
> Ok, done so in the attached.

Fine.

I think that it is good to show the overall impact of the signal stuff, in 
particular the fact that the size must always be computed if the progress 
may be activated.

Also, I'd suggest to add "volatile" on the show_progress variable in the 
second patch.

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Thu, Mar 28, 2019 at 03:53:59PM +0100, Fabien COELHO wrote:
> I think that it is good to show the overall impact of the signal stuff, in
> particular the fact that the size must always be computed if the progress
> may be activated.

Getting to know the total size and the current size are the two
important factors that matter when it comes to do progress reporting
in my opinion.  I have read the patch, and I am not really convinced
by the need to show the progress report based on an interval of 250ms
as we talk about an operation which could take dozens of minutes.  So
I have simplified the patch to only show a progress report every
second.  This also removes the include for the time-related APIs from
portability/.  A second thing is that I don't think that the speed is
much useful.  I would expect the speed to be steady, still there is a
risk to show incorrect information if the speed of the operation is
spiky or irregular leading to an incorrect estimation of the remaining
time.

In short, I would like to commit the first patch as attached, which is
much more simple than what has been sent previously, still it provides
the progress information which is useful.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Bonjour Michaël,

> Getting to know the total size and the current size are the two
> important factors that matter when it comes to do progress reporting
> in my opinion.  I have read the patch, and I am not really convinced
> by the need to show the progress report based on an interval of 250ms
> as we talk about an operation which could take dozens of minutes.

I do not think that it matters. I like to see things moving, and the 
performance impact is null.

> So I have simplified the patch to only show a progress report every 
> second.  This also removes the include for the time-related APIs from 
> portability/.

I do not think that it is a good idea, because Michael is thinking of 
adding some throttling capability, which would be a very good thing, but 
which will need something precise, so better use the precise stuff from 
the start. Also, the per second stuff induces rounding effects at the 
beginning.

> A second thing is that I don't think that the speed is much useful.

Hmmm. I like this information because I this is where I have expectations, 
whereas I'm not sure whether 1234 seconds for 12.3 GB is good or bad, but 
I know that 10 MB/s on my SSD is not very good.

> I would expect the speed to be steady, still there is a risk to show 
> incorrect information if the speed of the operation is spiky or 
> irregular leading to an incorrect estimation of the remaining time.

Hmmm. That is life, I'd say I'm used to it.

> In short, I would like to commit the first patch as attached, which is
> much more simple than what has been sent previously, still it provides
> the progress information which is useful.

I would prefer that you would keep the patch with the initial precision & 
features for the reasons outlined above, but you are the committer and I'm 
only a reviewer.

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

so we are basically back at the original patch as written by Bernd :)

> +/*
> + * Report current progress status. Parts borrowed from
> + * PostgreSQLs' src/bin/pg_basebackup.c
> + */
> +static void
> +progress_report(bool force)
> +{
> +     int                     percent;
> +     char            total_size_str[32];
> +     char            current_size_str[32];
> +     pg_time_t       now;
> +
> +     if (!showprogress)
> +             return;

The way you've now changed this is that there's a function call into
progress_report() for every block that's being read, even if there is no
progress reporting requested. That looks like a pretty severe
performance problem so I suggest to at least stick with checking
showprogress before calling progress_report() and not the other way
round.

I guess there is still some noticeable performance penalty to pay when
reporting progress, even more so now that you reverted to only doing it
once per second - if we report progress, we presumably call
report_progress() thousands of times and return early 99,9% of the time.

So my vote is in favour of only calling progress_report() every once in
a while - I saw quite a speedup (or removal of slowdown) due to this in
my tests, this was not just some unwarranted microoptimization.

> +     fprintf(stderr, "\r");

I think the isatty() check that was in our versions of the patch is
useful here, did you check how this looks when piping the output to a
file?

> @@ -195,12 +254,23 @@ scan_file(const char *fn, BlockNumber segmentno)
>                                       _("%s: checksums enabled in file
\"%s\"\n"), progname, fn);
>       }
>  
> +     /* Make sure progress is reported at least once per file */
> +     progress_report(false);
> +
>       close(f);
>  }

This hunk is from the performance optimization of calling
progress_report only every 1024 blocks and could be removed if we stick
with calling it for every block anyway (but see above).

> -static void
> -scan_directory(const char *basedir, const char *subdir)
> +/*
> + * Scan the given directory for items which can be checksummed and
> + * operate on each one of them.  If "sizeonly" is true, the size of
> + * all the items which have checksums is computed and returned back

"computed" is maybe a bit strong a word here, how about "added up"?


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Sat, Mar 30, 2019 at 09:07:41PM +0100, Michael Banck wrote:
> The way you've now changed this is that there's a function call into
> progress_report() for every block that's being read, even if there is no
> progress reporting requested. That looks like a pretty severe
> performance problem so I suggest to at least stick with checking
> showprogress before calling progress_report() and not the other way
> round.
>
> So my vote is in favour of only calling progress_report() every once in
> a while - I saw quite a speedup (or removal of slowdown) due to this in
> my tests, this was not just some unwarranted microoptimization.

Do you have some runtime numbers?  If all the pages are in the OS
cache you should be able to see more impact.  I have been testing with
a pgbench database at scale 300 and --check, and perf is showing me
that progress_report counts for 0.10% with or without the option of
the profile while pg_checksum_page() counts for 36%.  Switching the
switch in or out of it makes the performance change lost in the noise.
I'd rather keep the check for showprogress within progress_report() so
as extra callers don't miss bypassing the report if --progress is not
enabled, still we are talking about only one caller now so the
attached does it the other way around with an assertion.

>> +     fprintf(stderr, "\r");
>
> I think the isatty() check that was in our versions of the patch is
> useful here, did you check how this looks when piping the output to a
> file?

Oops, fixed.  The output was strange.

> This hunk is from the performance optimization of calling
> progress_report only every 1024 blocks and could be removed if we stick
> with calling it for every block anyway (but see above).

Indeed, removed this one.

>> -static void
>> -scan_directory(const char *basedir, const char *subdir)
>> +/*
>> + * Scan the given directory for items which can be checksummed and
>> + * operate on each one of them.  If "sizeonly" is true, the size of
>> + * all the items which have checksums is computed and returned back
>
> "computed" is maybe a bit strong a word here, how about "added up"?

"computed" sounds fine to me.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Sat, Mar 30, 2019 at 06:52:56PM +0100, Fabien COELHO wrote:
> I do not think that it matters. I like to see things moving, and the
> performance impact is null.

Another point is that this bloats the logs redirected to a file by 4
compared to the initial proposal.  I am not sure that this helps
much for anybody.

> I do not think that it is a good idea, because Michael is thinking of adding
> some throttling capability, which would be a very good thing, but which will
> need something precise, so better use the precise stuff from the start.
> Also, the per second stuff induces rounding effects at the beginning.

Let's revisit that when the need shows up then.  I'd rather have us
start with a basic set of metrics which can be extended later on.

> Hmmm. I like this information because I this is where I have expectations,
> whereas I'm not sure whether 1234 seconds for 12.3 GB is good or bad, but I
> know that 10 MB/s on my SSD is not very good.

Well, with some progress generated once per second you are one
substraction away to guess how much has been computed in the last N
second...
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Bonjour Michaël,

>> I do not think that it matters. I like to see things moving, and the
>> performance impact is null.
>
> Another point is that this bloats the logs redirected to a file by 4
> compared to the initial proposal.  I am not sure that this helps
> much for anybody.

Hmmm. Progress is more an interactive feature where the previous result is 
overriden thanks to the \r. Maybe it should be -P X where X is the 
expected delay in seconds. Pgbench progress reporting on initialization 
basically outputs 10 rows per second, probably it is too much.

>> I do not think that it is a good idea, because Michael is thinking of adding
>> some throttling capability, which would be a very good thing, but which will
>> need something precise, so better use the precise stuff from the start.
>> Also, the per second stuff induces rounding effects at the beginning.
>
> Let's revisit that when the need shows up then.  I'd rather have us
> start with a basic set of metrics which can be extended later on.

I do not see why it would be better to do it roughly if it is already 
implemented precisely and nicely.

>> Hmmm. I like this information because I this is where I have expectations,
>> whereas I'm not sure whether 1234 seconds for 12.3 GB is good or bad, but I
>> know that 10 MB/s on my SSD is not very good.
>
> Well, with some progress generated once per second you are one
> substraction away to guess how much has been computed in the last N
> second...

I would prefer to have the speed simply printed out.

The per second or more thing is debatable, but for the other changes I do 
not think that they improve the feature much.

As I said, I'm only a reviewer, you do as you feel.

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Montag, den 01.04.2019, 15:10 +0900 schrieb Michael Paquier:
> On Sat, Mar 30, 2019 at 09:07:41PM +0100, Michael Banck wrote:
> > The way you've now changed this is that there's a function call into
> > progress_report() for every block that's being read, even if there is no
> > progress reporting requested. That looks like a pretty severe
> > performance problem so I suggest to at least stick with checking
> > showprogress before calling progress_report() and not the other way
> > round.
> > 
> > So my vote is in favour of only calling progress_report() every once in
> > a while - I saw quite a speedup (or removal of slowdown) due to this in
> > my tests, this was not just some unwarranted microoptimization.
> 
> Do you have some runtime numbers?

I had another look and I don't see any slowdown with your patch.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Mon, Apr 01, 2019 at 08:51:03PM +0200, Michael Banck wrote:
> I had another look and I don't see any slowdown with your patch.

I noticed one slowdown when using --disable --progress as this was
scanning the data directory for the total size but we don't need it in
this case.  Fixed that and committed.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Mon, Apr 01, 2019 at 07:26:00PM +0200, Fabien COELHO wrote:
> Hmmm. Progress is more an interactive feature where the previous result is
> overriden thanks to the \r.

Well, many people also redirect the output for such things.

> Maybe it should be -P X where X is the expected
> delay in seconds. Pgbench progress reporting on initialization basically
> outputs 10 rows per second, probably it is too much.

I cannot say for pgbench.  I personally think that's a lot but you are
the one who wrote it as such I guess.

> I do not see why it would be better to do it roughly if it is already
> implemented precisely and nicely.

Simple things can be extended later on, while complicated things
cannot, and we don't have similar metrics for other tools which may
make sense for them to have (not pg_rewind, but pg_basebackup).
Please note that progress reports on the backend also include total
amount of data to process vs current amount of data processed, which
is reliable output.  The speed may be nice, but it is easy enough to
see in an output file where things get stuck even if there is no
speed showing up (or maybe just the difference with the last progress
makes more sense to have?).
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
Bonjour Michaël,

>> Maybe it should be -P X where X is the expected
>> delay in seconds. Pgbench progress reporting on initialization basically
>> outputs 10 rows per second, probably it is too much.
>
> I cannot say for pgbench.  I personally think that's a lot but you are
> the one who wrote it as such I guess.

Nope, this was way before I intervened. ISTM that a patch was submitted to 
get per second or slower progress reporting on the initalization, and it 
was rejected. Now that there are many SSD, maybe I could bring it back. An 
issue with pbbench is that the reported time and progress is for the 
insertion phase only, but PK and other FK declaration take as much time 
and are not included, so I'm not sure it can be much better.

For pg_checksums, probably some improvement patch will be submitted later, 
if someone feels like it.

-- 
Fabien.

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Tue, Apr 02, 2019 at 08:11:45AM +0200, Fabien COELHO wrote:
> Nope, this was way before I intervened. ISTM that a patch was submitted to
> get per second or slower progress reporting on the initalization, and it was
> rejected. Now that there are many SSD, maybe I could bring it back. An issue
> with pbbench is that the reported time and progress is for the insertion
> phase only, but PK and other FK declaration take as much time and are not
> included, so I'm not sure it can be much better.

Bonus idea: integrate automatically with pg_stat_progress_vacuum() for
the last vacuum at the end of initialization.  This step can also take
a very long time and one can only know the progress of the VACUUM with
a second session/terminal.

> For pg_checksums, probably some improvement patch will be submitted later,
> if someone feels like it.

Let's see.  I think that what we have now in v12 is good enough for
checksum operations on an offline cluster.  And my take is that we
should focus more on online checksum verification for v13 and future
versions.

Regarding all this tooling around checksums.  With v12, enabling
checksums with no actual downtime is doable with a primary-standby
deployment using physical replication and one planned failover
(possible as well with v10 and newer versions and logical replication
but that takes longer), so I think that there is not much value in
being able to enable checksums on a single node while it is online.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Fabien COELHO
Date:
>> For pg_checksums, probably some improvement patch will be submitted later,
>> if someone feels like it.
>
> Let's see.  I think that what we have now in v12 is good enough for
> checksum operations on an offline cluster.  And my take is that we
> should focus more on online checksum verification for v13 and future
> versions.

I agree.

For online, we should want throttling so that the check can have a reduced 
performance impact when scrubbing.

-- 
Fabien.



Re: Progress reporting for pg_verify_checksums

From
Michael Banck
Date:
Hi,

Am Dienstag, den 02.04.2019, 15:56 +0900 schrieb Michael Paquier:
> Regarding all this tooling around checksums.  With v12, enabling
> checksums with no actual downtime is doable with a primary-standby
> deployment using physical replication and one planned failover

Can you explain in more detail how this would work? I thought we came to
the conclusion (and the documentation seems to indicate so), that you
should stop all participating instances of a cluster and then enable
checksums on all of them, which would impose a downtime.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz



Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Tue, Apr 02, 2019 at 04:02:59PM +0200, Michael Banck wrote:
> Can you explain in more detail how this would work? I thought we came to
> the conclusion (and the documentation seems to indicate so), that you
> should stop all participating instances of a cluster and then enable
> checksums on all of them, which would impose a downtime.

That's what I was pointing out here:
https://www.postgresql.org/message-id/20190320225924.GC20192@paquier.xyz

I still agree about keeping in the docs safer recommendations, in the
shape of the ones currently present, than what I mentioned on the
other thread.
--
Michael

Attachment

Re: Progress reporting for pg_verify_checksums

From
Michael Paquier
Date:
On Tue, Apr 02, 2019 at 10:10:34AM +0200, Fabien COELHO wrote:
> For online, we should want throttling so that the check can have a reduced
> performance impact when scrubbing.

Yes.  Throttling is a necessary property in my opinion, perhaps in
combination with some autovacuum-like options to only trigger the
checks for a relation after a certain amout of activity has happened
on it.
--
Michael

Attachment