Re: Progress reporting for pg_verify_checksums - Mailing list pgsql-hackers

From Fabien COELHO
Subject Re: Progress reporting for pg_verify_checksums
Date
Msg-id alpine.DEB.2.21.1809201058240.2239@lancre
Whole thread Raw
In response to Re: Progress reporting for pg_verify_checksums  (Michael Banck <michael.banck@credativ.de>)
Responses Re: Progress reporting for pg_verify_checksums
List pgsql-hackers
Hallo Michael,

About patch v3: applies cleanly and compiles.

The xml documentation should be updated! (It is kind of hard to notice 
what is not there:-)

See "doc/src/sgml/ref/pg_verify_checksums.sgml".

>> The time() granularity to the second makes the result awkward on small
>> tests:
>>
>>  8/1554552 kB (0%, 8 kB/s)
>>  1040856/1554552 kB (66%, 1040856 kB/s)
>>  1554552/1554552 kB (100%, 1554552 kB/s)
>>
>> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
>> much better precision.

I still think it would make sense to use that instead of low-precision 
time().

>> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
>> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
>> about storage which is usually measured in powers of 1,000, so I'd suggest
>> to use thousands.
>>
>> Another reserve I have is that on any hardware it is likely that there will
>> be a lot of kilos, so maybe megas (MB) should be used instead.
>
> The display is exactly the same as in pg_basebackup (except the
> bandwith is shown as well), so right now I think it is more useful to be
> consistent here.

Hmmm... units are units, and the display is wrong. The fact that other 
commands are wrong as well does not look like a good argument to persist 
in an error.

> So if we change that, pg_basebackup should be changed as well I think.

I'm okay with fixing pg_basebackup as well! I'm unsure about the best 
place to put such a function, though. Probably under "src/common/" where 
there is already some front-end specific code ("fe_memutils.c").

> Maybe the code could be factored out to some common file in the future.

I would be okay with a progress printing function shared between some 
commands. It just needs some place. pg_rewind also has a --rewind option.

>> Maybe it would be nice to show elapsed time and expected completion time
>> based on the current speed.
>
> Maybe; this could be added to the factored out common code I mentioned
> above.

Yep.

>> I would be in favor or having this turned on by default, and silenced with
>> some option. I'm not sure other people would agree, though, so it is an open
>> question as well.
>
> If this runs in a script, it would be pretty annoying, so everybody
> would have to add --no-progress so I am not convinced. Also, both
> pg_basebackup and pgbench don't show progress by default.
>

Ok.

>> I do not see much advantage in using intermediate string buffers for values:
>>
>>  + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>>  +          total_size / 1024);
>>  + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>>  +          current_size / 1024);
>>  + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>>  +          (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
>>  + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>>  +          currentstr, totalstr, total_percent, currspeedstr);
>>
>> ISTM that just one simple fprintf would be fine:
>>
>>   fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>>           formulas for each slot...);
>
> That code is adopted from pg_basebackup.c and the comment there says:
>
> |     * Separate step to keep platform-dependent format code out of
> |     * translatable strings.  And we only test for INT64_FORMAT
> |     * availability in snprintf, not fprintf.
>
> So that sounds legit to me.

Hmmm. Translation. Not sure I like much the idea of translating units,
but why not.

> |     * If we are reporting to a terminal, send a carriage return so that
> |     * we stay on the same line.  If not, send a newline.
>
>> And it runs a little amok with "-v".
>
> Do you suggest we should make those mutually exlusive?  I agree that in
> general, -P is not very useful if -v is on, but if you have a really big
> table, it should still be, no?

Yep. I was just mentionning that they interfere on a terminal, but I agree
that there is no obvious fix.

>>  + memset(&act, '\0', sizeof(act));
>>
>> pg uses 0 instead of '\0' everywhere else.
>
> Ok.

Not '0', plain 0 (the integer of value zero).

-- 
Fabien.


pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: pgbench - add pseudo-random permutation function
Next
From: Tomas Vondra
Date:
Subject: Re: [PATCH] Improve geometric types