Re: Progress reporting for pg_verify_checksums - Mailing list pgsql-hackers

From Michael Banck
Subject Re: Progress reporting for pg_verify_checksums
Date
Msg-id 1553758003.4884.39.camel@credativ.de
Whole thread Raw
In response to Re: Progress reporting for pg_verify_checksums  (Fabien COELHO <coelho@cri.ensmp.fr>)
Responses Re: Progress reporting for pg_verify_checksums
List pgsql-hackers
Hi,

Am Mittwoch, den 27.03.2019, 21:31 +0100 schrieb Fabien COELHO:
> > > I still think that the speed should compute a double to avoid integer
> > > rounding errors within the computation. ISTM that rounding should only be
> > > done for display in the end.
> > 
> > Ok, also done this way. I decided to print only full MB and not any
> > further digits as those don't seem very relevant.
> 
> For the computation to be in double, you must convert to double somewhere 
> in the formula, otherwise it is computed as ints and only cast as a double 
> afterwards. Maybe:
> 
>    current_speed = (1000.0 / MEGABYTES) * current_size / elapsed;
> 
> Or anything which converts to double early.

Are you sure, seeing elapsed is a double already? If I print out two
additional fractional digits for current_speed, I get two non-zero
numbers, are those garbage?

Anyway, I've done that now as it doesn't hurt.

> Instead of casting percent to int, maybe use "%.0f" as well, just like 
> current_speed?

Ok.

>   +  if ((blockno % 100 == 0) && show_progress)
> 
> I'd invert the condition to avoid a modulo when not needed.

Ok.

> > > I'm okay with calling the report on each file even if this means every few
> > > GB...
> > 
> > For my I/O throttling branch I've backed off to only call the report
> > every 100 blocks (and at the end of the file). I've added that logic to
> > this patch as well.
> 
> The 100 blocks = 800 KiB looks arbitrary. What is the rational? ISTM that 
> postgres will tend to store a power of two number of pages on full 
> segments, so maybe there could be a rational for 1024. Not sure.

It was arbitrary. I've changed it to 1024 now which looks a bit better.

> > > Someone suggested ETA, and it seems rather simple to do. What about
> > > adding it?
> > 
> > I don't know, just computing the ETA is easy of course. But you'd have
> > to fiddle with seconds/minutes/hours conversions cause the runtime could
> > be hours. Then, you don't want to have it bumping around weirdly when
> > your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
> > simpler than the signal trigger we might get rid of.
> 
> Ok, it is more complicated that it looks for large sizes if second is not 
> the right display unit.

Right, new version attached.


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.banck@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz
Attachment

pgsql-hackers by date:

Previous
From: Haribabu Kommi
Date:
Subject: Re: Transaction commits VS Transaction commits (with parallel) VSquery mean time
Next
From: Sergei Kornilov
Date:
Subject: Re: Planning counters in pg_stat_statements (using pgss_store)