Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Msg-id CABUevEyi5Zrh06EFPTcrBFk+q=2ASYNymjMZ8qoNryL=gPbRpQ@mail.gmail.com
In response to Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side  (Fujii Masao <masao.fujii@oss.nttdata.com>)
Responses Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Fri, Mar 6, 2020 at 1:51 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
>
>
> On 2020/03/06 0:45, Magnus Hagander wrote:
> > On Wed, Mar 4, 2020 at 11:15 PM Peter Eisentraut
> > <peter.eisentraut@2ndquadrant.com> wrote:
> >>
> >> On 2020-03-05 05:53, Fujii Masao wrote:
> >>> Or, as another approach, it might be worth considering to make
> >>> the server always estimate the total backup size whether --progress is
> >>> specified or not, as Amit argued upthread. If the time required to
> >>> estimate the backup size is negligible compared to total backup time,
> >>> IMO this approach seems better. If we adopt this, we can also get
> >>> rid of the PROGRESS option from the BASE_BACKUP replication command.
> >>
> >> I think that would be preferable.
> >
> > From a UI perspective I definitely agree.
> >
> > The problem with that one is that it can take a non-trivial amount of
> > time, that's why it was made an option (in the protocol) in the first
> > place. Particularly if you have a database with many small objects.
>
> Yeah, this is why I made the server estimate the total backup size
> only when --progress is specified.
>
> Another idea is:
> - Make pg_basebackup specify the PROGRESS option in the BASE_BACKUP command
>    whether --progress is specified or not. This causes the server to estimate
>    the total backup size even when users don't specify --progress.
> - Change pg_basebackup so that it treats the --progress option as just a knob
>    to determine whether to report the progress on the client side.
> - Add a new option like --no-estimate-backup-size (better name?) to
>    pg_basebackup. If this option is specified, pg_basebackup doesn't use
>    PROGRESS in BASE_BACKUP and the server doesn't estimate the backup size.
>
> I believe that the time required to estimate the backup size is not so large
> in most cases, so with the above idea, most users don't need to specify an
> extra option for the estimation. This is good from a UI perspective.
>
> OTOH, users who are worried about the estimation time can use the
> --no-estimate-backup-size option and skip the time-consuming estimation.

Personally, I think this is the best idea. It brings a "reasonable
default", since most people are not going to have this problem, and
yet a good way out of the issue for those that potentially
have it. Especially since we are now already showing the state that
"walsender is estimating the size", it should be easy enough for people
to determine if they need to use this flag or not.

In nitpicking mode, I'd just call the flag --no-estimate-size -- it's
pretty clear things are about backups when you call pg_basebackup, and
it keeps the option a bit more reasonable in length.
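
To make the proposed behaviour concrete, here is a rough sketch of the
client-side decision, under some assumptions: the struct, its field names
and the helper function are made up for illustration and are not taken
from pg_basebackup's source, and label quoting/escaping is ignored. The
only point is that PROGRESS gets sent by default and is dropped only when
the user opts out, independently of --progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical option holder; names are illustrative only. */
typedef struct
{
    bool show_progress;     /* --progress: report progress on the client */
    bool no_estimate_size;  /* proposed --no-estimate-size flag */
} BackupOptions;

/*
 * Build the BASE_BACKUP command text into buf.
 *
 * PROGRESS is included unless the user asked to skip the size estimate;
 * show_progress deliberately plays no role here, since under the proposal
 * it only controls client-side reporting.
 *
 * Returns the snprintf() result (length that would have been written).
 * Assumption: label contains no characters that need escaping.
 */
static int
build_base_backup_command(const BackupOptions *opts, const char *label,
                          char *buf, size_t buflen)
{
    assert(opts != NULL && label != NULL && buf != NULL);

    return snprintf(buf, buflen, "BASE_BACKUP LABEL '%s'%s",
                    label,
                    opts->no_estimate_size ? "" : " PROGRESS");
}
```

With both fields false this yields a command ending in PROGRESS, so the
server estimates the size even though nothing is printed on the client;
setting no_estimate_size drops the keyword and skips the estimate.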

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/


