Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side
Date
Msg-id 20969226-5617-9c60-68a6-06b3768ce03f@oss.nttdata.com
Whole thread Raw
In response to Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: pg_stat_progress_basebackup - progress reporting forpg_basebackup, in the server side  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers

On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
> At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao.fujii@oss.nttdata.com> wrote in
>> Hi,
>>
>> pg_basebackup reports the backup progress if --progress option is
>> specified,
>> and we can monitor it in the client side. I think that it's useful if
>> we can
>> monitor the progress information also in the server side because, for
>> example,
>> we can easily check that by using SQL. So I'd like to propose
>> pg_stat_progress_basebackup view that allows us to monitor the
>> progress
>> of pg_basebackup, in the server side. Thought?
>>
>> POC patch is attached.
> 
> | postgres=# \d pg_stat_progress_basebackup
> |          View "pg_catalog.pg_stat_progress_basebackup"
> |        Column        |  Type   | Collation | Nullable | Default
> | ---------------------+---------+-----------+----------+---------
> |  pid                 | integer |           |          |
> |  phase               | text    |           |          |
> |  backup_total        | bigint  |           |          |
> |  backup_streamed     | bigint  |           |          |
> |  tablespace_total    | bigint  |           |          |
> |  tablespace_streamed | bigint  |           |          |
> 
> I think the view needs client identity such like host/port pair
> besides pid (host/port pair fails identify client in the case of
> unix-sockets.).

I don't think this is job of a progress reporting. If those information
is necessary, we can join this view with pg_stat_replication using
pid column as the join key.

> Also elapsed time from session start might be
> useful.

+1
I think that not only pg_stat_progress_basebackup but also all the other
progress views should report the time when the target command was
started and the time when the phase was last changed. Those times
would be useful to estimate the remaining execution time from the
progress infromation.

> pg_stat_progress_acuum has datid, datname and relid.
> 
> +    if (backup_total > 0 && backup_streamed > backup_total)
> +    {
> +        backup_total = backup_streamed;
> 
> Do we need the condition "backup_total > 0"?

I added the condition for the case where --progress option is not supplied
in pg_basebackup, i.e., the case where the total amount of backup is not
estimated at the beginning of the backup. In this case, total_backup is
always 0.

> +        int        tblspc_streamed = 0;
> +
> +        pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
> +                                     PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);
> 
> This starts "streaming backup" phase with backup_total = 0. Coudln't
> we move to that phase after setting backup total and tablespace total?
> That is, just after calling SendBackupHeader().

OK, that's a bit less confusing for users. I will change in that way.

> +            WHEN 3 THEN 'stopping backup'::text
> 
> I'm not sure, but the "stop" seems suggesting the backup is terminated
> before completion. If it is following the name of the function
> pg_stop_backup, I think the name is suggesting to stop "the state for
> performing backup", not a backup.
> 
> In the first place, the progress is about "backup" so it seems strange
> that we have another phase after the "stopping backup" phase.  It
> might be better that it is "finishing file transfer" or such.
> 
>     "initializing"
> -> "starting file transfer"
> -> "transferring files"
> -> "finishing file transfer"
> -> "transaferring WAL"

Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? So

initializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL files

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: parens cleanup
Next
From: Robert Haas
Date:
Subject: Re: Enabling B-Tree deduplication by default