Re: adding status for COPY progress report - Mailing list pgsql-hackers
From | Zhihong Yu |
---|---|
Subject | Re: adding status for COPY progress report |
Date | |
Msg-id | CALNJ-vSjiHxNDC-2s6R6cC3TFQ9r08tbGOKgj0vxAqXVxWhWZA@mail.gmail.com Whole thread Raw |
In response to | Re: adding status for COPY progress report (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses |
Re: adding status for COPY progress report
|
List | pgsql-hackers |
On Wed, May 25, 2022 at 8:20 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
On Wed, 25 May 2022 at 16:32, Zhihong Yu <zyu@yugabyte.com> wrote:
>
> On Wed, May 25, 2022 at 3:55 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:
>>
>> On Wed, 25 May 2022 at 10:15, Zhihong Yu <zyu@yugabyte.com> wrote:
>> >
>> > Hi,
>> > Patch v3 follows advice from Matthias (status field has been dropped).
>>
>> Could you argue why you think that this should be added to the
>> pg_stat_progress_copy view? Again, the progress reporting subsystem is
>> built to "report the progress of certain commands during command
>> execution". Why do you think we need to go further than that and allow
>> some commands to retain their report even after they've finished
>> executing?
>>
>> Of note: The contents of >st_progress_param are only defined and
>> guaranteed to be consistent when the reporting command is running.
>> Even if no other progress-reporting command is running other commands
>> or processes in that backend may call functions that update the fields
>> with somewhat arbitrary values when no progress-reporting command is
>> actively running, thus corrupting the information for the progress
>> reporting view.
>>
>> Could you please provide some insights on why you think that we should
>> change the progress reporting guts to accomodate something that it was
>> not built for?
>>
>>
>> Kind regards,
>>
>> Matthias van de Meent
>
> Hi, Matthias:
> When I first followed the procedure in https://paquier.xyz/postgresql-2/postgres-14-monitoring-copy/ , I didn't see the output from the view.
> This was because the example used 10 rows where the COPY command finishes quickly.
> I had to increase the row count in order to see output from the system view.
>
> With my patch, the user would be able to see the result of COPY command even if the duration for command execution is very short.
I see that that indeed now happens, but the point of the _progress
-views is that they show progress on tasks that are expected to take a
very long time while the connection that initiated the task does not
receive any updates. Good examples being REINDEX and CLUSTER, that
need to process tables of data (potentially terabytes in size) without
completing or sending meaningful data to the client. To show that
there is progress for such long-running tasks the pgstat_progress
subsystem was developed so that some long-running tasks now would show
their (lack of) progress.
The patch you sent, however, is not expected to be updated with
progress of the command: it is the final state of the command that
won't change. In my view, a backend that finished it's command
shouldn't be shown in pg_stat_progress -views.
Kind regards,
Matthias van de Meent.
Hi, Matthias:
Thanks for taking time to evaluate my patch.
I understand that pg_stat_progress views should show progress for on-going operation.
Let's look at the sequences of user activity for long running COPY command.
The user would likely issue queries to pg_stat_progress_copy over time.
Let's say on Nth invocation, the user sees X tuples copied.
On (N+1)st invocation, the view returns nothing.
The user knows that the COPY may have completed - but did the operation succeed or end up with some error ?
I would think that the user should be allowed to know the answer to the above question using the same query to pg_stat_progress_copy view.
What do you think ?
Cheers
pgsql-hackers by date: