Re: adding status for COPY progress report - Mailing list pgsql-hackers
From | Matthias van de Meent |
---|---|
Subject | Re: adding status for COPY progress report |
Date | |
Msg-id | CAEze2Wg7-9NW_WPjPECVGcxp8SuxWxjzv8fT163=OdEHvHvT2A@mail.gmail.com Whole thread Raw |
In response to | Re: adding status for COPY progress report (Zhihong Yu <zyu@yugabyte.com>) |
Responses |
Re: adding status for COPY progress report
|
List | pgsql-hackers |
On Tue, 24 May 2022 at 22:12, Zhihong Yu <zyu@yugabyte.com> wrote: > > On Tue, May 24, 2022 at 12:37 PM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote: >> >> On Tue, 24 May 2022 at 19:13, Zhihong Yu <zyu@yugabyte.com> wrote: >> > >> > Hi, >> > Please see attached for enhancement to COPY command progress. >> > >> > The added status column would allow users to get the status of the most recent COPY command. >> >> I fail to see the merit of retaining completed progress reporting >> commands in their views after completion, other than making the >> behaviour of the pg_stat_progress-views much more complicated and >> adding overhead in places where we want the system to have as little >> overhead as possible. >> >> Trying to get the status of a COPY command after it finished on a >> different connection seems weird, as that other connection is likely >> to have already disconnected / started another task. To be certain >> that a backend can see the return status of the COPY command, you'd >> have to be certain that the connection doesn't run any other >> _progress-able commands in the following seconds / minutes, which >> implies control over the connection, which means you already have >> access to the resulting status of your COPY command. >> >> Regarding the patch: I really do not like that this leaks entries into >> all _progress views: I get garbage data from e.g. the _create_index >> and _copy views when VACUUM is running, etc, because you removed the >> filter on cmdtype. >> Also, the added fields in CopyToStateData / CopyFromStateData seem >> useless when a pgstat_progress_update_param in the right place should >> suffice. >> >> Kind regards, >> >> Matthias van de Meent > > Hi, > For #2 above, can you let me know where the pgstat_progress_update_param() call(s) should be added ? > In my patch, pgstat_progress_update_param() is called from error callback and EndCopy(). In the places that the patch currently sets cstate->status it could instead directly call pgstat_progress_update_param(..., STATUS_VALUE). I'm fairly certain that EndCopy is not called when the error callback is called, so status reporting should not be overwritten when unconditionally setting the status to OK in EndCopy. > For #1, if I use param18 (which is not used by other views), would that be better ? No: /* - * Report values for only those backends which are running the given - * command. + * Report values for only those backends which are running or have run. */ - if (!beentry || beentry->st_progress_command != cmdtype) + if (!beentry || beentry->st_progress_command_target == InvalidOid) continue; This change removes the filter that ensures that we only return the backends which have a st_progress_command of the correct cmdtype (i.e. for _progress_copy only those that have st_progress_command == PROGRESS_COMMAND_COPY. Without that filter, you'll return all backends that have (or have had) their progress fields set at any point. Which means that the expectation of "the backends returned by pg_stat_get_progress_info are those running the requested command" will be incorrect - you'll violate the contract / documented behaviour of the function: "Returns command progress information for the named command.". The numerical index of the column thus doesn't matter, what matters is that you want special behaviour for only the COPY progress reporting that doesn't fit with the rest of the progress-reporting infrastructure, and that the patch as-is breaks all progress reporting views. Sidenote: The change is also invalid because the rows that we expect to return for pg_stat_progress_basebackup always have st_progress_command_target == InvalidOid, so the backends running BASE_BACKUP would never be visible with the change as-is. COPY (SELECT stuff) also would not show up, because that too reports a command_target of InvalidOid. Either way, I don't think that this idea is worth pursuing: the progress views are explicitly there to show the progress of currently active backends, and not to show the last progress state of backends that at some point ran a progress-reporting-enabled command. Kind regards, Matthias van de Meent
pgsql-hackers by date: