Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers
From | Amit Langote |
---|---|
Subject | Re: [PROPOSAL] VACUUM Progress Checker. |
Date | |
Msg-id | 56DD2ACE.5050208@lab.ntt.co.jp |
In response to | Re: [PROPOSAL] VACUUM Progress Checker. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: [PROPOSAL] VACUUM Progress Checker. |
List | pgsql-hackers |
Horiguchi-san,

Thanks a lot for taking a look!

On 2016/03/07 13:02, Kyotaro HORIGUCHI wrote:
> At Sat, 5 Mar 2016 16:41:29 +0900, Amit Langote wrote:
>> On Sat, Mar 5, 2016 at 4:24 PM, Amit Langote <amitlangote09@gmail.com> wrote:
>>> So, I took the Vinayak's latest patch and rewrote it a little
>> ...
>>> I broke it into two:
>>>
>>> 0001-Provide-a-way-for-utility-commands-to-report-progres.patch
>>> 0002-Implement-progress-reporting-for-VACUUM-command.patch
>>
>> Oops, unamended commit messages in those patches are misleading. So,
>> please find attached corrected versions.
>
> The 0001-P.. adds the following interface functions.
>
> +extern void pgstat_progress_set_command(BackendCommandType cmdtype);
> +extern void pgstat_progress_set_command_target(Oid objid);
> +extern void pgstat_progress_update_param(int index, uint32 val);
> +extern void pgstat_reset_local_progress(void);
> +extern int pgstat_progress_get_num_param(BackendCommandType cmdtype);
>
> I don't like to treat the target object id differently from other
> parameters. It could not be needed at all, or could be needed two
> or more in contrast. Although oids are not guaranteed to fit
> uint32, we have already stored BlockNumber there.

I thought giving cmdtype and objid each its own slot would make things a
little bit clearer than stuffing them into st_progress_param[0] and
st_progress_param[1], respectively. Is that what you are suggesting?
Although, as I've done it, a separate field st_command_objid may be a bit
too much.

If they are not special fields, I think we don't need the special interface
functions *set_command() and *set_command_target(). But I am still
inclined toward keeping the former.

> # I think that integer arrays might be needed to be passed as a
> # parameter, but it would be the another issue.

Didn't really think about it. Maybe we should consider a scenario that
would require it.

> pg_stat_get_progress_info returns a tuple with 10 integer columns
> (plus an object id). The reason why I suggested use of an integer
> array is that it allows the API to serve arbitrary number of
> parameters without a modification of API, and array indexes are
> more colorless than any concrete names. However I don't stick to
> that if we agree that it is ok to have fixed number of parameters.

I think the number of parameters is fixed, in the form of a fixed-size
array, because st_progress_param[] is part of a shared memory structure,
as discussed before. Although such an interface has been roughly modeled
on how the pg_statistic catalog and the pg_stats view or the
get_attstatsslot() function work, shared memory structures take the place
of the catalog, so there are some restrictions (the fixed-size array being
one).

Regarding the index into st_progress_param[], pgstat.c/pgstatfuncs.c
should not care what it is. As exemplified in patch 0002, individual index
numbers can be defined as macros by individual command modules (suggested
by Robert recently), following some convention for readability, such as
the following in vacuumlazy.c:

    #define PROG_PAR_VAC_RELID 0
    #define PROG_PAR_VAC_PHASE_ID 1
    #define PROG_PAR_VAC_HEAP_BLKS 2
    #define PROG_PAR_VAC_CUR_HEAP_BLK 3
    ... so on.

Then, to report a changed parameter:

    pgstat_progress_update_param(PROG_PAR_VAC_PHASE_ID, LV_PHASE_SCAN_HEAP);
    ...
    pgstat_progress_update_param(PROG_PAR_VAC_CUR_HEAP_BLK, blkno);

By the way, the following is proargnames[] for pg_stat_get_progress_info():

    cmdtype integer,
    OUT pid integer,
    OUT param1 integer,
    OUT param2 integer,
    ...
    OUT param10 integer

So, it is the responsibility of a command-specific progress view
definition to interpret the values of param1..param10 appropriately. In
fact, it is the implementer of progress reporting for a command who
determines what goes into which slot of st_progress_param[] to begin with.

> pgstat_progress_get_num_param looks not good in the aspect of
> genericity. I'd like to define it as an integer array indexed
> by the command type if it is needed. However it seems to me to be
> enough that pg_stat_get_progress_info always returns 10 integers
> regardless of what the numbers are for. The user sql function,
> pg_stat_vacuum_progress as the first user, knows how many numbers
> should be read for its work. It reads zeroes safely even if it
> reads more than what the producer side offered (unless it tries
> to divide something with it).

Thinking a bit, perhaps we don't need the num_param(cmdtype) function or
array at all, as you seem to suggest. It serves no useful purpose now that
I see it. pg_stat_get_progress_info() should simply copy
st_progress_param[0...PG_STAT_GET_PROGRESS_COLS-1] to the result, and the
view definer knows what's what.

Attached are updated patches which incorporate the above-mentioned changes.
If Vinayak has something else in mind about anything, he can weigh in.

Thanks,
Amit
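
The slot convention discussed in this message can be sketched in standalone C. This is only an illustration under stated assumptions, not code from the attached patches: ProgressSlot, N_PROGRESS_PARAM, CMD_VACUUM, and the progress_* functions are hypothetical stand-ins for the shared-memory entry and the pgstat_progress_* interface, and a process-local static variable takes the place of shared memory. Only the PROG_PAR_VAC_* macro names are taken from the mail.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fixed number of slots (10 in the mail). */
#define N_PROGRESS_PARAM 10

/* Hypothetical command type value; 0 is taken to mean "no command". */
#define CMD_VACUUM 1

/* Slot indexes are owned by the command module, not by the generic code. */
#define PROG_PAR_VAC_RELID        0
#define PROG_PAR_VAC_PHASE_ID     1
#define PROG_PAR_VAC_HEAP_BLKS    2
#define PROG_PAR_VAC_CUR_HEAP_BLK 3

/* Stand-in for the per-backend entry; fixed size, as shared memory requires. */
typedef struct ProgressSlot
{
    int      cmdtype;
    uint32_t param[N_PROGRESS_PARAM];
} ProgressSlot;

static ProgressSlot my_slot;

/* Start reporting for a command: clear every slot, then record the type. */
static void
progress_set_command(int cmdtype)
{
    memset(&my_slot, 0, sizeof(my_slot));
    my_slot.cmdtype = cmdtype;
}

/* Generic update: the index is opaque here, only bounds are checked. */
static void
progress_update_param(int index, uint32_t val)
{
    assert(index >= 0 && index < N_PROGRESS_PARAM);
    my_slot.param[index] = val;
}

/*
 * Reader side: copy all slots regardless of how many the producer used.
 * Unused slots read as zero because progress_set_command() cleared them.
 */
static void
progress_get_info(uint32_t out[N_PROGRESS_PARAM])
{
    memcpy(out, my_slot.param, sizeof(my_slot.param));
}
```

With this shape, the generic layer needs no per-command parameter count: a reader always receives N_PROGRESS_PARAM values, and the command-specific view decides which of them are meaningful.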