Re: [PROPOSAL] VACUUM Progress Checker. - Mailing list pgsql-hackers

From Amit Langote
Subject Re: [PROPOSAL] VACUUM Progress Checker.
Msg-id CA+HiwqGMrT0DeJ6Mduw4qOAnZnATQLLrtsEmYXjc1GosHzQvQw@mail.gmail.com
In response to Re: [PROPOSAL] VACUUM Progress Checker.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
Responses Re: [PROPOSAL] VACUUM Progress Checker.  (Rahila Syed <rahilasyed90@gmail.com>)
Re: [PROPOSAL] VACUUM Progress Checker.  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Mar 11, 2016 at 2:31 PM, Amit Langote
<Langote_Amit_f8@lab.ntt.co.jp> wrote:
> On 2016/03/11 13:16, Robert Haas wrote:
>> On Thu, Mar 10, 2016 at 9:04 PM, Amit Langote
>> <Langote_Amit_f8@lab.ntt.co.jp> wrote:
>>> So, from what I understand here, we should not put total count of index
>>> pages into st_progress_param; rather, have the client (reading
>>> pg_stat_progress_vacuum) derive it using pg_indexes_size() (?), as and
>>> when necessary.  However, only server is able to tell the current position
>>> within an index vacuuming round (or how many pages into a given index
>>> vacuuming round), so report that using some not-yet-existent mechanism.
>>
>> Isn't that mechanism what you are trying to create in 0003?
>
> Right, 0003 should hopefully become that mechanism.

About 0003:

Earlier, it was trying to report the vacuumed index block count using
the lazy_tid_reaped() callback, for which I had added an index_blkno
argument to IndexBulkDeleteCallback. Turns out it's not such a good
place to do what we are trying to do.  This callback is called for
every heap pointer in an index. Not all index pages contain heap
pointers, which means the existing callback does not let us count
all the index blocks that the AM would read to finish a given index
vacuum run.
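
To illustrate the undercounting, here is a minimal standalone sketch. It does not use PostgreSQL code: MockIndex and blocks_seen_via_tid_callback() are hypothetical names made up for this example, and a page with zero heap pointers stands in for, say, a metapage.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/*
 * Hypothetical mock of an index: tids_per_page[i] is the number of heap
 * pointers stored on page i.  A value of 0 models a page that holds no
 * heap pointers at all.
 */
typedef struct MockIndex
{
    size_t      npages;
    const int  *tids_per_page;
} MockIndex;

/*
 * Count the distinct blocks that a callback fired once per heap pointer
 * would ever get to see.  Pages without heap pointers never trigger the
 * callback, so they are never counted.
 */
BlockNumber
blocks_seen_via_tid_callback(const MockIndex *index)
{
    BlockNumber seen = 0;

    assert(index != NULL);
    for (size_t blkno = 0; blkno < index->npages; blkno++)
    {
        bool        reported = false;

        for (int i = 0; i < index->tids_per_page[blkno]; i++)
        {
            /* The per-heap-pointer callback would fire here. */
            if (!reported)
            {
                seen++;
                reported = true;
            }
        }
    }
    return seen;
}
```

With four pages holding 0, 3, 0 and 2 heap pointers, the function returns 2 even though a vacuum of that index would read all four pages.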

Instead, the attached patch adds an IndexBulkDeleteProgressCallback
which AMs should call for every block that's read (say, right before a
call to ReadBufferExtended) as part of a given vacuum run. The
callback, with the help of some bookkeeping state, can count each block
and report it to the pgstat_progress API. Now, I am not sure if all AMs
read 1..N blocks for every vacuum or if it's possible that some blocks
are read more than once in a single vacuum, etc.  IOW, some AMs'
processing may be non-linear and counting blocks 1..N (where N is the
reported total index blocks) may not be possible.  However, this is the
best I could think of for doing what we are trying to do here. Maybe
index AM experts can chime in on that.
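
A rough standalone sketch of the shape of this idea, not the code in the attached patch: the callback signature, the IndexVacProgress struct, count_index_block() and mock_am_bulkdelete() are all assumptions made for illustration, and the real pgstat progress call is only indicated by a comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;

/* Hypothetical signature; the one in the actual patch may differ. */
typedef void (*IndexBulkDeleteProgressCallback) (void *progress_state);

/* Hypothetical bookkeeping state kept by the caller (lazy vacuum). */
typedef struct IndexVacProgress
{
    BlockNumber total_blocks;   /* N, the reported total index blocks */
    BlockNumber blocks_done;    /* blocks counted so far in this run */
} IndexVacProgress;

/* Callback: count one block; real code would also report the new value. */
void
count_index_block(void *progress_state)
{
    IndexVacProgress *progress = (IndexVacProgress *) progress_state;

    assert(progress != NULL);
    progress->blocks_done++;
    /* Reporting through the pgstat progress API would go here. */
}

/*
 * Stand-in for an AM's bulk-delete loop that reads each of nblocks
 * blocks exactly once, invoking the callback right before each read.
 */
void
mock_am_bulkdelete(BlockNumber nblocks,
                   IndexBulkDeleteProgressCallback progress_cb,
                   void *progress_state)
{
    for (BlockNumber blkno = 0; blkno < nblocks; blkno++)
    {
        if (progress_cb != NULL)
            progress_cb(progress_state);
        /* A real AM would call ReadBufferExtended() for blkno here. */
    }
}
```

If an AM read some block twice in one run, blocks_done in this sketch would end up larger than total_blocks, which is exactly the non-linear case I am unsure about.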

Thoughts?

Thanks,
Amit
