Re: display offset along with block number in vacuum errors - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: display offset along with block number in vacuum errors
Date
Msg-id CAA4eK1Kq-VMB6u5RWtK95EGV3uXgTrAx7O8j4vsK-VbKinZ6Dg@mail.gmail.com
Whole thread Raw
In response to Re: display offset along with block number in vacuum errors  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
Responses Re: display offset along with block number in vacuum errors  (Masahiko Sawada <masahiko.sawada@2ndquadrant.com>)
List pgsql-hackers
On Wed, Aug 19, 2020 at 12:54 PM Masahiko Sawada
<masahiko.sawada@2ndquadrant.com> wrote:
>
> On Tue, 18 Aug 2020 at 13:06, Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > I don't think we need to expose LVRelStats. We can just pass the
> > address of vacrelstats->offset_num to achieve what we want. I have
> > tried that and it works, see the
> > v6-0002-additinal-error-context-information-in-heap_page_ patch
> > attached. Do you see any problem with it?
>
> Yes, you're right. I'm concerned a bit the number of arguments passing
> to heap_page_prune() might get higher when we need other values to
> update for errcontext, but I'm okay with the current patch.
>

Yeah, we might need to think if we want to increase the number of
parameters but not sure we need to worry at this stage. If required, I
think we can either expose LVRelStats or extract a few parameters from
it and form a separate structure that could be passed to
heap_page_prune.

> Currently, we're in SCAN_HEAP phase in heap_page_prune() but should it
> be VACUUM_HEAP instead?
>

I think it is currently similar to what we do in progress reporting.
We set to VACUUM_HEAP phase where the progress reporting is also set
to *HEAP_BLKS_VACUUMED. Currently, heap_page_prune() is covered under
*HEAP_BLKS_SCANNED, so I don't see a pressing need to change the error
context phase for heap_page_prune(). And also, we need to add some
more smarts in heap_page_prune() for this which I want to avoid.

> Also, I've tested the patch with log_min_messages = 'info' and get the
> following sever logs:
>
..
>
> This is not directly related to the patch but it looks like we can
> improve the current errcontext settings. For instance, the message
> from lazy_vacuum_index(): there are two messages reporting the phases.
> I've attached the patch that improves the current errcontext setting,
> which can be applied before the patch adding offset number.
>

After your patch, I see output like below with log_min_messages=info,

2020-08-20 10:11:46.769 IST [2640] INFO:  scanned index "idx_test_c1"
to remove 10000 row versions
2020-08-20 10:11:46.769 IST [2640] DETAIL:  CPU: user: 0.06 s, system:
0.01 s, elapsed: 0.06 s
2020-08-20 10:11:46.769 IST [2640] CONTEXT:  while vacuuming index
"idx_test_c1" of relation "public.test_vac"

2020-08-20 10:11:46.901 IST [2640] INFO:  scanned index "idx_test_c2"
to remove 10000 row versions
2020-08-20 10:11:46.901 IST [2640] DETAIL:  CPU: user: 0.10 s, system:
0.01 s, elapsed: 0.13 s
2020-08-20 10:11:46.901 IST [2640] CONTEXT:  while vacuuming index
"idx_test_c2" of relation "public.test_vac"

2020-08-20 10:11:46.917 IST [2640] INFO:  "test_vac": removed 10000
row versions in 667 pages
2020-08-20 10:11:46.917 IST [2640] DETAIL:  CPU: user: 0.01 s, system:
0.00 s, elapsed: 0.01 s

2020-08-20 10:11:46.928 IST [2640] INFO:  index "idx_test_c1" now
contains 50000 row versions in 276 pages
2020-08-20 10:11:46.928 IST [2640] DETAIL:  10000 index row versions
were removed.
        136 index pages have been deleted, 109 are currently reusable.
        CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
2020-08-20 10:11:46.928 IST [2640] CONTEXT:  while cleaning up index
"idx_test_c1" of relation "public.test_vac"

Here, we can notice that for the index, we are getting context
information but not for the heap. The reason is that in
vacuum_error_callback, we are not printing additional information for
phases VACUUM_ERRCB_PHASE_SCAN_HEAP and VACUUM_ERRCB_PHASE_VACUUM_HEAP
when block number is invalid. If we want to cover the 'info' messages
then won't it be better if we print a message in those phases even
block number is invalid (something like 'while scanning relation
\"%s.%s\"")

-- 
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: David Rowley
Date:
Subject: Re: [PG13] Planning (time + buffers) data structure in explain plan (format text)
Next
From: Michael Paquier
Date:
Subject: Re: "ccold" left by reindex concurrently are droppable?