Re: error context for vacuum to include block number - Mailing list pgsql-hackers

From Justin Pryzby
Subject Re: error context for vacuum to include block number
Date
Msg-id 20200319202931.GT26184@telsasoft.com
Whole thread Raw
In response to Re: error context for vacuum to include block number  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: error context for vacuum to include block number  (Justin Pryzby <pryzby@telsasoft.com>)
List pgsql-hackers
On Thu, Mar 19, 2020 at 03:18:32PM +0530, Amit Kapila wrote:
> > You're right.  PHASE_SCAN_HEAP was set, but only inside a conditional.
> 
> I think if we do it inside for loop, then we don't need to set it
> conditionally at multiple places.  I have changed like that in the
> attached patch, see if that makes sense to you.

Yes, makes sense, and it's right near pgstat_progress_update_param, which is
nice.

> > Both those issues are due to a change in the most recent patch.  In the
> > previous patch, the PHASE_VACUUM_HEAP was set only by lazy_vacuum_heap(), and I
> > moved it recently to vacuum_page.  But it needs to be copied, as you point out.
> >
> > That's unfortunate due to a lack of symmetry: lazy_vacuum_page does its own
> > progress update, which suggests to me that it should also set its own error
> > callback.  It'd be nicer if EITHER the calling functions did that (scan_heap()
> > and vacuum_heap()) or if it was sufficient for the called function
> > (vacuum_page()) to do it.
> 
> Right, but adding in callers will spread at multiple places.
> 
> I have made a few additional changes in the attached. (a) Removed
> VACUUM_ERRCB_PHASE_VACUUM_FSM as I think we have to add it at many
> places, you seem to have added for FreeSpaceMapVacuumRange() but not
> for RecordPageWithFreeSpace(), (b) Reset the phase to
> VACUUM_ERRCB_PHASE_UNKNOWN after finishing the work for a particular
> phase, so that the new phase shouldn't continue in the callers.
> 
> I have another idea to make (b) better.  How about if a call to
> update_vacuum_error_cbarg returns information of old phase (blkno,
> phase, and indname) along with what it is doing now and then once the
> work for the current phase is over it can reset it back with old phase
> information?   This way the callee after finishing the new phase work
> would be able to reset back to the old phase.  This will work
> something similar to our MemoryContextSwitchTo.

I was going to suggest that we could do that by passing in a pointer to a local
variable "LVRelStats olderrcbarg", like:
|        update_vacuum_error_cbarg(vacrelstats, VACUUM_ERRCB_PHASE_SCAN_HEAP,
|                                  blkno, NULL, &olderrcbarg);

and then later call:
|update_vacuum_error_cbarg(vacrelstats, olderrcbarg.phase,
|                                       olderrcbarg.blkno,
|                                       olderrcbarg.indname,
|                                       NULL);

I implemented it in a separate patch, but it may be a bad idea, due to freeing
indname.  To exercise it, I tried to cause a crash by changing "else if
(errcbarg->indname)" to "if" without else, but wasn't able to cause a crash,
probably just due to having a narrow timing window.

As written, we only pfree indname if we do actually "reset" the cbarg, which is
in the two routines handling indexes.  It's probably a good idea to pass the
indname rather than the relation in any case.

I rebased the rest of my patches on top of yours.

-- 
Justin

Attachment

pgsql-hackers by date:

Previous
From: Chaitanya bodlapati
Date:
Subject: invalid byte sequence for encoding "UTF8": 0x95-while using PGPEncryption -PostgreSQL
Next
From: Chaitanya bodlapati
Date:
Subject: Fwd: invalid byte sequence for encoding "UTF8": 0x95-while using PGPEncryption -PostgreSQL