Thread: Checkpoint start logging is done inside critical section
The LogCheckpointStart() call inside CreateCheckPoint() is done while inside a critical section. The elog call could trigger errors due to memory allocations or from a logging hook, resulting in a panic. It seems better to postpone the logging until after the critical section is done. It's only a few lwlock acquisitions away and shouldn't make any material difference. Patch to do so is attached. Regards, Ants Aasma
Attachment
On Thu, Oct 18, 2018 at 4:44 AM Ants Aasma <ants.aasma@eesti.ee> wrote: > > The LogCheckpointStart() call inside CreateCheckPoint() is done while > inside a critical section. The elog call could trigger errors due to > memory allocations or from a logging hook, resulting in a panic. > Yeah, but we use logging hook inside LWLockAcquire as well which is also called inside critical section in the nearby code, not sure if we can do anything about it. > It > seems better to postpone the logging until after the critical section > is done. It's only a few lwlock acquisitions away and shouldn't make > any material difference. Patch to do so is attached. > +1. I think we should backpatch this as well. I can take it forward unless people have objections to it. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hi, On 2018-10-18 10:21:39 +0530, Amit Kapila wrote: > On Thu, Oct 18, 2018 at 4:44 AM Ants Aasma <ants.aasma@eesti.ee> wrote: > > > > The LogCheckpointStart() call inside CreateCheckPoint() is done while > > inside a critical section. The elog call could trigger errors due to > > memory allocations or from a logging hook, resulting in a panic. > > > > Yeah, but we use logging hook inside LWLockAcquire as well which is > also called inside critical section in the nearby code, not sure if we > can do anything about it. > > It > > seems better to postpone the logging until after the critical section > > is done. It's only a few lwlock acquisitions away and shouldn't make > > any material difference. Patch to do so is attached. > > > > +1. I think we should backpatch this as well. I can take it forward > unless people have objections to it. We do a fair bit of allocations inside a critical section during a checkpoint (that's why we mark the ctx as being ok with that). I've a very hard time seing this small allocation be a problem. I definitely don't think there's a need to backpatch that. Greetings, Andres Freund
On Thu, Oct 18, 2018 at 10:27 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2018-10-18 10:21:39 +0530, Amit Kapila wrote: > > On Thu, Oct 18, 2018 at 4:44 AM Ants Aasma <ants.aasma@eesti.ee> wrote: > > > > > > The LogCheckpointStart() call inside CreateCheckPoint() is done while > > > inside a critical section. The elog call could trigger errors due to > > > memory allocations or from a logging hook, resulting in a panic. > > > > > > > Yeah, but we use logging hook inside LWLockAcquire as well which is > > also called inside critical section in the nearby code, not sure if we > > can do anything about it. > > > > It > > > seems better to postpone the logging until after the critical section > > > is done. It's only a few lwlock acquisitions away and shouldn't make > > > any material difference. Patch to do so is attached. > > > > > > > +1. I think we should backpatch this as well. I can take it forward > > unless people have objections to it. > > We do a fair bit of allocations inside a critical section during a > checkpoint > The other possibility of allocation is during LocalSetXLogInsertAllowed(), but we make provision for that before starting critical section. Which other cases do you see where allocation is possible? I think in general, we take care that we don't allocate inside a critical section, otherwise, it will lead to at the very least assertion failure unless we have marked the memory context as "allowed in crit section". > (that's why we mark the ctx as being ok with that). > Yeah, as the palloc for log message would be called in an ErrorContext where it is safe to do the allocation, so ideally this shouldn't be a problem. So, it seems to me that this is not a problem, Ants, do you see any problem in any particular scenario or was this based on theoretical analysis? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Thu, Oct 18, 2018 at 9:02 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Oct 18, 2018 at 10:27 AM Andres Freund <andres@anarazel.de> wrote: > > (that's why we mark the ctx as being ok with that). > > > > Yeah, as the palloc for log message would be called in an ErrorContext > where it is safe to do the allocation, so ideally this shouldn't be a > problem. So, it seems to me that this is not a problem, Ants, do you > see any problem in any particular scenario or was this based on > theoretical analysis? This was purely theoretical, as also evidenced by lack of complaints even though the code has been like that for a very long time. I was actually mostly worried about extension code run by logging hook causing the panic. Regards, Ants Aasma