Re: crash with assertions and WAL_DEBUG - Mailing list pgsql-hackers

From Alvaro Herrera
Subject Re: crash with assertions and WAL_DEBUG
Date
Msg-id 20140624144722.GH5032@eldon.alvh.no-ip.org
Whole thread Raw
In response to Re: crash with assertions and WAL_DEBUG  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: crash with assertions and WAL_DEBUG  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Heikki Linnakangas wrote:
> On 06/21/2014 01:58 PM, Heikki Linnakangas wrote:
> >It's a bit difficult to attach the mark to the palloc calls, as neither
> >the WAL_DEBUG or LWLOCK_STATS code is calling palloc directly, but
> >marking specific MemoryContexts as sanctioned ought to work. I'll take a
> >stab at that.
> 
> I came up with the attached patch. It adds a function called
> MemoryContextAllowInCriticalSection(), which can be used to exempt
> specific memory contexts from the assertion. The following contexts
> are exempted:

There is a typo in the comment to that function, "This functions can be
used", s/functions/function/

Andres Freund wrote:

> > @@ -1258,6 +1259,25 @@ begin:;
> >      if (XLOG_DEBUG)
> >      {
> >          StringInfoData buf;
> > +        static MemoryContext walDebugCxt = NULL;
> > +        MemoryContext oldCxt;
> > +
> > +        /*
> > +         * Allocations within a critical section are normally not allowed,
> > +         * because allocation failure would lead to a PANIC. But this is just
> > +         * debugging code that no-one is going to enable in production, so we
> > +         * don't care. Use a memory context that's exempt from the rule.
> > +         */
> > +        if (walDebugCxt == NULL)
> > +        {
> > +            walDebugCxt = AllocSetContextCreate(TopMemoryContext,
> > +                                                "WAL Debug",
> > +                                                ALLOCSET_DEFAULT_MINSIZE,
> > +                                                ALLOCSET_DEFAULT_INITSIZE,
> > +                                                ALLOCSET_DEFAULT_MAXSIZE);
> > +            MemoryContextAllowInCriticalSection(walDebugCxt, true);
> > +        }
> > +        oldCxt = MemoryContextSwitchTo(walDebugCxt);
> 
> This will only work though if the first XLogInsert() isn't called from a
> critical section. I'm not sure it's a good idea to rely on that.

Ah, true -- AllocSetContextCreate cannot be called from within a
critical section.

> > diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
> > index 3c1c81a..4264373 100644
> > --- a/src/backend/storage/smgr/md.c
> > +++ b/src/backend/storage/smgr/md.c
> > @@ -219,6 +219,16 @@ mdinit(void)
> >                                        &hash_ctl,
> >                                     HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
> >          pendingUnlinks = NIL;
> > +
> > +        /*
> > +         * XXX: The checkpointer needs to add entries to the pending ops
> > +         * table when absorbing fsync requests. That is done within a critical
> > +         * section. It means that there's a theoretical possibility that you
> > +         * run out of memory while absorbing fsync requests, which leads to
> > +         * a PANIC. Fortunately the hash table is small so that's unlikely to
> > +         * happen in practice.
> > +         */
> > +        MemoryContextAllowInCriticalSection(MdCxt, true);
> >      }
> >  }
> 
> Isn't that allowing a bit too much? We e.g. shouldn't allow
> _fdvec_alloc() within a crritical section. Might make sense to create a
> child context for it.

I agree.

Rahila Syed wrote:

> The patch on compilation gives following error,
> 
> mcxt.c: In function ‘MemoryContextAllowInCriticalSection’:
> mcxt.c:322: error: ‘struct MemoryContextData’ has no member named
> ‘allowInCriticalSection’
> 
> The member in MemoryContextData is defined as 'allowInCritSection' while
> the MemoryContextAllowInCriticalSection accesses the field as
> 'context->allowInCriticalSection'.

It appears Heikki did a search'n replace for "->allowInCritSection"
before submitting, which failed to match the struct declaration.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



pgsql-hackers by date:

Previous
From: David G Johnston
Date:
Subject: Re: idle_in_transaction_timeout
Next
From: Vik Fearing
Date:
Subject: Re: idle_in_transaction_timeout