Re: [HACKERS] Backend crashes - what's going on here??? - Mailing list pgsql-hackers

From jwieck@debis.com (Jan Wieck)
Subject Re: [HACKERS] Backend crashes - what's going on here???
Msg-id m0y6weM-000BFRC@orion.SAPserv.Hamburg.dsh.de
In response to Re: [HACKERS] Backend crashes - what's going on here???  (jwieck@debis.com (Jan Wieck))
List pgsql-hackers
Uhhh - much uglier than I first thought :-(

I wrote:
>
>
> Whow - gdb is a nice tool
>
> >
> > >
> > > Hey,
> > >
> > >     the current snapshot dumps core on the 4th time doing
> > >
> > >     REVOKE ALL ON pg_user FROM public;
> > >
> > >     It  does  too in other situations but this is the simplest to
> > >     reproduce. The segmentation fault happens in nocachegetattr()
> > >     due  to  a  destroyed  tuple descriptor (natts = 0!!! and the
> > >     others don't look good either) for the syscache 21 (USENAME).
> > >     But the destruction must happen somewhere else.
> > >
> > >     With  the  02/13  snapshot  I haven't got any problems on it.
> > >     But cannot find the error with diff.
> > >
> > >     BTW: Doing last checks on view permissions - sending a  patch
> > >     soon.
> >
> > Yep, I saw this too when testing my password acl null patch.  Couldn't
> > reproduce it, so I thought it was a fluke.
> >
> > --
> > Bruce Momjian
> > maillist@candle.pha.pa.us
> >
>
>     Have  a  clue  now  what  causes  the  crash. It happens when
>     pg_user is looked up in the syscache. It must have to do with
>     the   fact   that  during  initialization  in  miscinit.c  on
>     SetUserId()    the    user    tuple    is    fetched    using
>     SearchSysCacheTuple().   Due  to  this  the SysCache entry 21
>     gets initialized but later on start transaction  through  the
>     cache  reset  the  memory  for the cc_tupdesc in the cache is
>     freed. So I assume when SetUserId() is called,  the  syscache
>     is not ready for use yet.
>
>     I  don't  have a solution right now. Is someone more familiar
>     with  the  handling  of  the  syscache  during  startup?   Is
>     SetUserId() just called a little too early or is the syscache
>     unusable during InitPostgres at all?
>
>     But the fact  that  CatalogCacheInitializeCache()  is  called
>     only  for  pg_user during startup makes me feel sure that the
>     lookup of the user using SearchSysCacheTuple()  is  wrong  at
>     this  time.  I  think  it  should  be  done  without using the
>     syscache.
>
>     Back on monday - maybe with a solution.

    The  crash  is  due  to the cache invalidations on updates to
    pg_class (and can happen too on updates to  pg_attribute  and
    others).

    When a tuple in pg_class or one of the others is modified, its
    cache invalidation causes a RelationFlushRelation() for the
    affected relation. Revoking from pg_user, for example, means
    that RelationFlushRelation() is called for pg_user, and this
    frees the tuple descriptor. The same tuple descriptor is also
    used in the SysCache, which isn't flushed or reset, so the
    SysCache is left pointing at freed memory!
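
    A minimal sketch of this hazard in C, using toy types only and
    not the real relcache/syscache code: ToyTupleDesc, ToyRelation,
    ToyCatCache and all toy_* functions are hypothetical names. Two
    owners share one descriptor, and only one of them is told about
    the flush. Instead of really calling free(), the toy allocator
    marks the slot unused and zeroes natts (an assumption made so the
    stale pointer can be inspected without undefined behaviour):

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a tuple descriptor (hypothetical, not PostgreSQL's). */
typedef struct ToyTupleDesc {
    int in_use;   /* 1 while the slot is allocated */
    int natts;    /* number of attributes */
} ToyTupleDesc;

/* Toy relation cache entry: owns the descriptor. */
typedef struct ToyRelation {
    unsigned      relid;
    ToyTupleDesc *rd_att;
} ToyRelation;

/* Toy catalog cache: only borrows the relation's descriptor. */
typedef struct ToyCatCache {
    unsigned      relid;
    ToyTupleDesc *cc_tupdesc;
} ToyCatCache;

#define TOY_POOL_SIZE 4
ToyTupleDesc toy_desc_pool[TOY_POOL_SIZE];

/* Hand out the first unused pool slot, or NULL if the pool is full. */
ToyTupleDesc *toy_desc_alloc(int natts)
{
    for (int i = 0; i < TOY_POOL_SIZE; i++) {
        if (!toy_desc_pool[i].in_use) {
            toy_desc_pool[i].in_use = 1;
            toy_desc_pool[i].natts = natts;
            return &toy_desc_pool[i];
        }
    }
    return NULL;
}

/* "Free" a descriptor: mark the slot unused and wipe its contents. */
void toy_desc_free(ToyTupleDesc *d)
{
    assert(d != NULL);
    d->in_use = 0;
    d->natts = 0;
}

/* Flush the relation entry. Note that no catalog cache is told about it. */
void toy_relation_flush(ToyRelation *rel)
{
    assert(rel->rd_att != NULL);
    toy_desc_free(rel->rd_att);
    rel->rd_att = NULL;
}

/* A cache is stale if it still points at a descriptor that was freed. */
int toy_cache_is_stale(const ToyCatCache *cache)
{
    return cache->cc_tupdesc != NULL && !cache->cc_tupdesc->in_use;
}
```

    In this toy model, once toy_relation_flush() has run on a
    relation whose descriptor a cache borrowed, toy_cache_is_stale()
    reports that cache as stale and the natts it sees is 0.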

    There are more possible errors of this kind. A simple

    UPDATE pg_class SET relname = relname;

    makes the backend crash on the very next command. And

    REVOKE ALL ON pg_class FROM public;

    crashes immediately, because the cache invalidation needs the
    heap tuple for pg_class in pg_class that was just invalidated.
    Sounds a bit hairy.

    I think this is also the reason for the backend crashes I had
    when defining rewrite rules on relations that already exist
    (I expect others have noticed them as well).

    I still don't have a solution, but this must get fixed before
    releasing 6.3. I think it would help to walk through the
    SysCache in RelationFlushRelation(), check whether the relation
    is in the SysCache, and reset that cache if it is (except for
    the revoke on pg_class).
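
    The walk suggested here could look roughly like the following C
    sketch. It is a toy model under stated assumptions, not the real
    backend code: ToyTupleDesc, ToyRelation, ToyCatCache,
    toy_syscache and the toy_* functions are hypothetical names, and
    the relation ids used with it are arbitrary. Every cache that
    borrowed the relation's descriptor is reset before the
    descriptor is freed, so no dangling pointer is left behind. It
    does not address the REVOKE on pg_class case.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins (hypothetical, not PostgreSQL's structures). */
typedef struct ToyTupleDesc {
    int natts;
} ToyTupleDesc;

typedef struct ToyRelation {
    unsigned      relid;
    ToyTupleDesc *rd_att;      /* owned by the relation entry */
} ToyRelation;

typedef struct ToyCatCache {
    unsigned      relid;       /* relation this cache is built over */
    ToyTupleDesc *cc_tupdesc;  /* borrowed; NULL means "not initialized" */
} ToyCatCache;

#define TOY_NCACHES 3
ToyCatCache toy_syscache[TOY_NCACHES];

/* Build a relation entry with a freshly allocated descriptor. */
void toy_relation_open(ToyRelation *rel, unsigned relid, int natts)
{
    rel->relid = relid;
    rel->rd_att = malloc(sizeof(ToyTupleDesc));
    assert(rel->rd_att != NULL);
    rel->rd_att->natts = natts;
}

/* Initialize one cache slot by borrowing the relation's descriptor. */
void toy_cache_init(int slot, const ToyRelation *rel)
{
    assert(slot >= 0 && slot < TOY_NCACHES);
    toy_syscache[slot].relid = rel->relid;
    toy_syscache[slot].cc_tupdesc = rel->rd_att;
}

/* Walk all caches and reset those built over relid; return the count. */
int toy_reset_caches_for(unsigned relid)
{
    int nreset = 0;

    for (int i = 0; i < TOY_NCACHES; i++) {
        ToyCatCache *c = &toy_syscache[i];

        if (c->cc_tupdesc != NULL && c->relid == relid) {
            c->cc_tupdesc = NULL;  /* forces re-initialization on next use */
            nreset++;
        }
    }
    return nreset;
}

/* Flush: reset dependent caches first, then free the descriptor. */
int toy_relation_flush(ToyRelation *rel)
{
    int nreset = toy_reset_caches_for(rel->relid);

    free(rel->rd_att);
    rel->rd_att = NULL;
    return nreset;
}
```

    The order inside toy_relation_flush() is the point of the
    sketch: the caches are reset while the descriptor is still
    valid, and only then is the memory released. Caches built over
    other relations are left untouched.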

    Append this to TODO!


Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#======================================== jwieck@debis.com (Jan Wieck) #
