Re: 8.3.0 Core with concurrent vacuum fulls - Mailing list pgsql-hackers

From Pavan Deolasee
Subject Re: 8.3.0 Core with concurrent vacuum fulls
Date
Msg-id 2e78013d0803050211t73fae3f2j1da5760e55c0173b@mail.gmail.com
Whole thread Raw
In response to Re: 8.3.0 Core with concurrent vacuum fulls  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: 8.3.0 Core with concurrent vacuum fulls
Re: 8.3.0 Core with concurrent vacuum fulls
List pgsql-hackers
On Wed, Mar 5, 2008 at 8:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Gavin M. Roy" <gmr@myyearbook.com> writes:
>  > (gdb) where
>  > #0  0x0000003fe362e21d in raise () from /lib64/tls/libc.so.6
>  > #1  0x0000003fe362fa1e in abort () from /lib64/tls/libc.so.6
>  > #2  0x000000000063a2e3 in errfinish ()
>  > #3  0x00000000005974c4 in DeadLockReport ()
>  > #4  0x000000000059381f in LockAcquire ()
>  > #5  0x0000000000592357 in LockRelationOid ()
>  > #6  0x0000000000457255 in relation_open ()
>  > #7  0x00000000004574c3 in heap_open ()
>  > #8  0x000000000062cf41 in CatalogCacheInitializeCache ()
>  > #9  0x000000000062dfad in PrepareToInvalidateCacheTuple ()
>  > #10 0x000000000062e8c5 in CacheInvalidateHeapTuple ()
>  > #11 0x000000000045c803 in heap_page_prune ()
>  > #12 0x00000000005086cd in vacuum_rel ()
>  > #13 0x00000000005096bb in vacuum ()
>  > #14 0x00000000005a163b in PortalRunUtility ()
>  > #15 0x00000000005a1714 in PortalRunMulti ()
>  > #16 0x00000000005a1d30 in PortalRun ()
>  > #17 0x000000000059f4b6 in PostgresMain ()
>  > #18 0x00000000005760c0 in ServerLoop ()
>  > #19 0x0000000000577770 in PostmasterMain ()
>  > #20 0x000000000052fd3e in main ()
>
>  So what did DeadLockReport put in the server log before panic'ing?
>
>  I'm wondering exactly why CatalogCacheInitializeCache is being called
>  here --- seems like that should have been done long before we got to
>  VACUUM.  But it would be useful to know just what deadlock it saw.
>

Seems like its possible that the second phase of catalog cache initialization
is not done when VACUUM FULL is called, especially if VACUUM FULL is
the first command/query in the backend.

InitCatalogCachePhase2() is called only if new cache file needs to be
written.

   /*    * Lastly, write out a new relcache cache file if one is needed.    */   if (needNewCacheFile)   {       /*
  * Force all the catcaches to finish initializing and thereby open the        * catalogs and indexes they use.  This
willpreload the relcache with        * entries for all the most important system catalogs and indexes, so        * that
theinit file will be most useful for future backends.        */       InitCatalogCachePhase2();
 
       /* now write the file */       write_relcache_init_file();   }


We haven't yet seen the deadlock message, but here is my theory of a possible
deadlock scenario:

Two backends try to vacuum full two different catalog tables. Each acquires an
exclusive lock on the respective catalog relation. Then each try to
initialize its
own catalog cache. But to do that they need AccessShareLock on each other's
table leading to a deadlock.

Why not just unconditionally finish the phase 2 as part of
InitPostgres ? I understand
that we may end up initializing caches that we don't need in that
session, but there
might be other places where this deadlock is waiting to happen.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB     http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: "Heikki Linnakangas"
Date:
Subject: Re: "could not open relation 1663/16384/16584: No such file or directory" in a specific combination of transactions with temp tables
Next
From: "Pavan Deolasee"
Date:
Subject: Re: 8.3.0 Core with concurrent vacuum fulls