Thread: Well, we seem to be proof against cache-inval problems now

From
Tom Lane
Date:
I just finished running the parallel regress tests with inval.c rigged
to flush the relcache and syscache at every available opportunity,
that is anytime we could recognize a shared-cache-inval message from
another backend (see diff below).  This setup gives a whole new universe
of meaning to the word "slow" --- it took *three full days* to run the
standard "make check" procedure, including eighteen hours just to do the
"vacuum template1" part of initdb.  I kid you not.  But it worked.
Looks like the unexpected-cache-entry-drop class of problems are indeed
gone.

BTW, the reason the diff is rigged not to allow recursive cache flush
is not that it wouldn't work, it's that I didn't expect to live long
enough to finish such a test.  I didn't originally have that restriction
in there (and indeed found a bug that way: relcache flush could go into
an infinite loop if hit with another SI inval before it'd finished the
initial flush).  After fixing that bug, initdb was making steady
progress, but not at a rate that I wanted to wait out...
        regards, tom lane

*** src/backend/utils/cache/inval.c.orig    Wed Nov 15 23:57:44 2000
--- src/backend/utils/cache/inval.c    Mon Jan  1 17:27:53 2001
***************
*** 643,649 ****
--- 643,661 ----
      elog(DEBUG, "DiscardInvalid called");
  #endif     /* defined(INVALIDDEBUG) */
  
+ #if 1
+     /* DEBUG CHECK ONLY ... force cache reset at any opportunity */
+     static bool inReset = false;
+ 
+     if (! IsBootstrapProcessingMode() && !inReset)
+     {
+         inReset = true;
+         ResetSystemCaches();
+         inReset  = false;
+     }
+ #else
      InvalidateSharedInvalid(CacheIdInvalidate, ResetSystemCaches);
+ #endif
  }
  
  /*
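
The inReset guard in the diff above can be shown in isolation. What follows is only a minimal standalone sketch, not PostgreSQL code: reset_caches(), process_invalidations(), run_demo() and the three counters are hypothetical stand-ins, and the "nested" invalidation is simulated with a plain counter rather than real SI messages. It assumes, as the patch does, that a request arriving while a reset is already running is simply skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping; none of these names exist in PostgreSQL. */
int reset_count = 0;     /* full cache resets actually performed */
int skipped_count = 0;   /* nested requests rejected by the guard */
int pending_nested = 0;  /* simulated inval messages arriving mid-reset */

void process_invalidations(void);

/*
 * Stand-in for ResetSystemCaches(): while the reset runs, it may reach
 * another point where invalidation messages get processed.
 */
void reset_caches(void)
{
    reset_count++;
    while (pending_nested > 0)
    {
        pending_nested--;
        process_invalidations();    /* re-entry attempt */
    }
}

/*
 * Stand-in for the patched routine: force a full reset at every
 * opportunity, but never start one while another is in progress.
 */
void process_invalidations(void)
{
    static bool in_reset = false;

    if (in_reset)
    {
        skipped_count++;            /* guard: no recursive flush */
        return;
    }

    in_reset = true;
    reset_caches();
    in_reset = false;
}

/* Small driver: simulate `nested` messages arriving during one reset. */
int run_demo(int nested)
{
    pending_nested = nested;
    process_invalidations();
    assert(pending_nested == 0);    /* every simulated message was consumed */
    return reset_count;
}
```

Calling run_demo(3) performs one full reset and skips the three nested requests. Without the in_reset check, each nested request would start another full reset from inside the first, which is the recursive behaviour the message describes as too slow to wait out.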


Re: Well, we seem to be proof against cache-inval problems now

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.
> 

Great.
Thanks.

Hiroshi Inoue


Re: Well, we seem to be proof against cache-inval problems now

From
Alex Pilosov
Date:
On Fri, 5 Jan 2001, Tom Lane wrote:

> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.
Tom, I'm not sure how (or whether) this relates to "alter table" happening
when someone else is doing a SELECT from table. Are you saying that it
should work without any locking, or am I completely off base?

-alex





Re: Well, we seem to be proof against cache-inval problems now

From
Tom Lane
Date:
Alex Pilosov <alex@pilosoft.com> writes:
> Tom, I'm not sure how (or whether) this relates to "alter table" happening
> when someone else is doing a SELECT from table.

The ALTER will wait for the SELECT to finish.  That's not related to the
internal cache problem that I was worried about.
        regards, tom lane


Re: Well, we seem to be proof against cache-inval problems now

From
Bruce Momjian
Date:
Can this now be marked as done?

* Modification of pg_class can happen while table in use by another
backend. Might lead to MVCC inside of syscache

> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.
> 
> BTW, the reason the diff is rigged not to allow recursive cache flush
> is not that it wouldn't work, it's that I didn't expect to live long
> enough to finish such a test.  I didn't originally have that restriction
> in there (and indeed found a bug that way: relcache flush could go into
> an infinite loop if hit with another SI inval before it'd finished the
> initial flush).  After fixing that bug, initdb was making steady
> progress, but not at a rate that I wanted to wait out...
> 
>             regards, tom lane
> 
> *** src/backend/utils/cache/inval.c.orig    Wed Nov 15 23:57:44 2000
> --- src/backend/utils/cache/inval.c    Mon Jan  1 17:27:53 2001
> ***************
> *** 643,649 ****
> --- 643,661 ----
>       elog(DEBUG, "DiscardInvalid called");
>   #endif     /* defined(INVALIDDEBUG) */
>   
> + #if 1
> +     /* DEBUG CHECK ONLY ... force cache reset at any opportunity */
> +     static bool inReset = false;
> + 
> +     if (! IsBootstrapProcessingMode() && !inReset)
> +     {
> +         inReset = true;
> +         ResetSystemCaches();
> +         inReset  = false;
> +     }
> + #else
>       InvalidateSharedInvalid(CacheIdInvalidate, ResetSystemCaches);
> + #endif
>   }
>   
>   /*
> 


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
 


Re: Well, we seem to be proof against cache-inval problems now

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Can this now be marked as done?
> * Modification of pg_class can happen while table in use by another
> backend. Might lead to MVCC inside of syscache

I'm not sure.  Do you have any record of what the concern was, in
detail?  I don't understand what the TODO item is trying to say.
        regards, tom lane


Re: Well, we seem to be proof against cache-inval problems now

From
Bruce Momjian
Date:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Can this now be marked as done?
> > * Modification of pg_class can happen while table in use by another
> > backend. Might lead to MVCC inside of syscache
> 
> I'm not sure.  Do you have any record of what the concern was, in
> detail?  I don't understand what the TODO item is trying to say.

I assumed it was the problem of table lookups with no locking.  No idea
what the MVCC mention is about.

 


Re: Well, we seem to be proof against cache-inval problems now

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>>> Can this now be marked as done?
>>>> * Modification of pg_class can happen while table in use by another
>>>> backend. Might lead to MVCC inside of syscache
>> 
>> I'm not sure.  Do you have any record of what the concern was, in
>> detail?  I don't understand what the TODO item is trying to say.

> I assumed it was the problem of table lookups with no locking.  No idea
> what the MVCC mention is about.

I checked the CVS archives and found that you added that TODO item on
4-Feb-2000.  I could not, however, find any relevant discussion in the
pghackers archives in the first few days of February.  Do you have
anything archived that might help narrow it down?
        regards, tom lane


Re: Well, we seem to be proof against cache-inval problems now

From
Bruce Momjian
Date:
No?  :-)

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> Can this now be marked as done?
> >>>> * Modification of pg_class can happen while table in use by another
> >>>> backend. Might lead to MVCC inside of syscache
> >> 
> >> I'm not sure.  Do you have any record of what the concern was, in
> >> detail?  I don't understand what the TODO item is trying to say.
> 
> > I assumed it was the problem of table lookups with no locking.  No idea
> > what the MVCC mention is about.
> 
> I checked the CVS archives and found that you added that TODO item on
> 4-Feb-2000.  I could not, however, find any relevant discussion in the
> pghackers archives in the first few days of February.  Do you have
> anything archived that might help narrow it down?
> 
>             regards, tom lane
> 


 


Re: Well, we seem to be proof against cache-inval problems now

From
Bruce Momjian
Date:
I barely understand the items sometimes.

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> Can this now be marked as done?
> >>>> * Modification of pg_class can happen while table in use by another
> >>>> backend. Might lead to MVCC inside of syscache
> >> 
> >> I'm not sure.  Do you have any record of what the concern was, in
> >> detail?  I don't understand what the TODO item is trying to say.
> 
> > I assumed it was the problem of table lookups with no locking.  No idea
> > what the MVCC mention is about.
> 
> I checked the CVS archives and found that you added that TODO item on
> 4-Feb-2000.  I could not, however, find any relevant discussion in the
> pghackers archives in the first few days of February.  Do you have
> anything archived that might help narrow it down?
> 
>             regards, tom lane
> 

