Thread: Well, we seem to be proof against cache-inval problems now
I just finished running the parallel regress tests with inval.c rigged
to flush the relcache and syscache at every available opportunity,
that is anytime we could recognize a shared-cache-inval message from
another backend (see diff below).  This setup gives a whole new universe
of meaning to the word "slow" --- it took *three full days* to run the
standard "make check" procedure, including eighteen hours just to do the
"vacuum template1" part of initdb.  I kid you not.  But it worked.
Looks like the unexpected-cache-entry-drop class of problems are indeed
gone.

BTW, the reason the diff is rigged not to allow recursive cache flush
is not that it wouldn't work, it's that I didn't expect to live long
enough to finish such a test.  I didn't originally have that restriction
in there (and indeed found a bug that way: relcache flush could go into
an infinite loop if hit with another SI inval before it'd finished the
initial flush).  After fixing that bug, initdb was making steady
progress, but not at a rate that I wanted to wait out...

			regards, tom lane

*** src/backend/utils/cache/inval.c.orig	Wed Nov 15 23:57:44 2000
--- src/backend/utils/cache/inval.c	Mon Jan  1 17:27:53 2001
***************
*** 643,649 ****
--- 643,661 ----
  	elog(DEBUG, "DiscardInvalid called");
  #endif	 /* defined(INVALIDDEBUG) */
  
+ #if 1
+ 	/* DEBUG CHECK ONLY ... force cache reset at any opportunity */
+ 	static bool inReset = false;
+ 
+ 	if (! IsBootstrapProcessingMode() && !inReset)
+ 	{
+ 		inReset = true;
+ 		ResetSystemCaches();
+ 		inReset = false;
+ 	}
+ #else
  	InvalidateSharedInvalid(CacheIdInvalidate, ResetSystemCaches);
+ #endif
  }
  
  /*

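The recursion guard in the patch above can be illustrated with a small standalone sketch. This is not PostgreSQL code: discard_invalid(), reset_caches(), and the two counters are hypothetical stand-ins, and the "pending messages" argument merely simulates another SI inval being noticed while a flush is still in progress. It only shows how a static flag of the same shape as inReset turns a nested flush request into a no-op.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counters, for illustration only; not part of PostgreSQL. */
static int flushes_started = 0;
static int nested_requests_skipped = 0;

static void discard_invalid(int pending_messages);

/*
 * Hypothetical stand-in for ResetSystemCaches().  If a message is still
 * pending, we pretend it is noticed in mid-flush, which re-enters the
 * entry point, just as a second SI inval could during a real flush.
 */
static void
reset_caches(int pending_messages)
{
	assert(pending_messages >= 0);
	flushes_started++;
	if (pending_messages > 0)
		discard_invalid(pending_messages - 1);
}

/* Entry point with the same guard shape as the inReset flag in the diff. */
static void
discard_invalid(int pending_messages)
{
	static bool in_reset = false;

	if (in_reset)
	{
		/* Already flushing: skip the nested request instead of recursing. */
		nested_requests_skipped++;
		return;
	}
	in_reset = true;
	reset_caches(pending_messages);
	in_reset = false;
}
```

Calling discard_invalid(3) in this sketch starts one flush and skips one nested request; without the flag, each pending message would trigger another full flush from inside the previous one.
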
Tom Lane wrote:
>
> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.
>

Great.
Thanks.

Hiroshi Inoue

On Fri, 5 Jan 2001, Tom Lane wrote:

> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.

Tom, I'm not sure how (or whether) this relates to "alter table" happening
when someone else is doing a SELECT from the table. Are you saying that it
should work without any locking, or am I completely off base?

-alex

Alex Pilosov <alex@pilosoft.com> writes:
> Tom, I'm not sure how (or whether) this relates to "alter table" happening
> when someone else is doing a SELECT from table.

The ALTER will wait for the SELECT to finish.  That's not related to the
internal cache problem that I was worried about.

			regards, tom lane

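As a rough illustration of why the ALTER waits, here is a deliberately simplified sketch of a lock conflict check. It assumes only two table lock modes, named after PostgreSQL's AccessShareLock (the mode a plain SELECT takes) and AccessExclusiveLock (the mode ALTER TABLE takes). The enum, the table, and must_wait() are hypothetical and are not the backend's lock manager, which has more modes and a larger conflict table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-mode subset of the table lock modes. */
typedef enum
{
	ACCESS_SHARE = 0,			/* taken by a plain SELECT */
	ACCESS_EXCLUSIVE = 1		/* taken by ALTER TABLE */
} LockModeSketch;

/* conflict_table[held][requested] is true when the requester must wait. */
static const bool conflict_table[2][2] = {
	/* held: ACCESS_SHARE     */ {false, true},
	/* held: ACCESS_EXCLUSIVE */ {true, true},
};

static bool
must_wait(LockModeSketch held, LockModeSketch requested)
{
	assert(held <= ACCESS_EXCLUSIVE && requested <= ACCESS_EXCLUSIVE);
	return conflict_table[held][requested];
}
```

In this sketch, must_wait(ACCESS_SHARE, ACCESS_EXCLUSIVE) is true, which corresponds to the ALTER waiting behind a running SELECT, while two concurrent SELECTs do not block each other.
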
Can this now be marked as done?

	* Modification of pg_class can happen while table in use by another
	  backend.  Might lead to MVCC inside of syscache

> I just finished running the parallel regress tests with inval.c rigged
> to flush the relcache and syscache at every available opportunity,
> that is anytime we could recognize a shared-cache-inval message from
> another backend (see diff below).  This setup gives a whole new universe
> of meaning to the word "slow" --- it took *three full days* to run the
> standard "make check" procedure, including eighteen hours just to do the
> "vacuum template1" part of initdb.  I kid you not.  But it worked.
> Looks like the unexpected-cache-entry-drop class of problems are indeed
> gone.
>
> BTW, the reason the diff is rigged not to allow recursive cache flush
> is not that it wouldn't work, it's that I didn't expect to live long
> enough to finish such a test.  I didn't originally have that restriction
> in there (and indeed found a bug that way: relcache flush could go into
> an infinite loop if hit with another SI inval before it'd finished the
> initial flush).  After fixing that bug, initdb was making steady
> progress, but not at a rate that I wanted to wait out...
>
> 			regards, tom lane
>
> *** src/backend/utils/cache/inval.c.orig	Wed Nov 15 23:57:44 2000
> --- src/backend/utils/cache/inval.c	Mon Jan  1 17:27:53 2001
> ***************
> *** 643,649 ****
> --- 643,661 ----
>   	elog(DEBUG, "DiscardInvalid called");
>   #endif	 /* defined(INVALIDDEBUG) */
>   
> + #if 1
> + 	/* DEBUG CHECK ONLY ... force cache reset at any opportunity */
> + 	static bool inReset = false;
> + 
> + 	if (! IsBootstrapProcessingMode() && !inReset)
> + 	{
> + 		inReset = true;
> + 		ResetSystemCaches();
> + 		inReset = false;
> + 	}
> + #else
>   	InvalidateSharedInvalid(CacheIdInvalidate, ResetSystemCaches);
> + #endif
>   }
>   
>   /*
>

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Can this now be marked as done?
> 	* Modification of pg_class can happen while table in use by another
> 	  backend.  Might lead to MVCC inside of syscache

I'm not sure.  Do you have any record of what the concern was, in
detail?  I don't understand what the TODO item is trying to say.

			regards, tom lane

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Can this now be marked as done?
> > 	* Modification of pg_class can happen while table in use by another
> > 	  backend.  Might lead to MVCC inside of syscache
>
> I'm not sure.  Do you have any record of what the concern was, in
> detail?  I don't understand what the TODO item is trying to say.

I assumed it was the problem of table lookups with no locking.  No idea
what the MVCC mention is about.

-- 
  Bruce Momjian
  pgman@candle.pha.pa.us

Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>>>> Can this now be marked as done?
>>>> * Modification of pg_class can happen while table in use by another
>>>> backend.  Might lead to MVCC inside of syscache
>>
>> I'm not sure.  Do you have any record of what the concern was, in
>> detail?  I don't understand what the TODO item is trying to say.

> I assumed it was the problem of table lookups with no locking.  No idea
> what the MVCC mention is about.

I checked the CVS archives and found that you added that TODO item on
4-Feb-2000.  I could not, however, find any relevant discussion in the
pghackers archives in the first few days of February.  Do you have
anything archived that might help narrow it down?

			regards, tom lane

No?  :-)

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> Can this now be marked as done?
> >>>> * Modification of pg_class can happen while table in use by another
> >>>> backend.  Might lead to MVCC inside of syscache
> >>
> >> I'm not sure.  Do you have any record of what the concern was, in
> >> detail?  I don't understand what the TODO item is trying to say.
>
> > I assumed it was the problem of table lookups with no locking.  No idea
> > what the MVCC mention is about.
>
> I checked the CVS archives and found that you added that TODO item on
> 4-Feb-2000.  I could not, however, find any relevant discussion in the
> pghackers archives in the first few days of February.  Do you have
> anything archived that might help narrow it down?
>
> 			regards, tom lane
>

-- 
  Bruce Momjian
  pgman@candle.pha.pa.us

I barely understand the items sometimes.

> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> >>>> Can this now be marked as done?
> >>>> * Modification of pg_class can happen while table in use by another
> >>>> backend.  Might lead to MVCC inside of syscache
> >>
> >> I'm not sure.  Do you have any record of what the concern was, in
> >> detail?  I don't understand what the TODO item is trying to say.
>
> > I assumed it was the problem of table lookups with no locking.  No idea
> > what the MVCC mention is about.
>
> I checked the CVS archives and found that you added that TODO item on
> 4-Feb-2000.  I could not, however, find any relevant discussion in the
> pghackers archives in the first few days of February.  Do you have
> anything archived that might help narrow it down?
>
> 			regards, tom lane
>

-- 
  Bruce Momjian
  pgman@candle.pha.pa.us