Re: [HACKERS] Another nasty cache problem - Mailing list pgsql-hackers
From | Bruce Momjian |
---|---|
Subject | Re: [HACKERS] Another nasty cache problem |
Date | |
Msg-id | 200001301757.MAA12393@candle.pha.pa.us Whole thread Raw |
In response to | Another nasty cache problem (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Another nasty cache problem
|
List | pgsql-hackers |
> I'm down to the point where the parallel tests mostly work with a small > SI buffer --- but they do still sometimes fail. I've realized that > there is a whole class of bugs along the following lines: > > There are plenty of routines that do two or more SearchSysCacheTuple > calls to get the information they need. As the code stands, it is > unsafe to continue accessing the tuple returned by SearchSysCacheTuple > after making a second such call, because the second call could possibly > cause an SI cache reset message to be processed, thereby flushing the > contents of the caches. Yes, I have always been aware of this problem. The issue is that since cache entries are removed on a oldest-removed-first basis, I never thought that several cache lookups would be a problem. If you do many cache lookups and expect very old ones to still exist, that could be a problem. However, a full reset of the cache could cause major problems. Is there a way to re-load the cache after the reset with the most recently cached entries? Seems that would be easier. However, your issue is probably that the new cache entries would have different locations from the old entries. Is it possible to delay the cache reset of the five most recent cache entries, and do them later? I don't see many places where more than 2-3 cache entries are kept. Maybe we need to keep them around somehow during cache reset. > I am not sure what to do about it. One solution path is to make > all the potential trouble spots do SearchSysCacheTupleCopy and then > pfree the copied tuple when done. However, that adds a nontrivial > amount of overhead, and it'd be awfully easy to miss some trouble > spots or to introduce new ones in the future. Sounds like a lot of overhead to do the copy. > > Another possibility is to introduce some sort of notion of a reference > count, and to make the standard usage pattern be > tuple = SearchSysCacheTuple(...); > ... use tuple ... > ReleaseSysCacheTuple(tuple); > The idea here is that a tuple with positive refcount would not be > deleted during a cache reset, but would simply be removed from its > cache, and then finally deleted when released (or during elog > recovery). If you can do that, can't you just keep the few most recent ones by default. Seems that would be very clean. > This might allow us to get rid of SearchSysCacheTupleCopy, too, > since the refcount should be just as good as palloc'ing one's own > copy for most purposes. Yes, that would be nice. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
pgsql-hackers by date: