Re: BUG #1208: Invalid page header - Mailing list pgsql-bugs
From | Robert E. Bruccoleri |
---|---|
Subject | Re: BUG #1208: Invalid page header |
Date | |
Msg-id | 200408162341.i7GNfAqi209668@stone.congenomics.com Whole thread Raw |
In response to | Re: BUG #1208: Invalid page header (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: BUG #1208: Invalid page header
|
List | pgsql-bugs |
Dear All, Thanks for responding. First of all, the Intel compiler does not accept inline assembly code, so I substituted the following compiler intrinsic in its place: #if defined(__ia64__) || defined(__ia64) /* __ia64 used by ICC compiler? */ /* Intel Itanium */ #define HAS_TEST_AND_SET #include <ia64intrin.h> typedef unsigned long slock_t; #define TAS(lock) _InterlockedExchange64(lock, 1) #endif /* __ia64__ || __ia64 */ With this code in 7.4.3, I had the problem outlined in my bug report. I also ran the spin lock test program in s_lock.c, and it failed as expected with a stuck spinlock. Do you have a locking test program that is designed to detect errors in a multiprocessor environment? Second, I've just gotten the 8.0 beta 1 release, built it using normal optimization, and it did not encounter any invalid page headers. Here is a typical compiler command with flags: icc -w0 -ansi_alias- -O2 -fno-strict-aliasing It is important to keep in mind that the machine used for this test can keep all 14 PostgreSQL backends running simultaneously, so any race conditions would be more likely to occur on it than on a typical 2 or 4 way Linux box. Question: were there any significant changes made to the buffer management code between 7.4 and 8.0 that would explain the difference? I haven't tried rerunning 7.4.3 without optimization to see if the problem disappears in that release. Since the 8.0beta1 release appears OK, and the test run takes about three days, so I'm reluctant to do it unless there's some value in performing test. Please tell me if there is. Another question: on a machine which has this high level of parallelism, does it make sense to use a spinlock to control access to the buffer cache instead of a lightweight lock? Thanks. --Bob Tom Lane writes: > > > Peter Eisentraut <peter_e@gmx.net> writes: > > Tom Lane wrote: > >> But that code is gcc-only, and he's not using gcc. > > > I think the icc compiler claims to be gcc-compatible in that area, so > > it's quite likely that the gcc assembler code would be used. > > Oh, good point. > > In that case it seems entirely possible that the assembly code > tightening-up patch that I made for 8.0 is relevant. The point of that > patch was to prevent the compiler from making inappropriate > optimizations around a spinlock TAS. We have not seen any indication > that existing gcc releases would actually do anything unwanted ... but > icc might have different/more aggressive optimizations. > > [ looks at code... ] But it looks like 7.4's IA64 code already had the > memory-clobber constraint, so there doesn't appear to be any significant > change there since 7.4. I suppose it's worth trying the insignificant > change though: > > __asm__ __volatile__( > " xchg4 %0=%1,%2 \n" > : "=r"(ret), "=m"(*lock) > : "r"(1), "1"(*lock) > : "memory"); > > to > > __asm__ __volatile__( > " xchg4 %0=%1,%2 \n" > : "=r"(ret), "+m"(*lock) > : "r"(1) > : "memory"); > > at line 125ff of src/include/storage/s_lock.h. Note "=m" becomes "+m" > to replace the separate "1" constraint. > > Robert, have you tried backing off compiler optimization levels to see > if anything changes? > > regards, tom lane > +-----------------------------+------------------------------------+ | Robert E. Bruccoleri, Ph.D. | email: bruc@acm.org | | President, Congenair LLC | URL: http://www.congen.com/~bruc | | P.O. Box 314 | Phone: 609 818 7251 | | Pennington, NJ 08534 | | +-----------------------------+------------------------------------+
pgsql-bugs by date: