Thread: Compiling 6.4 on NetBSD-current/pc532
OK, I have a current version of PostgreSQL running on my pc532 now. (It's a NS32K based machine. Somewhat of an antique really...) I had to apply the following diffs to get it to compile, but it then passed the spinlock test, and most of the regression tests, including the int8 test. It did not, however, get the datetime test correct. For some reason which I haven't figured out yet, it thinks the following: template1=> select ('epoch'::datetime) as ZeroSecs; zerosecs ---------------------------- Fri Dec 31 18:00:00 1999 CST (1 row) template1=> select ('current'::datetime) as ZeroSecs; zerosecs ---------------------------- Fri Dec 31 18:00:00 1999 CST (1 row) template1=> select ('now'::datetime) as ZeroSecs; zerosecs ---------------------------- Sun Sep 13 19:05:43 1998 CDT (1 row) So it knows how to get the date and time, but not always... I'd think this was machine independent, but then I'd expect a MI problem like that to get fixed very quickly. So I don't know if it's a NetBSD/pc532 problem, a NetBSD problem, or a PostgreSQL problem, but I suspect the first due to a lack of people screaming. I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough (yet?) to properly deduce anything yet. It appeared to all be correct until it printed the results, implying that libc or a syscall was returning some funny constant perhaps? Well, with all those disclaimers, here's the patch. (Beware, I think I have it reversed, so you probably want to type patch -r...) *** /usr/local/pgsql/src/include/storage/s_lock.h Fri Sep 11 19:00:55 1998 --- s_lock.h Sat Sep 12 00:27:51 1998 *************** *** 213,219 **** #endif /* NEED_I386_TAS_ASM */ ! /* NS32K code is in s_lock.c */ #endif /* defined(__GNUC__) */ --- 213,234 ---- #endif /* NEED_I386_TAS_ASM */ ! ! #if defined(NEED_NS32K_TAS_ASM) ! ! #define S_LOCK(lock) \ ! { \ ! slock_t res = 1; \ ! while (res) { \ ! __asm__("movqd 0, r0"); \ ! __asm__("sbitd r0, %0" : "=m"(*lock)); \ ! __asm__("sprb us, %0" : "=r" (res)); \ ! res = ((res >> 5) & 1); \ ! } \ ! } ! ! #endif /* NEED_NS32K_TAS_ASM */ ! #endif /* defined(__GNUC__) */ *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c Thu Sep 10 23:08:00 1998 --- s_lock.c Sat Sep 12 00:23:04 1998 *************** *** 118,134 **** #endif /* PPC */ - #if defined(__ns32k__) - int - tas(volatile slock_t *lock) - { - int res; - __asm__("sbitb 0, %0" : "=m"(*lock)); - __asm__("sprb us, %0" : "=r"(res)); - res = (res >> 5) & 1; - return res; - } - #endif #else /* defined(__GNUC__) */ /*************************************************************************** --- 118,123 ---- BTW, does the spinlock code need a TAS function so it can spin for a while and then declare itself stuck, or does a second process/thread take care of that. It would be simpler for the NS32K to just make the whole lock function be 2 lines of inline assembler, but that would contain an infinite loop if the lock was stuck... Jon Buller <jonb@metronet.com>
> I have a current version of PostgreSQL running on my pc532 now. > (It's a NS32K based machine. Somewhat of an antique really...) > It did not, however, get the datetime test correct. > ... I suspect (a NetBSD/pc532 problem) due to a lack of people > screaming. Screaming means you get to help do development :) > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough > (yet?) to properly deduce anything yet. It appeared to all be > correct until it printed the results, implying that libc or a > syscall was returning some funny constant perhaps? Ah. It's slowly coming back to me, so here are some suggestions: "epoch" and "current" are stored internally in the database so that the support code can pull one back out and say "oh! that's supposed to be 'epoch'", for example. So, I used _very_ small floating point numbers to represent those special constants (numbers much smaller than one could get by doing the usual arithmetic, at least under normal circumstances). I'll bet that your machine is somehow pushing those numbers to be exactly zero, which corresponds to Y2K (with the timezone offset, that is what you are seeing). So, try looking at the numbers, and try seeing what they are being set to. In src/include/utils/dt.h, you will see that current and epoch are being set to DBL_MIN and -DBL_MIN, respectively. Make sure that these are not identical to zero (they are something like 1e-308 on my machine). You can change dt.h to make these some other usually unique number, say +/-1e-20 for now. And track down why DBL_MIN on your machine isn't the smallest allowed double-precision number... - Tom
Thomas G. Lockhart writes: > > I have a current version of PostgreSQL running on my pc532 now. > > (It's a NS32K based machine. Somewhat of an antique really...) > > It did not, however, get the datetime test correct. > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people > > screaming. > > Screaming means you get to help do development :) > > > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough > > (yet?) to properly deduce anything yet. It appeared to all be > > correct until it printed the results, implying that libc or a > > syscall was returning some funny constant perhaps? > > Ah. It's slowly coming back to me, so here are some suggestions: > > "epoch" and "current" are stored internally in the database so that the > support code can pull one back out and say "oh! that's supposed to be > 'epoch'", for example. So, I used _very_ small floating point numbers to > represent those special constants (numbers much smaller than one could > get by doing the usual arithmetic, at least under normal circumstances). > > I'll bet that your machine is somehow pushing those numbers to be > exactly zero, which corresponds to Y2K (with the timezone offset, that > is what you are seeing). So, try looking at the numbers, and try seeing > what they are being set to. In src/include/utils/dt.h, you will see that > current and epoch are being set to DBL_MIN and -DBL_MIN, respectively. > Make sure that these are not identical to zero (they are something like > 1e-308 on my machine). You can change dt.h to make these some other > usually unique number, say +/-1e-20 for now. > > And track down why DBL_MIN on your machine isn't the smallest allowed > double-precision number... > > - Tom Well, finally somebody, who sees the same problems as me. I had the same problems with 6.3, which I didn't pursue due to lack off time. However, now I retried with the 6.4 snapshot from Sep. 11. I compiled under Siemens SINIX (Mips based SVR4) and the system cc. If I issue the following sql statement, the backend dies with a core. regression=> select ('now'::datetime - 'current'::datetime) as "Zero"; I guess the following result is ok. regression=> select ('current'::datetime) as "Zero"; Zero ------- current (1 row) I guess, somehow the applying math operators to datetime causes the core dumps. Here is the GDB stack trace from the core. (Only dt.c $ gdb ../../../bin/postgres core GDB is free software and you are welcome to distribute copies of it under certain conditions; type "show copying" to see the conditions. There is absolutely no warranty for GDB; type "show warranty" for details. GDB 4.15.1 (mips-sni-sysv4), Copyright 1995 Free Software Foundation, Inc... Core was generated by `/local/pgsql-6.4/bin/postmaster -i -p 2222 '. Program terminated with signal 10, Bus error. Reading symbols from /lib/libnsl.so...done. Reading symbols from /lib/libsocket.so...done. Reading symbols from /lib/libdl.so...done. Reading symbols from /lib/libmutex.so...done. Reading symbols from /usr/lib/libc.so.1...done. (gdb) where #0 0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314, tzp=0x5db6d4, result=0x542658) at dt.c:2517 #1 0x50e8cc in SetDateTime (dt=2.2250738585072022e-308) at dt.c:346 ^^^^^^^^^^^^^^^^^^^^^^^ this is DBL_MIN #2 0x510884 in datetime_mi (datetime1=0x60d370, datetime2=0x60d3d0) at dt.c:783 #3 0x5420b0 in fmgr_c () #4 0x46cf98 in ExecMakeFunctionResult () #5 0x46d034 in ExecEvalOper () #6 0x46d418 in ExecEvalExpr () #7 0x46d838 in ExecTargetList () #8 0x46dbf8 in ExecProject () #9 0x476e14 in ExecResult () #10 0x46b32c in ExecProcNode () #11 0x46a3e0 in ExecutePlan () #12 0x469ab0 in ExecutorRun () #13 0x4fc9dc in ProcessQueryDesc () #14 0x4fcabc in ProcessQuery () #15 0x4faa10 in pg_exec_query_dest () #16 0x4fa7dc in pg_exec_query () #17 0x4fc0a8 in PostgresMain () #18 0x4d6a64 in DoBackend () #19 0x4d6260 in BackendStartup () #20 0x4d5050 in ServerLoop () #21 0x4d4840 in PostmasterMain () #22 0x4875c0 in main () #23 0x418004 in __start () (gdb) Does this give you some additional hints? Oh, one more thing. With 6.3, when psql lost connection with the backend, I got an error message saying so. With 6.4 it psql seems to be simply hanging. -- MfG/Regards -- /==== Siemens Nixdorf Informationssysteme AG / Ridderbusch / , Abt.: OEC XS QM4 / /./ Heinz Nixdorf Ring /=== /,== ,===/ /,==, // 33106 Paderborn, Germany / // / / // / / \ Tel.: (49) 5251-8-15211 / / `==/\ / / / \ Email: ridderbusch.pad@sni.de Since I have taken all the Gates out of my computer, it finally works!!
> > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people > > > screaming. > > Screaming means you get to help do development :) > Well, finally somebody, who sees the same problems as me. I had the > same problems with 6.3, which I didn't pursue due to lack off time. Frank, do I hear you screaming? :) > However, now I retried with the 6.4 snapshot from Sep. 11. I compiled > under Siemens SINIX (Mips based SVR4) and the system cc. > If I issue the following sql statement, the backend dies with a core. > regression=> select ('now'::datetime - 'current'::datetime); > I guess the following result is ok. > regression=> select ('current'::datetime) as "Zero"; Yes. > I guess, somehow the applying math operators to datetime causes the > core dumps. No. (or, not likely...) > Here is the GDB stack trace from the core. (Only dt.c ... > (gdb) where > #0 0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314, > tzp=0x5db6d4, result=0x542658) at dt.c:2517 Here is the call in the code: tm2datetime(&tt, 0, NULL, &dt); Note that fsec is not the zero-value it should be, and that tzp is not null (zero usually) as it should be. So, somehow your system is not converting the second argument zero in the call to (double) 0.0e0 and that is also screwing up the alignment for the next argument. Here is the declaration for tm2datetime() (in include/utils/dt.h): extern int tm2datetime(struct tm * tm, double fsec, int *tzp, DateTime *dt); which is pretty clear on what the arguments should be. So, you can probably change/coerce the call inside SetDateTime() to make sure that the second argument is a double zero: tm2datetime(&tt, 0.0e0, NULL, &dt); but I'm not sure why your system would be ignoring the declaration for the routine. Are there some switches you can set on your compiler so it does the right thing? Something like --stop-annoying-frank or --actually-use-standard-C-conventions ?? ;-) > Oh, one more thing. With 6.3, when psql lost connection with the > backend, I got an error message saying so. With 6.4 it psql seems to > be simply hanging. One thing at a time. Although I find that mine core dumps about as well as ever... - Tom
Frank Ridderbusch <ridderbusch.pad@sni.de> writes: > Oh, one more thing. With 6.3, when psql lost connection with the > backend, I got an error message saying so. With 6.4 it psql seems to > be simply hanging. Drat, I thought I fixed that. I better make sure all the libpq patches I sent in have been applied... What point is psql "hanging" at, exactly? Does it happen when you next try to send a command, or when? Are you using a TCP or Unix-socket connection to the backend? regards, tom lane
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> wrote: > > I have a current version of PostgreSQL running on my pc532 now. > > (It's a NS32K based machine. Somewhat of an antique really...) > > It did not, however, get the datetime test correct. > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people > > screaming. > > Screaming means you get to help do development :) Doh! opened my mouth one too many times... 8-) > > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough > > (yet?) to properly deduce anything yet. It appeared to all be > > correct until it printed the results, implying that libc or a > > syscall was returning some funny constant perhaps? > > Ah. It's slowly coming back to me, so here are some suggestions: > > "epoch" and "current" are stored internally in the database so that the > support code can pull one back out and say "oh! that's supposed to be > 'epoch'", for example. So, I used _very_ small floating point numbers to > represent those special constants (numbers much smaller than one could > get by doing the usual arithmetic, at least under normal circumstances). > > I'll bet that your machine is somehow pushing those numbers to be > exactly zero, which corresponds to Y2K (with the timezone offset, that > is what you are seeing). So, try looking at the numbers, and try seeing > what they are being set to. In src/include/utils/dt.h, you will see that > current and epoch are being set to DBL_MIN and -DBL_MIN, respectively. > Make sure that these are not identical to zero (they are something like > 1e-308 on my machine). You can change dt.h to make these some other > usually unique number, say +/-1e-20 for now. > > And track down why DBL_MIN on your machine isn't the smallest allowed > double-precision number... I think I saw PostgreSQL say "0" in some of those debug messages, I'll recompile tonight using some number that's a bit bigger and see what happens. I'm not totally sure that's the problem though. (It probably used to be, since the NS32K floating point hardware has no concept of a denormalized number, except that it can trap to some software routine to take care of doing the real work for it. Ian Dall wrote all that last year, bless his heart.) However, I just ran this test code that turns up some problems with the libc routines, but conclusively shows that the number is not read correctly with scanf, but is not zero either... #include <stdio.h> #define DBL_MIN 2.2250738585072014E-308 void main () { double x; scanf ("%g", &x); printf ("%g %g %d %d\n", DBL_MIN, x, DBL_MIN == x, x == 0.0); } It spits out the following (notice that I cut and pasted the DBL_MIN value): a.out 2.2250738585072014E-308 2.22507e-308 6.79039e-313 0 0 I'll chase this further with Phil Nelson the NetBSD/pc532 port-master. Thanks everyone for all your help, Jon Buller
Thanks to your suggestion, the '0' is indeed the problem. Apparently a compiler problem, since the function prototypes are not correctly coerced. Prefixing the 0 with (double) did it. > Here is the call in the code: > > tm2datetime(&tt, 0, NULL, &dt); > Thanx again. Thomas G. Lockhart writes: > > > > ... I suspect (a NetBSD/pc532 problem) due to a lack of people > > > > screaming. > > > Screaming means you get to help do development :) > > Well, finally somebody, who sees the same problems as me. I had the > > same problems with 6.3, which I didn't pursue due to lack off time. > > Frank, do I hear you screaming? :) > > > However, now I retried with the 6.4 snapshot from Sep. 11. I compiled > > under Siemens SINIX (Mips based SVR4) and the system cc. > > If I issue the following sql statement, the backend dies with a core. > > regression=> select ('now'::datetime - 'current'::datetime); > > I guess the following result is ok. > > regression=> select ('current'::datetime) as "Zero"; > > Yes. > > > I guess, somehow the applying math operators to datetime causes the > > core dumps. > > No. (or, not likely...) > > > Here is the GDB stack trace from the core. (Only dt.c ... > > (gdb) where > > #0 0x51595c in tm2datetime (tm=0x7fffb334, fsec=1.0609881999384027e-314, > > tzp=0x5db6d4, result=0x542658) at dt.c:2517 > > Here is the call in the code: > > tm2datetime(&tt, 0, NULL, &dt); > > Note that fsec is not the zero-value it should be, and that tzp is not > null (zero usually) as it should be. So, somehow your system is not > converting the second argument zero in the call to > (double) 0.0e0 > and that is also screwing up the alignment for the next argument. > > Here is the declaration for tm2datetime() (in include/utils/dt.h): > > extern int tm2datetime(struct tm * tm, double fsec, > int *tzp, DateTime *dt); > > which is pretty clear on what the arguments should be. So, you can > probably change/coerce the call inside SetDateTime() to make sure that > the second argument is a double zero: > > tm2datetime(&tt, 0.0e0, NULL, &dt); > > but I'm not sure why your system would be ignoring the declaration for > the routine. Are there some switches you can set on your compiler so it > does the right thing? Something like > > --stop-annoying-frank > or > --actually-use-standard-C-conventions > > ?? ;-) > > > Oh, one more thing. With 6.3, when psql lost connection with the > > backend, I got an error message saying so. With 6.4 it psql seems to > > be simply hanging. > > One thing at a time. Although I find that mine core dumps about as well > as ever... > > - Tom
Applied. > OK, > > I have a current version of PostgreSQL running on my pc532 now. > (It's a NS32K based machine. Somewhat of an antique really...) > > I had to apply the following diffs to get it to compile, but it > then passed the spinlock test, and most of the regression tests, > including the int8 test. It did not, however, get the datetime > test correct. > > For some reason which I haven't figured out yet, it thinks the > following: > > template1=> select ('epoch'::datetime) as ZeroSecs; > zerosecs > ---------------------------- > Fri Dec 31 18:00:00 1999 CST > (1 row) > > template1=> select ('current'::datetime) as ZeroSecs; > zerosecs > ---------------------------- > Fri Dec 31 18:00:00 1999 CST > (1 row) > > template1=> select ('now'::datetime) as ZeroSecs; > zerosecs > ---------------------------- > Sun Sep 13 19:05:43 1998 CDT > (1 row) > > > So it knows how to get the date and time, but not always... I'd > think this was machine independent, but then I'd expect a MI problem > like that to get fixed very quickly. So I don't know if it's a > NetBSD/pc532 problem, a NetBSD problem, or a PostgreSQL problem, > but I suspect the first due to a lack of people screaming. > > I did build 6.3.2 with -DDATEDEBUG, but I'm not coherent enough > (yet?) to properly deduce anything yet. It appeared to all be > correct until it printed the results, implying that libc or a > syscall was returning some funny constant perhaps? > > Well, with all those disclaimers, here's the patch. (Beware, I > think I have it reversed, so you probably want to type patch -r...) > > > *** /usr/local/pgsql/src/include/storage/s_lock.h Fri Sep 11 19:00:55 1998 > --- s_lock.h Sat Sep 12 00:27:51 1998 > *************** > *** 213,219 **** > #endif /* NEED_I386_TAS_ASM */ > > > ! /* NS32K code is in s_lock.c */ > > #endif /* defined(__GNUC__) */ > > --- 213,234 ---- > #endif /* NEED_I386_TAS_ASM */ > > > ! > ! #if defined(NEED_NS32K_TAS_ASM) > ! > ! #define S_LOCK(lock) \ > ! { \ > ! slock_t res = 1; \ > ! while (res) { \ > ! __asm__("movqd 0, r0"); \ > ! __asm__("sbitd r0, %0" : "=m"(*lock)); \ > ! __asm__("sprb us, %0" : "=r" (res)); \ > ! res = ((res >> 5) & 1); \ > ! } \ > ! } > ! > ! #endif /* NEED_NS32K_TAS_ASM */ > ! > > #endif /* defined(__GNUC__) */ > > *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c Thu Sep 10 23:08:00 1998 > --- s_lock.c Sat Sep 12 00:23:04 1998 > *************** > *** 118,134 **** > #endif /* PPC */ > > > - #if defined(__ns32k__) > - int > - tas(volatile slock_t *lock) > - { > - int res; > - __asm__("sbitb 0, %0" : "=m"(*lock)); > - __asm__("sprb us, %0" : "=r"(res)); > - res = (res >> 5) & 1; > - return res; > - } > - #endif > > #else /* defined(__GNUC__) */ > /*************************************************************************** > --- 118,123 ---- > > > BTW, does the spinlock code need a TAS function so it can spin for > a while and then declare itself stuck, or does a second process/thread > take care of that. It would be simpler for the NS32K to just make > the whole lock function be 2 lines of inline assembler, but that > would contain an infinite loop if the lock was stuck... > > Jon Buller <jonb@metronet.com> > > -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 http://www.op.net/~candle | (610) 353-9879(w) + If your life is a hard drive, | (610) 853-3000(h) + Christ can be your backup. |
> > Applied. > > > OK, > > > > I have a current version of PostgreSQL running on my pc532 now. > > (It's a NS32K based machine. Somewhat of an antique really...) > > > > > > *** /usr/local/pgsql/src/include/storage/s_lock.h Fri Sep 11 19:00:55 1998 > > --- s_lock.h Sat Sep 12 00:27:51 1998 > > *************** > > *** 213,219 **** > > #endif /* NEED_I386_TAS_ASM */ > > > > > > ! /* NS32K code is in s_lock.c */ > > > > #endif /* defined(__GNUC__) */ > > > > --- 213,234 ---- > > #endif /* NEED_I386_TAS_ASM */ > > > > > > ! > > ! #if defined(NEED_NS32K_TAS_ASM) > > ! > > ! #define S_LOCK(lock) \ > > ! { \ > > ! slock_t res = 1; \ > > ! while (res) { \ > > ! __asm__("movqd 0, r0"); \ > > ! __asm__("sbitd r0, %0" : "=m"(*lock)); \ > > ! __asm__("sprb us, %0" : "=r" (res)); \ > > ! res = ((res >> 5) & 1); \ > > ! } \ > > ! } > > ! > > ! #endif /* NEED_NS32K_TAS_ASM */ > > ! > > > > #endif /* defined(__GNUC__) */ > > > > *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c Thu Sep 10 23:08:00 1998 > > --- s_lock.c Sat Sep 12 00:23:04 1998 > > *************** > > *** 118,134 **** > > #endif /* PPC */ > > > > > > - #if defined(__ns32k__) > > - int > > - tas(volatile slock_t *lock) > > - { > > - int res; > > - __asm__("sbitb 0, %0" : "=m"(*lock)); > > - __asm__("sprb us, %0" : "=r"(res)); > > - res = (res >> 5) & 1; > > - return res; > > - } > > - #endif > > > > #else /* defined(__GNUC__) */ > > /*************************************************************************** > > --- 118,123 ---- > > > > > > BTW, does the spinlock code need a TAS function so it can spin for > > a while and then declare itself stuck, or does a second process/thread > > take care of that. It would be simpler for the NS32K to just make > > the whole lock function be 2 lines of inline assembler, but that > > would contain an infinite loop if the lock was stuck... > > > > Jon Buller <jonb@metronet.com> I wish I had noticed this before Bruce applied it. The TAS function is needed so that stuck spinlocks can be recovered from. Also, it enables the pseudo random back off which helps performance when there are many backends. In any case, this patch does not "follow the one true path" that I tried to outline in s_lock.c and s_lock.h. In fact it is exactly backwards. Basically the preferred way is: - in s_lock.h do nothing, the defaults should take care of you. -in s_lock.c define a TAS function that sets the spinlock and returns the previous state of the lock. I see from your asm()s that you are using gcc. In this case, your TAS function should be called tas(), and should be defined inside the __GNUC__ section. -dg David Gould dg@informix.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 - If simplicity worked, the world would be overrun with insects. -
> > > > Applied. > > > > > OK, > > > > > > I have a current version of PostgreSQL running on my pc532 now. > > > (It's a NS32K based machine. Somewhat of an antique really...) > > > > > > > > > *** /usr/local/pgsql/src/include/storage/s_lock.h Fri Sep 11 19:00:55 1998 > > > --- s_lock.h Sat Sep 12 00:27:51 1998 > > > *************** > > > *** 213,219 **** > > > #endif /* NEED_I386_TAS_ASM */ > > > > > > > > > ! /* NS32K code is in s_lock.c */ > > > > > > #endif /* defined(__GNUC__) */ > > > > > > --- 213,234 ---- > > > #endif /* NEED_I386_TAS_ASM */ > > > > > > > > > ! > > > ! #if defined(NEED_NS32K_TAS_ASM) > > > ! > > > ! #define S_LOCK(lock) \ > > > ! { \ > > > ! slock_t res = 1; \ > > > ! while (res) { \ > > > ! __asm__("movqd 0, r0"); \ > > > ! __asm__("sbitd r0, %0" : "=m"(*lock)); \ > > > ! __asm__("sprb us, %0" : "=r" (res)); \ > > > ! res = ((res >> 5) & 1); \ > > > ! } \ > > > ! } > > > ! > > > ! #endif /* NEED_NS32K_TAS_ASM */ > > > ! > > > > > > #endif /* defined(__GNUC__) */ > > > > > > *** /usr/local/pgsql/src/backend/storage/buffer/s_lock.c Thu Sep 10 23:08:00 1998 > > > --- s_lock.c Sat Sep 12 00:23:04 1998 > > > *************** > > > *** 118,134 **** > > > #endif /* PPC */ > > > > > > > > > - #if defined(__ns32k__) > > > - int > > > - tas(volatile slock_t *lock) > > > - { > > > - int res; > > > - __asm__("sbitb 0, %0" : "=m"(*lock)); > > > - __asm__("sprb us, %0" : "=r"(res)); > > > - res = (res >> 5) & 1; > > > - return res; > > > - } > > > - #endif > > > > > > #else /* defined(__GNUC__) */ > > > /*************************************************************************** > > > --- 118,123 ---- > > > > > > > > > BTW, does the spinlock code need a TAS function so it can spin for > > > a while and then declare itself stuck, or does a second process/thread > > > take care of that. It would be simpler for the NS32K to just make > > > the whole lock function be 2 lines of inline assembler, but that > > > would contain an infinite loop if the lock was stuck... > > > > > > Jon Buller <jonb@metronet.com> > > I wish I had noticed this before Bruce applied it. > > The TAS function is needed so that stuck spinlocks can be recovered from. > Also, it enables the pseudo random back off which helps performance when > there are many backends. > > In any case, this patch does not "follow the one true path" that I tried > to outline in s_lock.c and s_lock.h. In fact it is exactly backwards. > > Basically the preferred way is: > > - in s_lock.h do nothing, the defaults should take care of you. > > -in s_lock.c define a TAS function that sets the spinlock and returns the > previous state of the lock. > > I see from your asm()s that you are using gcc. In this case, your TAS function > should be called tas(), and should be defined inside the __GNUC__ section. The patch had been on the list for a while, so I figured it was safe. I will back it out, and the user will have to study what is already there and resubmit a new patch. I belive the problem is that 6.3 users are trying to get their platforms into the new locking code, and naturally using their old code to do it. -- Bruce Momjian | 830 Blythe Avenue maillist@candle.pha.pa.us | Drexel Hill, Pennsylvania 19026 http://www.op.net/~candle | (610) 353-9879(w) + If your life is a hard drive, | (610) 853-3000(h) + Christ can be your backup. |
dg@informix.com (David Gould) writes: [ reversed, and possibly bad NS32K patch removed from quoted material. ] > I wish I had noticed this before Bruce applied it. Me too, I wrote a quick hack that worked, and I always like people who know what's really going on to double check things when possible. > The TAS function is needed so that stuck spinlocks can be recovered from. > Also, it enables the pseudo random back off which helps performance when > there are many backends. I suspected as much, which is why I wrote a tas function to fix the link errors, rather than try to remove the code that called tas. > In any case, this patch does not "follow the one true path" that I tried > to outline in s_lock.c and s_lock.h. In fact it is exactly backwards. > > Basically the preferred way is: > > - in s_lock.h do nothing, the defaults should take care of you. > > -in s_lock.c define a TAS function that sets the spinlock and returns the > previous state of the lock. > > I see from your asm()s that you are using gcc. In this case, your TAS function > should be called tas(), and should be defined inside the __GNUC__ section. Your quote removes the comment that says something to the effect of: I think I built a bogus patch, you might need patch -r to make it work because it's backwards. In fact you might need more than that to get the right pathnames in the right places. (My quoting removes the whole patch... 8-) I built the patch by hand, since cvs patch spewed out a whole bunch of stuff in files I know I never touched. I figured they were altered in the build process or something... However, what I did on my machine was to remove a #define S_LOCK ... from s_lock.h and add function tas in s_lock.c. It wouldn't link the way it came out of cvs checkout, but it would with the changes I just described. I suspect Bruce got it right, otherwise hed have a handful of garbled code. I think it *might* be like you describe it should be, but please double check it, I will. (Like I said above, the more the better, since I don't really know what I'm doing, just making some guesses.)
> > dg@informix.com (David Gould) writes: > > [ reversed, and possibly bad NS32K patch removed from quoted material. ] > > > I wish I had noticed this before Bruce applied it. > > Me too, I wrote a quick hack that worked, and I always like people > who know what's really going on to double check things when possible. > > > The TAS function is needed so that stuck spinlocks can be recovered from. > > Also, it enables the pseudo random back off which helps performance when > > there are many backends. > > I suspected as much, which is why I wrote a tas function to fix > the link errors, rather than try to remove the code that called > tas. > > > In any case, this patch does not "follow the one true path" that I tried > > to outline in s_lock.c and s_lock.h. In fact it is exactly backwards. > > > > Basically the preferred way is: > > > > - in s_lock.h do nothing, the defaults should take care of you. > > > > -in s_lock.c define a TAS function that sets the spinlock and returns the > > previous state of the lock. > > > > I see from your asm()s that you are using gcc. In this case, your TAS function > > should be called tas(), and should be defined inside the __GNUC__ section. > > Your quote removes the comment that says something to the effect > of: I think I built a bogus patch, you might need patch -r to make > it work because it's backwards. In fact you might need more than > that to get the right pathnames in the right places. (My quoting > removes the whole patch... 8-) I built the patch by hand, since > cvs patch spewed out a whole bunch of stuff in files I know I never > touched. I figured they were altered in the build process or > something... > > However, what I did on my machine was to remove a #define S_LOCK ... > from s_lock.h and add function tas in s_lock.c. It wouldn't link > the way it came out of cvs checkout, but it would with the changes > I just described. > > I suspect Bruce got it right, otherwise hed have a handful of > garbled code. I think it *might* be like you describe it should > be, but please double check it, I will. (Like I said above, the > more the better, since I don't really know what I'm doing, just > making some guesses.) I completely missed the part about the patch being reversed... duh. Ok, so if what you did is undefine the S_LOCK for your platform (to select the default), and add a tas() function to s_lock.c then that is exactly what was supposed to happen and I was just confused by the reversed patch. Since Bruce un-applied your change, the simplest way to fix things from here might be for you to submit a new (un-reversed this time) patch and after Bruce applies it I will pull it down and verify it. Sorry for the confusion. -dg David Gould dg@informix.com 510.628.3783 or 510.305.9468 Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612 - If simplicity worked, the world would be overrun with insects. -