make check hang on AIX 5L p690 4way/I have two solutions - Mailing list pgsql-patches
From | Tomoyuki Niijima |
---|---|
Subject | make check hang on AIX 5L p690 4way/I have two solutions |
Date | |
Msg-id | OF397DD310.F74CCBE2-ON49256C24.0056F9C3@LocalDomain |
Responses | Re: make check hang on AIX 5L p690 4way/I have two solutions |
List | pgsql-patches |
Your name               : Tomoyuki Niijima
Your email address      : niijima@jp.ibm.com

System Configuration
---------------------
  Architecture (example: Intel Pentium)          : IBM 7040-681 (pSeries 690) 4way (LPAR)
  Operating System (example: Linux 2.0.26 ELF)   : AIX 5L 5.1
  PostgreSQL version (example: PostgreSQL-7.2.1) : PostgreSQL-7.2.1
  Compiler used (example: gcc 2.95.2)            : gcc 2.9

Please enter a FULL description of your problem:
------------------------------------------------
I built PostgreSQL with the steps below and saw backends hang during the
regression test (make check). The problem reproduces on two machines, but
both have the same hardware and software. I also tried to reproduce it on
other machines running older versions of AIX, but could not.

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
./configure --enable-multibyte=EUC_JP --with-CC=gcc
make

By attaching dbx (the AIX debugger) to one of the 'postgres:' processes,
I learned that the backend was sleeping in semop().

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------
After looking through the pgsql-hackers mailing list, I focused on the
spin lock implementation to solve the problem.

The easiest solution, though perhaps not the best one, is to give up
HAS_TEST_AND_SET. This actually works.

*** src/include/port/aix.h.org	Tue Feb 13 23:32:52 2001
--- src/include/port/aix.h	Fri Aug 30 01:02:28 2002
***************
*** 1,8 ****
  #define CLASS_CONFLICT
  #define DISABLE_XOPEN_NLS
! #define HAS_TEST_AND_SET
  #define NO_MKTIME_BEFORE_1970
! typedef unsigned int slock_t;

  #include <sys/machine.h>	/* ENDIAN definitions for network
  				 * communication */
--- 1,8 ----
  #define CLASS_CONFLICT
  #define DISABLE_XOPEN_NLS
! /* #define HAS_TEST_AND_SET */
  #define NO_MKTIME_BEFORE_1970
! /* typedef unsigned int slock_t; */

  #include <sys/machine.h>	/* ENDIAN definitions for network
  				 * communication */

Another, better solution is to use _check_lock() and _clear_lock() as the
spin lock. The important point is to define S_UNLOCK() with _clear_lock().
This solves the so-called "compiler bug" issue that someone described on
the mailing list.

AIX has other APIs for test-and-set, such as cs(), compare_and_swap() and
fetch_and_or(), but none of them solved my problem. I wrote a tiny test
program to check these AIX APIs for bugs, and found no problem except with
compare_and_swap(). It seems that compare_and_swap() cannot be used for
this purpose: it did not work as a spin lock on any of the SMP machines I
tested. I do not know why cs() and fetch_and_or()/fetch_and_and() fail
with PostgreSQL on the p690; they worked with my test program on all
machines I tested.

*** ./src/include/storage/s_lock.h.org	Fri Aug 30 01:13:15 2002
--- ./src/include/storage/s_lock.h	Wed Jan 30 00:44:42 2002
***************
*** 440,447 ****
   * Note that slock_t on POWER/POWER2/PowerPC is int instead of char
   * (see storage/ipc.h).
   */
! #define TAS(lock)		_check_lock(lock, 0, 1)
! #define S_UNLOCK(lock)	_clear_lock(lock, 0)
  #endif	 /* _AIX */
--- 440,446 ----
   * Note that slock_t on POWER/POWER2/PowerPC is int instead of char
   * (see storage/ipc.h).
   */
! #define TAS(lock)		cs((int *) (lock), 0, 1)
  #endif	 /* _AIX */
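For readers without an AIX machine, the kind of tiny spin lock test mentioned above can be sketched portably. This is not the original test program. It assumes that _check_lock(lock, 0, 1) behaves like a compare-and-swap that returns zero when the lock was acquired, and that _clear_lock(lock, 0) behaves like a store with release ordering. The names emu_check_lock, emu_clear_lock and run_stress are hypothetical stand-ins built on C11 atomics and POSIX threads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_THREADS 64

typedef atomic_int slock_t;

/* Hypothetical stand-in for AIX _check_lock(): returns false (0) when the
 * word equalled old_val and was replaced by new_val, true otherwise.
 * Acquire ordering on success is an assumption of this sketch. */
bool emu_check_lock(slock_t *lock, int old_val, int new_val)
{
    int expected = old_val;
    return !atomic_compare_exchange_strong_explicit(
        lock, &expected, new_val,
        memory_order_acquire, memory_order_relaxed);
}

/* Hypothetical stand-in for AIX _clear_lock(): a store with release
 * ordering, so writes made while holding the lock become visible before
 * the lock is seen as free. */
void emu_clear_lock(slock_t *lock, int val)
{
    atomic_store_explicit(lock, val, memory_order_release);
}

/* Same macro shape as the _check_lock()/_clear_lock() definitions. */
#define TAS(lock)      emu_check_lock((lock), 0, 1)
#define S_UNLOCK(lock) emu_clear_lock((lock), 0)

static slock_t test_lock;
static long shared_counter;  /* deliberately non-atomic: only the lock protects it */

static void *worker(void *arg)
{
    long iters = *(const long *) arg;

    for (long i = 0; i < iters; i++)
    {
        while (TAS(&test_lock))
            ;                /* spin until the lock is acquired */
        shared_counter++;    /* critical section */
        S_UNLOCK(&test_lock);
    }
    return NULL;
}

/* Runs nthreads workers, each incrementing the counter iters times under
 * the lock, and returns the final counter value. */
long run_stress(int nthreads, long iters)
{
    pthread_t tids[MAX_THREADS];

    assert(nthreads >= 1 && nthreads <= MAX_THREADS);
    atomic_store(&test_lock, 0);
    shared_counter = 0;

    for (int i = 0; i < nthreads; i++)
    {
        int rc = pthread_create(&tids[i], NULL, worker, &iters);
        assert(rc == 0);
        (void) rc;           /* keeps NDEBUG builds free of warnings */
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    return shared_counter;
}
```

If the lock works, run_stress(4, 20000) returns 80000; a smaller value means increments were lost. Replacing the body of emu_clear_lock() with a plain, unordered store is one way to explore why the unlock side matters on a weakly ordered SMP machine, although a failure is not guaranteed to show up on every run or every platform.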