Thread: BUG #5267: initdb fails on AIX: could not identify current directory
The following bug has been logged online:

Bug reference: 5267
Logged by: Michael Felt
Email address: mamfelt@gmail.com
PostgreSQL version: 8.4.2
Operating system: AIX
Description: initdb fails on AIX: could not identify current directory
Details:

I have compiled postgresql on AIX 6.1 without a great deal of difficulty (./configure -without-readline -without-zlib).

However, the first step - initdb - fails. I have not been able to determine which directory it is complaining about.

Suggestions welcome.
========
michael@x054:[/data/home/michael]su - postgres
postgres@x054:[/home/postgres]/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
could not identify current directory: The file access permissions do not allow the specified action.
could not identify current directory: The file access permissions do not allow the specified action.
could not identify current directory: The file access permissions do not allow the specified action.
The program "postgres" is needed by initdb but was not found in the
same directory as "initdb".
Check your installation.
postgres@x054:[/home/postgres]ls -l /usr/local/pgsql/bin
total 18744
-rwxr-xr-x 1 root system 79898 Jan 05 13:33 clusterdb
-rwxr-xr-x 1 root system 79554 Jan 05 13:33 createdb
-rwxr-xr-x 1 root system 79732 Jan 05 13:33 createlang
-rwxr-xr-x 1 root system 82515 Jan 05 13:33 createuser
-rwxr-xr-x 1 root system 76661 Jan 05 13:33 dropdb
-rwxr-xr-x 1 root system 82087 Jan 05 13:33 droplang
-rwxr-xr-x 1 root system 76613 Jan 05 13:33 dropuser
-rwxr-xr-x 1 root system 686273 Jan 05 13:33 ecpg
-rwxr-xr-x 1 root system 101041 Jan 05 13:33 initdb
-rwxr-xr-x 1 root system 37341 Jan 05 13:33 pg_config
-rwxr-xr-x 1 root system 31437 Jan 05 13:33 pg_controldata
-rwxr-xr-x 1 root system 52157 Jan 05 13:33 pg_ctl
-rwxr-xr-x 1 root system 392643 Jan 05 13:33 pg_dump
-rwxr-xr-x 1 root system 103892 Jan 05 13:33 pg_dumpall
-rwxr-xr-x 1 root system 46554 Jan 05 13:33 pg_resetxlog
-rwxr-xr-x 1 root system 186965 Jan 05 13:33 pg_restore
-rwxr-xr-x 1 root system 6842028 Jan 05 13:33 postgres
lrwxrwxrwx 1 root system 8 Jan 05 13:44 postmaster -> postgres
-rwxr-xr-x 1 root system 388374 Jan 05 13:33 psql
-rwxr-xr-x 1 root system 82335 Jan 05 13:33 reindexdb
-rwxr-xr-x 1 root system 47359 Jan 05 13:33 vacuumdb
postgres@x054:[/home/postgres]ls -l .
total 0
postgres@x054:[/home/postgres]ls -la .
total 24
drwxr-xr-x 2 postgres staff 256 Jan 05 13:45 .
drwxr-xr-x 31 bin bin 4096 Jan 05 13:44 ..
-rwx------ 1 postgres staff 254 Jan 05 13:44 .profile
-rw------- 1 postgres staff 884 Jan 05 18:38 .sh_history
postgres@x054:[/home/postgres]mkgroup postgres
ksh: mkgroup: 0403-006 Execute permission denied.
postgres@x054:[/home/postgres]
michael@x054:[/data/home/michael]mkgroup postgres
michael@x054:[/data/home/michael]chuser pgrp=postgres postgres
michael@x054:[/data/home/michael]chgrp postgres /usr/local/pgsql/data
michael@x054:[/data/home/michael]su - postgres
postgres@x054:[/home/postgres]id
uid=2010(postgres) gid=2013(postgres) groups=1(staff)
postgres@x054:[/home/postgres]/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
could not identify current directory: The file access permissions do not allow the specified action.
could not identify current directory: The file access permissions do not allow the specified action.
could not identify current directory: The file access permissions do not allow the specified action.
The program "postgres" is needed by initdb but was not found in the
same directory as "initdb".
Check your installation.
postgres@x054:[/home/postgres]ls -ld /usr
drwxr-xr-x 43 root system 4096 Jan 05 13:40 /usr
postgres@x054:[/home/postgres]ls -ld /usr/local
drwxr-xr-x 19 root system 4096 Jan 05 13:32 /usr/local
postgres@x054:[/home/postgres]ls -ld /usr/local/pgsql
drwxr-xr-x 7 root system 256 Jan 05 13:45 /usr/local/pgsql
postgres@x054:[/home/postgres]ls -ld /usr/local/pgsql/data
drwxr-xr-x 2 postgres postgres 256 Jan 05 13:45 /usr/local/pgsql/data
postgres@x054:[/home/postgres]ls -ld /usr/local/pgsql/bin
drwxr-xr-x 2 root system 4096 Jan 05 13:33 /usr/local/pgsql/bin
postgres@x054:[/home/postgres]/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
postgres@x054:[/home/postgres]
Michael Felt wrote:
> The following bug has been logged online:
>
> Bug reference: 5267
> Description: initdb fails on AIX: could not identify current directory
> [...]

Just thought I would add - I sent this in 24 hours ago using this link in the FAQ - and obviously, for me, it never arrived. The text may need updating:

===
Where to report bugs

In general, send bug reports to the bug report mailing list at <pgsql-bugs@postgresql.org>.
You are requested to use a descriptive subject for your email message, perhaps parts of the error message. ===
Michael Felt wrote:
> The following bug has been logged online:
>
> Bug reference: 5267
> Logged by: Michael Felt
> Email address: mamfelt@gmail.com
> PostgreSQL version: 8.4.2
> Operating system: AIX
> Description: initdb fails on AIX: could not identify current directory
> Details:
>
> I have compiled postgresql on AIX 6.1 without a great deal of difficulty
> (./configure -without-readline -without-zlib).
>
> However, the first step - initdb - fails. I have not been able to determine
> which directory it is complaining about.
>
> Suggestions welcome.

What user was /usr/local/pgsql/data owned by originally?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

"Michael Felt" <mamfelt@gmail.com> writes:
> I have compiled postgresql on AIX 6.1 without a great deal of difficulty
> (./configure -without-readline -without-zlib).
> However, the first step - initdb - fails. I have not been able to determine
> which directory it is complaining about.
> could not identify current directory: The file access permissions do not
> allow the specified action.

This message means that getcwd() failed, apparently with an EACCES code. Typically what that means is that some parent directory of the current directory is not readable or searchable by the user you ran initdb as.

I believe that initdb also invokes getcwd for the directory that its own executable is in, and possibly for the target data directory, so you should check the paths leading to those as well.

regards, tom lane
michael@x054:[/data/home/michael]ls -ld /
drwxr-xr-x 27 root system 4096 Jan 04 17:20 /
michael@x054:[/data/home/michael]ls -ld /usr
drwxr-xr-x 43 root system 4096 Jan 05 13:40 /usr
michael@x054:[/data/home/michael]ls -ld /usr/local
drwxr-xr-x 19 root system 4096 Jan 05 13:32 /usr/local
michael@x054:[/data/home/michael]ls -ld /usr/local/pgsql
drwxr-xr-x 7 root system 256 Jan 05 13:45 /usr/local/pgsql
michael@x054:[/data/home/michael]ls -ld /usr/local/pgsql/data
drwxr-xr-x 2 postgres postgres 256 Jan 05 13:45 /usr/local/pgsql/data
michael@x054:[/data/home/michael]ls -ld /usr/local/pgsql/bin
drwxr-xr-x 2 root system 4096 Jan 05 13:33 /usr/local/pgsql/bin

To answer the other question: root:system was the original owner. Changed to pgsql before calling application. Because of the error I made the group, changed user postgres to be postgres:postgres by default and made the directory postgres:postgres.

I suppose I could turn on audit and see if it is trying to access a hard coded directory. But, in any case, I would update the error message to at least mention the directory name it is having issues with.

On Wed, Jan 6, 2010 at 5:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> [...]
> This message means that getcwd() failed, apparently with an EACCES
> code. Typically what that means is that some parent directory of
> the current directory is not readable or searchable by the user you
> ran initdb as.
>
> I believe that initdb also invokes getcwd for the directory that
> its own executable is in, and possibly for the target data directory,
> so you should check the paths leading to those as well.
>
> regards, tom lane
Michael Felt <mamfelt@gmail.com> writes: > I suppose I could turn on audit and see if it is trying to access a hard > coded directory. But, in any case, I would update the error message to at > least mention the directory name it is having issues with. Well, the problem is what to print? The failure we are trying to report is exactly that we *can't get* the name of the directory. regards, tom lane
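Since getcwd() cannot name the directory it failed on, about the only extra detail a caller can report is what the errno value usually implies. The following is a minimal sketch of that idea, not PostgreSQL code: the helper names getcwd_hint(), current_dir() and format_getcwd_error() are hypothetical, and the hint texts are illustrative wording based on the POSIX error descriptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: translate a getcwd() errno into a human-readable hint. */
static const char *getcwd_hint(int err)
{
    switch (err)
    {
        case EACCES:
            return "some directory in the path is not readable or searchable";
        case ENOENT:
            return "the current directory has been removed";
        case ERANGE:
            return "the supplied buffer is too small for the path";
        default:
            return "unexpected failure";
    }
}

/*
 * Fetch the current directory into buf.
 * Returns 0 on success, or the errno value from getcwd() on failure.
 */
static int current_dir(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (getcwd(buf, len) == NULL)
        return errno;
    return 0;
}

/* Build a message in the style of the one initdb printed, plus the hint. */
static void format_getcwd_error(int err, char *out, size_t outlen)
{
    snprintf(out, outlen,
             "could not identify current directory: %s (hint: %s)",
             strerror(err), getcwd_hint(err));
}
```

A caller would invoke current_dir() once and, on a nonzero result, pass that value to format_getcwd_error() before printing. The directory name is still unavailable, but the hint at least points at path permissions rather than at a missing executable.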
Well, there is an argument that a system call is using to get somewhere? Even if it is a number, it is something. I could do an ncheck or whatever to at least find what it is calling. As I am not at all familiar with the code - just give me source to debug, and I'll work from that. On Thu, Jan 7, 2010 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Michael Felt <mamfelt@gmail.com> writes: > > I suppose I could turn on audit and see if it is trying to access a hard > > coded directory. But, in any case, I would update the error message to at > > least mention the directory name it is having issues with. > > Well, the problem is what to print? The failure we are trying to report > is exactly that we *can't get* the name of the directory. > > regards, tom lane >
Michael Felt <mamfelt@gmail.com> writes: > On Thu, Jan 7, 2010 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Well, the problem is what to print? The failure we are trying to report >> is exactly that we *can't get* the name of the directory. > Well, there is an argument that a system call is using to get somewhere? getcwd() has no input parameters. regards, tom lane
michael@x054:[/data/prj/postgresql-8.4.2/src]grep cwd */*.c
Well, unless you redefine it...
port/exec.c:#define getcwd(cwd,len) GetCurrentDirectory(len, cwd)
port/exec.c: char cwd[MAXPGPATH],
port/exec.c: if (!getcwd(cwd, MAXPGPATH))
port/exec.c: join_path_components(retpath, cwd, argv0);
port/exec.c: join_path_components(retpath, cwd, argv0);
port/exec.c: join_path_components(retpath, cwd, test_path);
port/exec.c: * getcwd() to figure out where the heck we're at.
port/exec.c: * getcwd() to give us an accurate, symlink-free path.
port/exec.c: if (!getcwd(orig_wd, MAXPGPATH))
port/exec.c: if (!getcwd(path, MAXPGPATH))

Now I have no idea what is being called. I hope you do.

On Thu, Jan 7, 2010 at 3:57 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Michael Felt <mamfelt@gmail.com> writes:
> > On Thu, Jan 7, 2010 at 3:35 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Well, the problem is what to print? The failure we are trying to report
> >> is exactly that we *can't get* the name of the directory.
>
> > Well, there is an argument that a system call is using to get somewhere?
>
> getcwd() has no input parameters.
>
> regards, tom lane
On Thu, Jan 7, 2010 at 10:00 AM, Michael Felt <mamfelt@gmail.com> wrote:
> michael@x054:[/data/prj/postgresql-8.4.2/src]grep cwd */*.c
> Well, unless you redefine it...
> port/exec.c:#define getcwd(cwd,len) GetCurrentDirectory(len, cwd)

If you look at the context of this #define you'll see that it only applies to Windows.

> port/exec.c: char cwd[MAXPGPATH],
> port/exec.c: if (!getcwd(cwd, MAXPGPATH))
> port/exec.c: join_path_components(retpath, cwd, argv0);
> port/exec.c: join_path_components(retpath, cwd, argv0);
> port/exec.c: join_path_components(retpath, cwd, test_path);
> port/exec.c: * getcwd() to figure out where the heck we're at.
> port/exec.c: * getcwd() to give us an accurate, symlink-free path.
> port/exec.c: if (!getcwd(orig_wd, MAXPGPATH))
> port/exec.c: if (!getcwd(path, MAXPGPATH))
>
> Now I have no idea what is being called. I hope you do.

It looks to me like bin/initdb/initdb.c:main() is calling port/exec.c:find_other_exec() which is calling port/exec.c:find_my_exec() which is calling getcwd(). So it's probably the directory you were in when you ran initdb that is the problem. For example:

cd $HOME
mkdir delete-me
cd delete-me
rmdir $HOME/delete-me
initdb

Produces:

could not identify current directory: No such file or directory
could not identify current directory: No such file or directory
could not identify current directory: No such file or directory
The program "postgres" is needed by initdb but was not found in the
same directory as "initdb".
Check your installation.

This is very similar to what you got except that for you it's complaining about permissions rather than existence. I would try running initdb from someplace like / or /tmp and see if that works.

I have to say that the error message that is produced by the above test case could easily send one looking in the wrong direction, and could perhaps stand to be improved. Could we just do getcwd() once, bail out if it fails, and then stash the results, rather than continuing on and eventually producing a misleading error message?

...Robert
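Robert's suggestion (call getcwd() once, report a single error if it fails, and stash the result) could look roughly like the sketch below. This is an illustration only, not the code in port/exec.c: get_start_dir() and its static variables are hypothetical names, and the PATH_MAX fallback value is an assumption for platforms that do not define it.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096           /* assumed fallback, not a PostgreSQL constant */
#endif

static char start_dir[PATH_MAX];
static int  start_dir_state = 0;    /* 0 = not tried yet, 1 = ok, -1 = failed */

/*
 * Hypothetical helper: return the directory the program started in.
 * getcwd() is attempted only on the first call; the result (or the failure)
 * is remembered, so at most one error message is ever printed.
 */
static const char *get_start_dir(void)
{
    if (start_dir_state == 0)
    {
        if (getcwd(start_dir, sizeof(start_dir)) != NULL)
            start_dir_state = 1;
        else
        {
            start_dir_state = -1;
            fprintf(stderr, "could not identify current directory: %s\n",
                    strerror(errno));
        }
    }
    assert(start_dir_state != 0);
    return (start_dir_state == 1) ? start_dir : NULL;
}
```

A caller that receives NULL can exit immediately instead of going on to report that "postgres" was not found, which is the misleading follow-on message discussed above. Because the value is cached, later chdir() calls do not change what get_start_dir() returns.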
I wrote a simple program - just calling getcwd() and I added some extra text to exec.c - to know where it is at. You should recognize it.

michael@x054:[/data/home/michael]ls -l /usr/local/pgsql/bin/mytest
-rwx------ 1 root system 4793 Jan 07 15:39 /usr/local/pgsql/bin/mytest
michael@x054:[/data/home/michael]chmod a+rx /usr/local/pgsql/bin/mytest
michael@x054:[/data/home/michael]r su
su - postgres
postgres@x054:[/home/postgres]/usr/local/pgsql/bin/mytest
/usr/local/pgsql/bin/mytest: 1
PATH_MAX:1023
/home/postgres
postgres@x054:[/home/postgres]
michael@x054:[/data/home/michael]
michael@x054:[/data/home/michael]cat test.c
#include <limits.h>
#include <unistd.h>
main(int argc, char **argv)
{
    char buf[PATH_MAX];
    printf("%s: %d\n", argv[0], argc);
    printf("PATH_MAX:%d\n", PATH_MAX);
    printf("%s\n", getcwd(buf, PATH_MAX));
}
===========
And I am only running the command from /home/postgres

postgres@x054:[/home/postgres]ls -ld .
drwxr-xr-x 2 postgres staff 256 Jan 05 13:45 .
postgres@x054:[/home/postgres]ls -l /usr/local/pgsql/bin
total 18760
-rwxr-xr-x 1 root system 79898 Jan 05 13:33 clusterdb
-rwxr-xr-x 1 root system 79554 Jan 05 13:33 createdb
-rwxr-xr-x 1 root system 79732 Jan 05 13:33 createlang
-rwxr-xr-x 1 root system 82515 Jan 05 13:33 createuser
-rwxr-xr-x 1 root system 76661 Jan 05 13:33 dropdb
-rwxr-xr-x 1 root system 82087 Jan 05 13:33 droplang
-rwxr-xr-x 1 root system 76613 Jan 05 13:33 dropuser
-rwxr-xr-x 1 root system 686273 Jan 05 13:33 ecpg
-rwxr-xr-x 1 root system 101639 Jan 07 15:34 initdb
-rwxr-xr-x 1 root system 4793 Jan 07 15:39 mytest
-rwxr-xr-x 1 root system 37341 Jan 05 13:33 pg_config
-rwxr-xr-x 1 root system 31437 Jan 05 13:33 pg_controldata
-rwxr-xr-x 1 root system 52157 Jan 05 13:33 pg_ctl
-rwxr-xr-x 1 root system 392643 Jan 05 13:33 pg_dump
-rwxr-xr-x 1 root system 103892 Jan 05 13:33 pg_dumpall
-rwxr-xr-x 1 root system 46554 Jan 05 13:33 pg_resetxlog
-rwxr-xr-x 1 root system 186965 Jan 05 13:33 pg_restore
-rwxr-xr-x 1 root system 6842028 Jan 05 13:33 postgres
lrwxrwxrwx 1 root system 8 Jan 05 13:44 postmaster -> postgres
-rwxr-xr-x 1 root system 388374 Jan 05 13:33 psql
-rwxr-xr-x 1 root system 82335 Jan 05 13:33 reindexdb
-rwxr-xr-x 1 root system 47359 Jan 05 13:33 vacuumdb
postgres@x054:[/home/postgres]/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
resolve_symlinks() orig_wd: MAXPGPATH == 1024
could not identify path directory: The file access permissions do not allow the specified action.
find_my_exec(): MAXPGPATH == 1024
could not identify current directory: The file access permissions do not allow the specified action.
find_my_exec(): MAXPGPATH == 1024
could not identify current directory: The file access permissions do not allow the specified action.
The program "postgres" is needed by initdb but was not found in the
same directory as "initdb".
Check your installation.
postgres@x054:[/home/postgres]
=========
From /tmp gives the same result.

postgres@x054:[/home/postgres]cd /tmp
postgres@x054:[/tmp]/usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
resolve_symlinks() orig_wd: MAXPGPATH == 1024
could not identify path directory: The file access permissions do not allow the specified action.
find_my_exec(): MAXPGPATH == 1024
could not identify current directory: The file access permissions do not allow the specified action.
find_my_exec(): MAXPGPATH == 1024
could not identify current directory: The file access permissions do not allow the specified action.
The program "postgres" is needed by initdb but was not found in the
same directory as "initdb".
Check your installation.
postgres@x054:[/tmp]
=========
I ran my test program with larger and smaller MAXPGPATH constants. 2046 (1023 * 2) was the largest I tested and it worked fine. When it was shortened the call failed. I did not test the error message.
On Thu, Jan 7, 2010 at 4:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> [...]
> This is very similar to what you got except that for you it's
> complaining about permissions rather than existence. I would try
> running initdb from someplace like / or /tmp and see if that works.
> [...]
I turned on audit - it continues to say michael as user for accountability. Notice: su changes to /home/postgres and initdb changes to /usr/local/pgsql/bin

FS_Chdir michael OK Thu Jan 07 16:06:35 2010 su Global
    change current directory to: /home/postgres
    MLS Data: Not supported
FS_Chdir michael OK Thu Jan 07 16:06:38 2010 initdb Global
    change current directory to: /usr/local/pgsql/bin
    MLS Data: Not supported

On Thu, Jan 7, 2010 at 3:41 PM, Michael Felt <mamfelt@gmail.com> wrote:
> Well, there is an argument that a system call is using to get somewhere?
> Even if it is a number, it is something. I could do an ncheck or whatever to
> at least find what it is calling.
>
> As I am not at all familiar with the code - just give me source to debug,
> and I'll work from that.
>
> [...]
Robert Haas <robertmhaas@gmail.com> writes: > I have to say that the error message that is produced by the above > test case could easily send one looking in the wrong direction, and > could perhaps stand to be improved. Could we just do getcwd() once, > bail out if it fails, and then stash the results, rather than > continuing on and eventually producing a misleading error message? How does that help? We still can't print the directory name. regards, tom lane
Michael Felt <mamfelt@gmail.com> writes: > I ran my test program with larger and smaller MAXPGPATH constants. 2046 > (1023 * 2) was the largest I tested and it worked fine. When it was shorted > the call failed. I did not test the error message. [ scratches head... ] This seems to be misbehavior of getcwd() itself, since the paths involved are certainly not 2K long. Perhaps you need to file a bug with IBM. regards, tom lane
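For comparison, POSIX specifies that getcwd() reports a too-small buffer with ERANGE, not with a permissions error, which is part of why the behavior Michael describes looks odd. The sketch below shows the usual way to cope with ERANGE by growing the buffer; getcwd_grow() is a hypothetical helper written for this illustration and is not taken from PostgreSQL or AIX.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the current directory in a malloc'd buffer,
 * doubling the buffer whenever getcwd() fails with ERANGE.
 * Returns NULL with errno set for any other failure; the caller frees
 * the result.
 */
static char *getcwd_grow(size_t initial)
{
    size_t len = (initial > 0) ? initial : 1;

    for (;;)
    {
        char *buf = malloc(len);
        int   saved;

        if (buf == NULL)
            return NULL;            /* out of memory */
        if (getcwd(buf, len) != NULL)
            return buf;             /* success */

        saved = errno;              /* free() may clobber errno */
        free(buf);
        if (saved != ERANGE)
        {
            errno = saved;          /* e.g. EACCES or ENOENT: give up */
            return NULL;
        }
        assert(len <= (size_t) -1 / 2);
        len *= 2;                   /* buffer too small: retry with more room */
    }
}
```

If a loop like this still fails with EACCES on short paths such as /home/postgres or /tmp, buffer size is not the cause, which is consistent with Tom's suggestion that the problem lies in getcwd() or in the system configuration rather than in initdb.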
On Thu, Jan 7, 2010 at 11:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I have to say that the error message that is produced by the above
>> test case could easily send one looking in the wrong direction, and
>> could perhaps stand to be improved. Could we just do getcwd() once,
>> bail out if it fails, and then stash the results, rather than
>> continuing on and eventually producing a misleading error message?
>
> How does that help? We still can't print the directory name.

Well, as it is, it looks like the failure of getcwd() might be an incidental problem, and the inability to find postgres was what sunk the ship. In fact, the inability to find postgres is an entirely illusory problem created by the failure of getcwd(). If you just got one error message saying "getcwd failed", I think it would be more clear what the problem was. I had to go read the code to figure out that the failure of getcwd() would result in a guaranteed failure to find the postgres executable.

...Robert
Robert Haas <robertmhaas@gmail.com> writes: > On Thu, Jan 7, 2010 at 11:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> How does that help? We still can't print the directory name. > Well, as it is, it looks like the failure of getcwd() might be an > incidental problem, and the inability to find postgres was what sunk > the ship. In fact, the inability to find postgres is an entirely > illusory problem created by the failure of getcwd(). If you just got > one error message saying "getcwd failed", I think it would be more > clear what the problem was. I had to go read the code to figure out > that the failure of getcwd() would result in a guaranteed failure to > find the postgres executable. Should we just turn find_my_exec() into a routine that elogs/exits on failure, instead of returning an error code? There are a couple of call sites that have the idea that they can survive a failure, but I think they're pretty bogus. There are actually two distinct cases that we need to worry about (and I'm not entirely certain that I know which one Michael is hitting). Case 1 is where getcwd() fails on the program's starting current directory. Case 2 is where it fails after we do a series of chdir's following symlinks. In case 1 there really is no additional information available, whereas in case 2 we could perhaps print the name of the first or last symlink we tried to follow. Also, while I think it might be fair to treat case 1 as a hard error, it's a bit more plausible that a caller might have a recovery strategy for case 2. So maybe treating these two cases differently would be a good thing. regards, tom lane
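The two cases Tom distinguishes can be kept apart by returning a distinct status for each, so that only the second one mentions a path. The sketch below is a simplified illustration under stated assumptions: it resolves a single directory with chdir() plus getcwd() and does not walk symlink chains the way resolve_symlinks() in port/exec.c does. The names resolve_dir(), describe_failure() and the resolve_status values are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#ifndef PATH_MAX
#define PATH_MAX 4096           /* assumed fallback */
#endif

typedef enum
{
    RESOLVE_OK,
    RESOLVE_START_CWD_FAILED,   /* case 1: getcwd() failed on the starting directory */
    RESOLVE_CHDIR_FAILED,       /* could not enter the requested directory */
    RESOLVE_TARGET_CWD_FAILED,  /* case 2: getcwd() failed after the chdir() */
    RESOLVE_RETURN_FAILED       /* could not get back to the starting directory */
} resolve_status;

/* Hypothetical: put the symlink-free absolute form of dir into out. */
static resolve_status resolve_dir(const char *dir, char *out, size_t outlen, int *err)
{
    char           orig_wd[PATH_MAX];
    resolve_status st = RESOLVE_OK;

    assert(dir != NULL && out != NULL && err != NULL);
    *err = 0;

    if (getcwd(orig_wd, sizeof(orig_wd)) == NULL)
    {
        *err = errno;           /* no path name is available to report here */
        return RESOLVE_START_CWD_FAILED;
    }
    if (chdir(dir) != 0)
    {
        *err = errno;
        return RESOLVE_CHDIR_FAILED;
    }
    if (getcwd(out, outlen) == NULL)
    {
        *err = errno;           /* here we do know which path we followed */
        st = RESOLVE_TARGET_CWD_FAILED;
    }
    if (chdir(orig_wd) != 0 && st == RESOLVE_OK)
    {
        *err = errno;
        st = RESOLVE_RETURN_FAILED;
    }
    return st;
}

/* Build a message; only the statuses that involve dir mention it. */
static void describe_failure(resolve_status st, const char *dir, int err,
                             char *msg, size_t len)
{
    switch (st)
    {
        case RESOLVE_OK:
            snprintf(msg, len, "ok");
            break;
        case RESOLVE_START_CWD_FAILED:
            snprintf(msg, len, "could not identify starting directory: %s",
                     strerror(err));
            break;
        case RESOLVE_CHDIR_FAILED:
            snprintf(msg, len, "could not change directory to \"%s\": %s",
                     dir, strerror(err));
            break;
        case RESOLVE_TARGET_CWD_FAILED:
            snprintf(msg, len, "could not identify directory after entering \"%s\": %s",
                     dir, strerror(err));
            break;
        case RESOLVE_RETURN_FAILED:
            snprintf(msg, len, "could not return to starting directory: %s",
                     strerror(err));
            break;
    }
}
```

With separate statuses, a caller can treat RESOLVE_START_CWD_FAILED as a hard error while still attempting a fallback for RESOLVE_TARGET_CWD_FAILED, which is the split Tom proposes.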
Update: I have reinstalled my server and now it works fine. (I did not recompile the initdb program.) As I did this to enable a security feature not related to this (called Trusted Execution), I can only guess what might have been the problem. As I don't have the old system - and I am not going to reinstall it to "test" - we are at a dead end here.

I am glad that my build procedure worked properly though! I am looking for other AIX users to test the build, as I am building up a repository of pre-built AIX open source packages - especially those not found elsewhere. Where can I best make this known within postgres?

Thanks for your assistance,
Michael

On Mon, Jan 11, 2010 at 8:37 AM, Michael Felt <mamfelt@gmail.com> wrote:
> Are you still thinking about this - and what might be the problem?
>
> In any case I am not starting from an orphaned directory. I have started
> from postgres home directory and /tmp as requested. The audit log I sent
> shows that initdb has performed at least one chdir command.
>
> Michael
>
> On Fri, Jan 8, 2010 at 4:59 AM, Michael Felt <mamfelt@gmail.com> wrote:
>> is there a debug or trace mode wherein the system reports all its
>> activity? Perhaps a compile time option?
>>
>> On Thu, Jan 7, 2010 at 7:43 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> [...]