Thread: Strange issue with initdb on 8.0 and Solaris automounts
Folks,

I ran into an interesting issue when installing PostgreSQL 8.0 that I'm
not sure how to resolve correctly. My system is a Sun machine (Blade
1000) running Solaris 9, with relatively recent patches. After
installing 8.0, I went to run the 'initdb' command and was greeted with
the following:

    [delirium:postgres] ~
    (11) initdb -D /software/postgresql-8.0.0/data
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.

    The database cluster will be initialized with locale C.

    creating directory /software/postgresql-8.0.0/data ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable

The error message was a bit confusing, so I decided to run a truss on
the process to see what might be happening, and this is what I came
across:

    [...]
    8802/1:  write(1, " c r e a t i n g d i r".., 62)        = 62
    8802/1:  umask(0)                                        = 077
    8802/1:  umask(077)                                      = 0
    8802/1:  mkdir("/software", 0777)                        Err#17 EEXIST
    8802/1:  stat64("/software", 0xFFBFC858)                 = 0
    8802/1:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
    [...]

The last error in that section, ENOSYS, is very strange, as the Solaris
manpage for 'mkdir' does not mention it as a possible error. One thing
to note, however, is that '/software/postgresql-8.0.0' is not a regular
directory, but an automount point (which in this case is just a local
loopback mount). So the indication is that Solaris has a bug not in
mkdir itself, but deeper in its VFS code, and that is what causes this
seemingly strange issue.

Two workarounds for this problem have been found. Running 'initdb' with
a directory that's *not* an automount point and then moving the 'data'
directory to its final destination worked fine, as did a suggestion
from Andrew Dunstan (on the #postgresql IRC channel) to use a relative
path for the data directory. Both were successful in avoiding the
issue, but I decided to mention this here in case someone felt it might
be worth looking into to see if the Sun problem can be avoided. I am
going to notify Sun of their bug; I just don't know how long it will
take them to actually resolve it (if they ever do).

While I can fully understand that a code change here may not be
desirable, might some notes in the documentation be useful for those
who might stumble across the problem as well? Just a suggestion...

I hope I gave sufficient information on the problem, though I'm always
willing to give any clarification needed. Thank you for your time.


Ken Lareau
elessar@numenor.org

Coincidentally I JUST NOW built 8.0 on Solaris 9, and ran into the same
problem. As they say, "this used to work"...

We build databases as part of the build of our product, and I'm looking
into what we need to do to upgrade from 7.4.5, and this was the first
thing I ran into. I hadn't gotten as far as truss yet, so thanks Kenneth
for that extra info.

Did initdb previously just assume the -D path existed, and now it is
trying to create the whole path, if necessary?

- DAP

>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kenneth Lareau
>Sent: Thursday, January 27, 2005 5:23 PM
>To: pgsql-hackers@postgresql.org
>Subject: [HACKERS] Strange issue with initdb on 8.0 and Solaris automounts
>
>[...]

"David Parker" <dparker@tazznetworks.com> writes:
> Did initdb previously just assume the -D path existed, and now it is
> trying to create the whole path, if necessary?

Pre-8.0 it was using mkdir(1), which might possibly contain some weird
workaround for this case on Solaris.

I suppose that manually creating the data directory before running
initdb would also avoid this issue, since the mkdir(2) loop is only
entered if we don't find the directory in existence.

                        regards, tom lane

I tried that, and it just runs into the problem with the first sub dir
it tries to create:

    ed9i03:/home/dparker/temp % initdb -D /home/dparker/temp/testdb
    The files belonging to this database system will be owned by user "dparker".
    This user must also own the server process.

    The database cluster will be initialized with locale C.

    fixing permissions on existing directory /home/dparker/temp/testdb ... ok
    creating directory /home/dparker/temp/testdb/global ... initdb: could not create directory "/home/dparker": Operation not applicable
    initdb: removing contents of data directory "/home/dparker/temp/testdb"
    ed9i03:/home/dparker/temp

truss:

    chmod("/home/dparker/temp/testdb", 0700)        = 0
    ok
    write(1, " o k\n", 3)                           = 3
    creating directory /home/dparker/temp/testdb/global ... write(1, " c r e a t i n g d i r".., 56) = 56
    umask(0)                                        = 077
    umask(077)                                      = 0
    mkdir("/home", 0777)                            Err#17 EEXIST
    xstat(2, "/home", 0x08045C20)                   = 0
    mkdir("/home/dparker", 0777)                    Err#89 ENOSYS
    umask(077)                                      = 077
    fstat64(2, 0x08045000)                          = 0
    initdbwrite(2, " i n i t d b", 6)               = 6
    : could not create directory "write(2, " : c o u l d n o t ".., 30) = 30
    /home/dparkerwrite(2, " / h o m e / d p a r k e".., 13) = 13
    ": write(2, " " : ", 3)                         = 3
    Operation not applicablewrite(2, " O p e r a t i o n n o".., 24) = 24

- DAP

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Thursday, January 27, 2005 6:22 PM
>To: David Parker
>Cc: Kenneth Lareau; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and Solaris automounts
>
>[...]

In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>I suppose that manually creating the data directory before running
>initdb would also avoid this issue, since the mkdir(2) loop is only
>entered if we don't find the directory in existence.
>
>                        regards, tom lane

Actually, creating the 'data' directory first doesn't work either:

    [delirium:postgres] ~
    (17) mkdir data
    [delirium:postgres] ~
    (18) initdb -D /software/postgresql-8.0.0/data
    The files belonging to this database system will be owned by user "postgres".
    This user must also own the server process.

    The database cluster will be initialized with locale C.

    fixing permissions on existing directory /software/postgresql-8.0.0/data ... ok
    creating directory /software/postgresql-8.0.0/data/global ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable
    initdb: removing contents of data directory "/software/postgresql-8.0.0/data"

Since there are subdirectories that need to be created, it still runs
into the problem. I don't know why the command 'mkdir' doesn't exhibit
the same problem as the function 'mkdir', but running:

    mkdir /software/postgresql-8.0.0

produces the correct error "File exists" on my system. I suspect the
'mkdir' command probably checks to see if the directory exists first
before trying to create it, which avoids the problem.


Ken Lareau
elessar@numenor.org

Kenneth Lareau <elessar@numenor.org> writes:
> In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>> I suppose that manually creating the data directory before running
>> initdb would also avoid this issue, since the mkdir(2) loop is only
>> entered if we don't find the directory in existence.

> Actually, creating the 'data' directory first doesn't work either:

Good point.

> I don't know why the command 'mkdir' doesn't exhibit the
> same problem as the function 'mkdir', but running:

>     mkdir /software/postgresql-8.0.0

> produces the correct error "File exists" on my system.

Could you truss that and see what it does? It would be a simple change
in initdb to make it stat before mkdir instead of after, but I'm not
totally convinced that would fix the problem. If mkdir returns a funny
error code then stat might as well ...

                        regards, tom lane

In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>Could you truss that and see what it does? It would be a simple change
>in initdb to make it stat before mkdir instead of after, but I'm not
>totally convinced that would fix the problem. If mkdir returns a funny
>error code then stat might as well ...
>
>                        regards, tom lane

Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
on my Solaris 9 system:

    10832:  umask(0)                                         = 077
    10832:  umask(077)                                       = 0
    10832:  mkdir("/software/postgresql-8.0.0", 0777)        Err#89 ENOSYS
    10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
    10832:  fstat64(2, 0xFFBFEB78)                           = 0
    10832:  write(2, " m k d i r", 5)                        = 5
    10832:  write(2, " : ", 2)                               = 2
    10832:  write(2, " c a n n o t c r e a t".., 24)         = 24
    10832:  write(2, " ` / s o f t w a r e / p".., 28)       = 28
    10832:  write(2, " : ", 2)                               = 2
    10832:  write(2, " F i l e e x i s t s", 11)             = 11
    10832:  write(2, "\n", 1)                                = 1
    10832:  _exit(1)

It's doing the stat after the mkdir attempt it seems, and coming back
with the correct response. Hmm, maybe I should look at the Solaris 8
code for the mkdir command...


Ken Lareau
elessar@numenor.org

Tom Lane wrote:

>> I don't know why the command 'mkdir' doesn't exhibit the
>> same problem as the function 'mkdir', but running:
>>
>>     mkdir /software/postgresql-8.0.0
>>
>> produces the correct error "File exists" on my system.
>
> Could you truss that and see what it does? It would be a simple change
> in initdb to make it stat before mkdir instead of after, but I'm not
> totally convinced that would fix the problem. If mkdir returns a funny
> error code then stat might as well ...

There's also a tiny race condition, which I guess isn't worth worrying
about. Returning ENOSYS is pretty bogus ...

cheers

andrew

Kenneth Lareau <elessar@numenor.org> writes:
> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>> Could you truss that and see what it does?

> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
> on my Solaris 9 system:

> 10832:  mkdir("/software/postgresql-8.0.0", 0777)        Err#89 ENOSYS
> 10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0

> It's doing the stat after the mkdir attempt it seems, and coming back
> with the correct response. Hmm, maybe I should look at the Solaris 8
> code for the mkdir command...

Well, the important point is that the stat does succeed. I'm not going
to put in anything as specific as a check for ENOSYS, but it seems
reasonable to try the stat first and mkdir only if stat fails.
I've applied the attached patch.

                        regards, tom lane

*** src/bin/initdb/initdb.c.orig    Sat Jan  8 17:51:12 2005
--- src/bin/initdb/initdb.c Thu Jan 27 19:23:49 2005
***************
*** 476,481 ****
--- 476,484 ----
   * this tries to build all the elements of a path to a directory a la mkdir -p
   * we assume the path is in canonical form, i.e. uses / as the separator
   * we also assume it isn't null.
+  *
+  * note that on failure, the path arg has been modified to show the particular
+  * directory level we had problems with.
   */
  static int
  mkdir_p(char *path, mode_t omode)
***************
*** 544,573 ****
          }
          if (last)
              (void) umask(oumask);
!         if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
          {
!             if (errno == EEXIST || errno == EISDIR)
!             {
!                 if (stat(path, &sb) < 0)
!                 {
!                     retval = 1;
!                     break;
!                 }
!                 else if (!S_ISDIR(sb.st_mode))
!                 {
!                     if (last)
!                         errno = EEXIST;
!                     else
!                         errno = ENOTDIR;
!                     retval = 1;
!                     break;
!                 }
!             }
!             else
              {
                  retval = 1;
                  break;
              }
          }
          if (!last)
              *p = '/';
--- 547,570 ----
          }
          if (last)
              (void) umask(oumask);
! 
!         /* check for pre-existing directory; ok if it's a parent */
!         if (stat(path, &sb) == 0)
          {
!             if (!S_ISDIR(sb.st_mode))
              {
+                 if (last)
+                     errno = EEXIST;
+                 else
+                     errno = ENOTDIR;
                  retval = 1;
                  break;
              }
+         }
+         else if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
+         {
+             retval = 1;
+             break;
          }
          if (!last)
              *p = '/';

Andrew Dunstan <andrew@dunslane.net> writes:
> There's also a tiny race condition, which I guess isn't worth worrying
> about.

Considering that we're not checking ownership or permissions of the
parent directories, I'd say not.

                        regards, tom lane

In message <22687.1106872653@sss.pgh.pa.us>, Tom Lane writes:
>Well, the important point is that the stat does succeed. I'm not going
>to put in anything as specific as a check for ENOSYS, but it seems
>reasonable to try the stat first and mkdir only if stat fails.
>I've applied the attached patch.
>
>                        regards, tom lane

Tom, thank you very much for the patch, it worked like a charm.


Ken Lareau
elessar@numenor.org

Yes, thanks very much!

- DAP

>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kenneth Lareau
>Sent: Thursday, January 27, 2005 8:10 PM
>To: Tom Lane
>Cc: Kenneth Lareau; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and Solaris automounts
>
>[...]

Will this make it into 8.1?

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Thursday, January 27, 2005 7:38 PM
>To: Kenneth Lareau
>Cc: David Parker; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and Solaris automounts
>
>[...]