Thread: Strange issue with initdb on 8.0 and Solaris automounts

Strange issue with initdb on 8.0 and Solaris automounts

From
Kenneth Lareau
Date:
Folks,

I ran into an interesting issue when installing PostgreSQL 8.0 that I'm
not sure how to resolve correctly.  My system is a Sun machine (Blade
1000) running Solaris 9, with relatively recent patches. After installing
8.0, I went to run the 'initdb' command and was greeted with the
following:

[delirium:postgres] ~
(11) initdb -D /software/postgresql-8.0.0/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /software/postgresql-8.0.0/data ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable


The error message was a bit confusing, so I decided to run a truss on
the process to see what might be happening, and this is what I came
across:

[...]
8802/1:         write(1, " c r e a t i n g   d i r".., 62)      = 62
8802/1:         umask(0)                                        = 077
8802/1:         umask(077)                                      = 0
8802/1:         mkdir("/software", 0777)                        Err#17 EEXIST
8802/1:         stat64("/software", 0xFFBFC858)                 = 0
8802/1:         mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
[...]


The last error in that section, ENOSYS, is very strange, as the Solaris
manpage for 'mkdir' does not mention it as a possible error.  One thing
to note in this, however, is that '/software/postgresql-8.0.0' is not a
regular directory, but an automount point (which in this case is just a
local loopback mount).  So the indication is that Solaris seems to have
a bug not in mkdir, but deeper in their VFS code that's causing this
seemingly strange issue.
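
To check whether a given path shows the same behavior, a small probe in C could call mkdir(2) and then stat(2) on it and print both results. This is only a sketch: `probe_mkdir` is a hypothetical helper name, not part of initdb, and the ENOSYS result is what truss reported on this Solaris 9 automount point, not something the code produces by itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical diagnostic helper (not part of initdb): try mkdir(2) on
 * `path`, then stat(2) it, and print both results so they can be compared.
 * Returns the errno value from mkdir(2), or 0 if the directory was created.
 */
int probe_mkdir(const char *path)
{
    struct stat sb;
    int mkdir_errno = 0;

    assert(path != NULL);

    if (mkdir(path, 0777) < 0)
        mkdir_errno = errno;

    printf("mkdir(\"%s\"): %s\n", path,
           mkdir_errno ? strerror(mkdir_errno) : "created");

    if (stat(path, &sb) == 0)
        printf("stat(\"%s\"): exists, %s\n", path,
               S_ISDIR(sb.st_mode) ? "directory" : "not a directory");
    else
        printf("stat(\"%s\"): %s\n", path, strerror(errno));

    return mkdir_errno;
}
```

On an ordinary existing directory the mkdir(2) line should report EEXIST ("File exists"). On the automount point described here, the truss output suggests it would instead report ENOSYS.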

Two workarounds for this problem have been found: running 'initdb' with
a directory that's *not* an automount point and then moving the 'data'
directory to its final destination worked fine, as did a suggestion
from Andrew Dunstan (on the #postgresql IRC channel) to use a relative
path for the data directory.  Both were successful in avoiding the
issue, but I decided to mention this here in case someone felt it might
be worth looking into to see if the Sun problem can be avoided; I am
going to notify Sun of their bug, I just don't know how long it will
take them to actually resolve it (if they ever do).

While I can fully understand that a code change here may not be desirable,
might some notes in the documentation be useful for those who
might stumble across the problem as well?  Just a suggestion...

I hope I gave sufficient information on the problem, though I'm always
willing to give any clarification needed.  Thank you for your time.


Ken Lareau
elessar@numenor.org


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
"David Parker"
Date:
Coincidentally I JUST NOW built 8.0 on Solaris 9, and ran into the same
problem. As they say, "this used to work".....

We build databases as part of the build of our product, and I'm looking
into what we need to do to upgrade from 7.4.5, and this was the first
thing I ran into. I hadn't gotten as far as truss yet, so thanks Kenneth
for that extra info.

Did initdb previously just assume the -D path existed, and now it is
trying to create the whole path, if necessary?

- DAP

>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kenneth Lareau
>Sent: Thursday, January 27, 2005 5:23 PM
>To: pgsql-hackers@postgresql.org
>Subject: [HACKERS] Strange issue with initdb on 8.0 and
>Solaris automounts
>
>Folks,
>
>I ran into an interesting issue when installing PostgreSQL 8.0
>that I'm not sure how to resolve correctly.  My system is a
>Sun machine (Blade 1000) running Solaris 9, with relatively
>recent patches. After installing 8.0, I went to run the
>'initdb' command and was greeted with the following:
>
>[delirium:postgres] ~
>(11) initdb -D /software/postgresql-8.0.0/data
>The files belonging to this database system will be owned by user "postgres".
>This user must also own the server process.
>
>The database cluster will be initialized with locale C.
>
>creating directory /software/postgresql-8.0.0/data ... initdb:
>could not create directory "/software/postgresql-8.0.0":
>Operation not applicable
>
>
>The error message was a bit confusing, so I decided to run a
>truss on the process to see what might be happening, and this
>is what I came
>across:
>
>[...]
>8802/1:         write(1, " c r e a t i n g   d i r".., 62)      = 62
>8802/1:         umask(0)                                        = 077
>8802/1:         umask(077)                                      = 0
>8802/1:         mkdir("/software", 0777)                        Err#17 EEXIST
>8802/1:         stat64("/software", 0xFFBFC858)                 = 0
>8802/1:         mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
>[...]
>
>
>The last error in that section, ENOSYS, is very strange, as
>the Solaris manpage for 'mkdir' does not mention it as a
>possible error.  One thing to note in this, however, is that
>'/software/postgresql-8.0.0' is not a regular directory, but
>an automount point (which in this case is just a local
>loopback mount).  So the indication is that Solaris seems to
>have a bug not in mkdir, but deeper in their VFS code that's
>causing this seemingly strange issue.
>
>Two workarounds for this problem have been found: running
>'initdb' with a directory that's *not* an automount point and
>then moving the 'data' directory to its final destination
>worked fine, along with a suggestion from Andrew Dunstan (on
>the #postgresql IRC channel) with using a relative path for
>the data directory.  Both were successful in avoiding the
>issue, but I decided to mention this here in case someone felt
>it might be worth looking into to see if the Sun problem can be
>avoided; I am going to notify Sun of their bug, just don't know
>how long it will take them to actually resolve it (if they
>ever do).
>
>While I can fully understand that a code change here may not
>be desirable, might some notes in the documentation be useful
>for those who might stumble across the problem as well?  Just
>a suggestion...
>
>I hope I gave sufficient information on the problem, though
>I'm always willing to give any clarification needed.  Thank
>you for your time.
>
>
>Ken Lareau
>elessar@numenor.org
>


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Tom Lane
Date:
"David Parker" <dparker@tazznetworks.com> writes:
> Did initdb previously just assume the -D path existed, and now it is
> trying to create the whole path, if necessary?

Pre-8.0 it was using mkdir(1), which might possibly contain some weird
workaround for this case on Solaris.

I suppose that manually creating the data directory before running
initdb would also avoid this issue, since the mkdir(2) loop is only
entered if we don't find the directory in existence.
        regards, tom lane


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
"David Parker"
Date:
I tried that, and it just runs into the problem with the first sub dir
it tries to create:

ed9i03:/home/dparker/temp
% initdb -D /home/dparker/temp/testdb
The files belonging to this database system will be owned by user "dparker".
This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /home/dparker/temp/testdb ... ok
creating directory /home/dparker/temp/testdb/global ... initdb: could not create directory "/home/dparker": Operation not applicable
initdb: removing contents of data directory "/home/dparker/temp/testdb"
ed9i03:/home/dparker/temp

truss:

chmod("/home/dparker/temp/testdb", 0700)        = 0
ok
write(1, " o k\n", 3)                           = 3
creating directory /home/dparker/temp/testdb/global ... write(1, " c r e a t i n g   d i r".., 56)      = 56
umask(0)                                        = 077
umask(077)                                      = 0
mkdir("/home", 0777)                            Err#17 EEXIST
xstat(2, "/home", 0x08045C20)                   = 0
mkdir("/home/dparker", 0777)                    Err#89 ENOSYS
umask(077)                                      = 077
fstat64(2, 0x08045000)                          = 0
initdbwrite(2, " i n i t d b", 6)                       = 6
: could not create directory "write(2, " :   c o u l d   n o t  ".., 30) = 30
/home/dparkerwrite(2, " / h o m e / d p a r k e".., 13) = 13
": write(2, " " :  ", 3)                                = 3
Operation not applicablewrite(2, " O p e r a t i o n   n o".., 24) = 24


- DAP

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Thursday, January 27, 2005 6:22 PM
>To: David Parker
>Cc: Kenneth Lareau; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and
>Solaris automounts
>
>"David Parker" <dparker@tazznetworks.com> writes:
>> Did initdb previously just assume the -D path existed, and now it is
>> trying to create the whole path, if necessary?
>
>Pre-8.0 it was using mkdir(1), which might possibly contain
>some weird workaround for this case on Solaris.
>
>I suppose that manually creating the data directory before
>running initdb would also avoid this issue, since the mkdir(2)
>loop is only entered if we don't find the directory in existence.
>
>            regards, tom lane
>


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Kenneth Lareau
Date:
In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>"David Parker" <dparker@tazznetworks.com> writes:
>> Did initdb previously just assume the -D path existed, and now it is
>> trying to create the whole path, if necessary?
>
>Pre-8.0 it was using mkdir(1), which might possibly contain some weird
>workaround for this case on Solaris.
>
>I suppose that manually creating the data directory before running
>initdb would also avoid this issue, since the mkdir(2) loop is only
>entered if we don't find the directory in existence.
>
>            regards, tom lane
>

Actually, creating the 'data' directory first doesn't work either:

[delirium:postgres] ~
(17) mkdir data
[delirium:postgres] ~
(18) initdb -D /software/postgresql-8.0.0/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /software/postgresql-8.0.0/data ... ok
creating directory /software/postgresql-8.0.0/data/global ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable
initdb: removing contents of data directory "/software/postgresql-8.0.0/data"


Since there are subdirectories that need to be created, it still runs into
the problem.  I don't know why the command 'mkdir' doesn't exhibit the
same problem as the function 'mkdir', but running:

  mkdir /software/postgresql-8.0.0

produces the correct error "File exists" on my system.  I suspect the
'mkdir' command probably checks to see if the directory exists first
before trying to create it, which avoids the problem.


Ken Lareau
elessar@numenor.org


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Tom Lane
Date:
Kenneth Lareau <elessar@numenor.org> writes:
> In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>> I suppose that manually creating the data directory before running
>> initdb would also avoid this issue, since the mkdir(2) loop is only
>> entered if we don't find the directory in existence.

> Actually, creating the 'data' directory first doesn't work either:

Good point.

> I don't know why the command 'mkdir' doesn't exhibit the
> same problem as the function 'mkdir', but running:

>    mkdir /software/postgresql-8.0.0

> produces the correct error "File exists" on my system.

Could you truss that and see what it does?  It would be a simple change
in initdb to make it stat before mkdir instead of after, but I'm not
totally convinced that would fix the problem.  If mkdir returns a funny
error code then stat might as well ...
        regards, tom lane


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Kenneth Lareau
Date:
In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>Kenneth Lareau <elessar@numenor.org> writes:
>> In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>>> I suppose that manually creating the data directory before running
>>> initdb would also avoid this issue, since the mkdir(2) loop is only
>>> entered if we don't find the directory in existence.
>
>> Actually, creating the 'data' directory first doesn't work either:
>
>Good point.
>
>> I don't know why the command 'mkdir' doesn't exhibit the
>> same problem as the function 'mkdir', but running:
>
>>    mkdir /software/postgresql-8.0.0
>
>> produces the correct error "File exists" on my system.
>
>Could you truss that and see what it does?  It would be a simple change
>in initdb to make it stat before mkdir instead of after, but I'm not
>totally convinced that would fix the problem.  If mkdir returns a funny
>error code then stat might as well ...
>
>            regards, tom lane
>

Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
on my Solaris 9 system:

10832:  umask(0)                                        = 077
10832:  umask(077)                                      = 0
10832:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
10832:  fstat64(2, 0xFFBFEB78)                          = 0
10832:  write(2, " m k d i r", 5)                       = 5
10832:  write(2, " :  ", 2)                             = 2
10832:  write(2, " c a n n o t   c r e a t".., 24)      = 24
10832:  write(2, " ` / s o f t w a r e / p".., 28)      = 28
10832:  write(2, " :  ", 2)                             = 2
10832:  write(2, " F i l e   e x i s t s", 11)          = 11
10832:  write(2, "\n", 1)                               = 1
10832:  _exit(1)


It's doing the stat after the mkdir attempt it seems, and coming back
with the correct response.  Hmm, maybe I should look at the Solaris 8
code for the mkdir command...


Ken Lareau
elessar@numenor.org


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Andrew Dunstan
Date:

Tom Lane wrote:

>> I don't know why the command 'mkdir' doesn't exhibit the
>> same problem as the function 'mkdir', but running:
>>
>>    mkdir /software/postgresql-8.0.0
>>
>> produces the correct error "File exists" on my system.
>
>Could you truss that and see what it does?  It would be a simple change
>in initdb to make it stat before mkdir instead of after, but I'm not
>totally convinced that would fix the problem.  If mkdir returns a funny
>error code then stat might as well ...

There's also a tiny race condition, which I guess isn't worth worrying 
about.

Returning ENOSYS is pretty bogus ...

cheers

andrew


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Tom Lane
Date:
Kenneth Lareau <elessar@numenor.org> writes:
> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>> Could you truss that and see what it does?

> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
> on my Solaris 9 system:

> 10832:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
> 10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0

> It's doing the stat after the mkdir attempt it seems, and coming back
> with the correct response.  Hmm, maybe I should look at the Solaris 8
> code for the mkdir command...

Well, the important point is that the stat does succeed.  I'm not going
to put in anything as specific as a check for ENOSYS, but it seems
reasonable to try the stat first and mkdir only if stat fails.
I've applied the attached patch.
        regards, tom lane

*** src/bin/initdb/initdb.c.orig    Sat Jan  8 17:51:12 2005
--- src/bin/initdb/initdb.c    Thu Jan 27 19:23:49 2005
***************
*** 476,481 ****
--- 476,484 ----
   * this tries to build all the elements of a path to a directory a la mkdir -p
   * we assume the path is in canonical form, i.e. uses / as the separator
   * we also assume it isn't null.
+  *
+  * note that on failure, the path arg has been modified to show the particular
+  * directory level we had problems with.
   */
  static int
  mkdir_p(char *path, mode_t omode)
***************
*** 544,573 ****
          }
          if (last)
              (void) umask(oumask);
!         if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
          {
!             if (errno == EEXIST || errno == EISDIR)
!             {
!                 if (stat(path, &sb) < 0)
!                 {
!                     retval = 1;
!                     break;
!                 }
!                 else if (!S_ISDIR(sb.st_mode))
!                 {
!                     if (last)
!                         errno = EEXIST;
!                     else
!                         errno = ENOTDIR;
!                     retval = 1;
!                     break;
!                 }
!             }
!             else
              {
                  retval = 1;
                  break;
              }
          }
          if (!last)
              *p = '/';
--- 547,570 ----
          }
          if (last)
              (void) umask(oumask);
! 
!         /* check for pre-existing directory; ok if it's a parent */
!         if (stat(path, &sb) == 0)
          {
!             if (!S_ISDIR(sb.st_mode))
              {
+                 if (last)
+                     errno = EEXIST;
+                 else
+                     errno = ENOTDIR;
                  retval = 1;
                  break;
              }
+         }
+         else if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
+         {
+             retval = 1;
+             break;
          }
          if (!last)
              *p = '/';


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> There's also a tiny race condition, which I guess isn't worth worrying 
> about.

Considering that we're not checking ownership or permissions of the
parent directories, I'd say not.
        regards, tom lane


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
Kenneth Lareau
Date:
In message <22687.1106872653@sss.pgh.pa.us>, Tom Lane writes:
>Kenneth Lareau <elessar@numenor.org> writes:
>> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>>> Could you truss that and see what it does?
>
>> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
>> on my Solaris 9 system:
>
>> 10832:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
>> 10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
>
>> It's doing the stat after the mkdir attempt it seems, and coming back
>> with the correct response.  Hmm, maybe I should look at the Solaris 8
>> code for the mkdir command...
>
>Well, the important point is that the stat does succeed.  I'm not going
>to put in anything as specific as a check for ENOSYS, but it seems
>reasonable to try the stat first and mkdir only if stat fails.
>I've applied the attached patch.
>
>            regards, tom lane


Tom, thank you very much for the patch, it worked like a charm.


Ken Lareau
elessar@numenor.org


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
"David Parker"
Date:
Yes, thanks very much!

- DAP

>-----Original Message-----
>From: pgsql-hackers-owner@postgresql.org
>[mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Kenneth Lareau
>Sent: Thursday, January 27, 2005 8:10 PM
>To: Tom Lane
>Cc: Kenneth Lareau; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and
>Solaris automounts
>
>In message <22687.1106872653@sss.pgh.pa.us>, Tom Lane writes:
>>Kenneth Lareau <elessar@numenor.org> writes:
>>> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>>>> Could you truss that and see what it does?
>>
>>> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
>>> on my Solaris 9 system:
>>
>>> 10832:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
>>> 10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
>>
>>> It's doing the stat after the mkdir attempt it seems, and coming back
>>> with the correct response.  Hmm, maybe I should look at the Solaris 8
>>> code for the mkdir command...
>>
>>Well, the important point is that the stat does succeed.  I'm not going
>>to put in anything as specific as a check for ENOSYS, but it seems
>>reasonable to try the stat first and mkdir only if stat fails.
>>I've applied the attached patch.
>>
>>            regards, tom lane
>
>
>Tom, thank you very much for the patch, it worked like a charm.
>
>
>Ken Lareau
>elessar@numenor.org
>


Re: Strange issue with initdb on 8.0 and Solaris automounts

From
"David Parker"
Date:
Will this make it into 8.1?

>-----Original Message-----
>From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
>Sent: Thursday, January 27, 2005 7:38 PM
>To: Kenneth Lareau
>Cc: David Parker; pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Strange issue with initdb on 8.0 and
>Solaris automounts
>
>Kenneth Lareau <elessar@numenor.org> writes:
>> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>>> Could you truss that and see what it does?
>
>> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
>> on my Solaris 9 system:
>
>> 10832:  mkdir("/software/postgresql-8.0.0", 0777)       Err#89 ENOSYS
>> 10832:  stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
>
>> It's doing the stat after the mkdir attempt it seems, and coming back
>> with the correct response.  Hmm, maybe I should look at the Solaris 8
>> code for the mkdir command...
>
>Well, the important point is that the stat does succeed.  I'm
>not going to put in anything as specific as a check for
>ENOSYS, but it seems reasonable to try the stat first and
>mkdir only if stat fails.
>I've applied the attached patch.
>
>            regards, tom lane
>
>*** src/bin/initdb/initdb.c.orig    Sat Jan  8 17:51:12 2005
>--- src/bin/initdb/initdb.c    Thu Jan 27 19:23:49 2005
>***************
>*** 476,481 ****
>--- 476,484 ----
>   * this tries to build all the elements of a path to a directory a la mkdir -p
>   * we assume the path is in canonical form, i.e. uses / as the separator
>   * we also assume it isn't null.
>+  *
>+  * note that on failure, the path arg has been modified to show the particular
>+  * directory level we had problems with.
>   */
>  static int
>  mkdir_p(char *path, mode_t omode)
>***************
>*** 544,573 ****
>          }
>          if (last)
>              (void) umask(oumask);
>!         if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
>          {
>!             if (errno == EEXIST || errno == EISDIR)
>!             {
>!                 if (stat(path, &sb) < 0)
>!                 {
>!                     retval = 1;
>!                     break;
>!                 }
>!                 else if (!S_ISDIR(sb.st_mode))
>!                 {
>!                     if (last)
>!                         errno = EEXIST;
>!                     else
>!                         errno = ENOTDIR;
>!                     retval = 1;
>!                     break;
>!                 }
>!             }
>!             else
>              {
>                  retval = 1;
>                  break;
>              }
>          }
>          if (!last)
>              *p = '/';
>--- 547,570 ----
>          }
>          if (last)
>              (void) umask(oumask);
>!
>!         /* check for pre-existing directory; ok if it's a parent */
>!         if (stat(path, &sb) == 0)
>          {
>!             if (!S_ISDIR(sb.st_mode))
>              {
>+                 if (last)
>+                     errno = EEXIST;
>+                 else
>+                     errno = ENOTDIR;
>                  retval = 1;
>                  break;
>              }
>+         }
>+         else if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
>+         {
>+             retval = 1;
>+             break;
>          }
>          if (!last)
>              *p = '/';
>