Strange issue with initdb on 8.0 and Solaris automounts

Started by Kenneth Lareau, almost 21 years ago, 13 messages
#1 Kenneth Lareau
elessar@numenor.org

Folks,

I ran into an interesting issue when installing PostgreSQL 8.0 that I'm
not sure how to resolve correctly. My system is a Sun machine (Blade
1000) running Solaris 9, with relatively recent patches. After installing
8.0, I went to run the 'initdb' command and was greeted with the
following:

[delirium:postgres] ~
(11) initdb -D /software/postgresql-8.0.0/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

creating directory /software/postgresql-8.0.0/data ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable

The error message was a bit confusing, so I decided to run a truss on
the process to see what might be happening, and this is what I came
across:

[...]
8802/1: write(1, " c r e a t i n g d i r".., 62) = 62
8802/1: umask(0) = 077
8802/1: umask(077) = 0
8802/1: mkdir("/software", 0777) Err#17 EEXIST
8802/1: stat64("/software", 0xFFBFC858) = 0
8802/1: mkdir("/software/postgresql-8.0.0", 0777) Err#89 ENOSYS
[...]

The last error in that section, ENOSYS, is very strange, as the Solaris
manpage for 'mkdir' does not mention it as a possible error. One thing
to note in this, however, is that '/software/postgresql-8.0.0' is not a
regular directory, but an automount point (which in this case is just a
local loopback mount). So the indication is that Solaris seems to have
a bug not in mkdir, but deeper in their VFS code that's causing this
seemingly strange issue.

Two workarounds for this problem have been found. Running 'initdb' with
a directory that's *not* an automount point and then moving the 'data'
directory to its final destination worked fine, as did a suggestion
from Andrew Dunstan (on the #postgresql IRC channel) to use a relative
path for the data directory. Both avoided the issue, but I decided to
mention this here in case someone felt it might be worth looking into
whether the Sun problem can be avoided; I am going to notify Sun of
their bug, but I don't know how long it will take them to actually
resolve it (if they ever do).

While I can fully understand that a code change here may not be desirable,
might some notes in the documentation be useful for those who
might stumble across the problem as well? Just a suggestion...

I hope I gave sufficient information on the problem, though I'm always
willing to give any clarification needed. Thank you for your time.

Ken Lareau
elessar@numenor.org

#2 David Parker
dparker@tazznetworks.com
In reply to: Kenneth Lareau (#1)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Coincidentally I JUST NOW built 8.0 on Solaris 9, and ran into the same
problem. As they say, "this used to work".....

We build databases as part of the build of our product, and I'm looking
into what we need to do to upgrade from 7.4.5, and this was the first
thing I ran into. I hadn't gotten as far as truss yet, so thanks Kenneth
for that extra info.

Did initdb previously just assume the -D path existed, and now it is
trying to create the whole path, if necessary?

- DAP

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Parker (#2)
Re: Strange issue with initdb on 8.0 and Solaris automounts

"David Parker" <dparker@tazznetworks.com> writes:

> Did initdb previously just assume the -D path existed, and now it is
> trying to create the whole path, if necessary?

Pre-8.0 it was using mkdir(1), which might possibly contain some weird
workaround for this case on Solaris.

I suppose that manually creating the data directory before running
initdb would also avoid this issue, since the mkdir(2) loop is only
entered if we don't find the directory in existence.

regards, tom lane

#4 David Parker
dparker@tazznetworks.com
In reply to: Tom Lane (#3)
Re: Strange issue with initdb on 8.0 and Solaris automounts

I tried that, and it just runs into the problem with the first sub dir
it tries to create:

ed9i03:/home/dparker/temp
% initdb -D /home/dparker/temp/testdb
The files belonging to this database system will be owned by user "dparker".
This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /home/dparker/temp/testdb ... ok
creating directory /home/dparker/temp/testdb/global ... initdb: could not create directory "/home/dparker": Operation not applicable
initdb: removing contents of data directory "/home/dparker/temp/testdb"
ed9i03:/home/dparker/temp

truss:

chmod("/home/dparker/temp/testdb", 0700) = 0
ok
write(1, " o k\n", 3) = 3
creating directory /home/dparker/temp/testdb/global ... write(1, " c r e a t i n g d i r".., 56) = 56
umask(0) = 077
umask(077) = 0
mkdir("/home", 0777) Err#17 EEXIST
xstat(2, "/home", 0x08045C20) = 0
mkdir("/home/dparker", 0777) Err#89 ENOSYS
umask(077) = 077
fstat64(2, 0x08045000) = 0
initdbwrite(2, " i n i t d b", 6) = 6
: could not create directory "write(2, " : c o u l d n o t ".., 30) = 30
/home/dparkerwrite(2, " / h o m e / d p a r k e".., 13) = 13
": write(2, " " : ", 3) = 3
Operation not applicablewrite(2, " O p e r a t i o n n o".., 24) = 24

- DAP

#5 Kenneth Lareau
elessar@numenor.org
In reply to: Tom Lane (#3)
Re: Strange issue with initdb on 8.0 and Solaris automounts

In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:

> "David Parker" <dparker@tazznetworks.com> writes:
>
>> Did initdb previously just assume the -D path existed, and now it is
>> trying to create the whole path, if necessary?
>
> Pre-8.0 it was using mkdir(1), which might possibly contain some weird
> workaround for this case on Solaris.
>
> I suppose that manually creating the data directory before running
> initdb would also avoid this issue, since the mkdir(2) loop is only
> entered if we don't find the directory in existence.
>
> regards, tom lane

Actually, creating the 'data' directory first doesn't work either:

[delirium:postgres] ~
(17) mkdir data
[delirium:postgres] ~
(18) initdb -D /software/postgresql-8.0.0/data
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale C.

fixing permissions on existing directory /software/postgresql-8.0.0/data ... ok
creating directory /software/postgresql-8.0.0/data/global ... initdb: could not create directory "/software/postgresql-8.0.0": Operation not applicable
initdb: removing contents of data directory "/software/postgresql-8.0.0/data"

Since there are subdirectories that need to be created, it still runs into
the problem. I don't know why the command 'mkdir' doesn't exhibit the
same problem as the function 'mkdir', but running:

mkdir /software/postgresql-8.0.0

produces the correct error "File exists" on my system. I suspect the
'mkdir' command probably checks to see if the directory exists first
before trying to create it, which avoids the problem.

Ken Lareau
elessar@numenor.org

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kenneth Lareau (#5)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Kenneth Lareau <elessar@numenor.org> writes:

> In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>
>> I suppose that manually creating the data directory before running
>> initdb would also avoid this issue, since the mkdir(2) loop is only
>> entered if we don't find the directory in existence.
>
> Actually, creating the 'data' directory first doesn't work either:

Good point.

> I don't know why the command 'mkdir' doesn't exhibit the
> same problem as the function 'mkdir', but running:
>
> mkdir /software/postgresql-8.0.0
>
> produces the correct error "File exists" on my system.

Could you truss that and see what it does? It would be a simple change
in initdb to make it stat before mkdir instead of after, but I'm not
totally convinced that would fix the problem. If mkdir returns a funny
error code then stat might as well ...

regards, tom lane

#7 Kenneth Lareau
elessar@numenor.org
In reply to: Tom Lane (#6)
Re: Strange issue with initdb on 8.0 and Solaris automounts

In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:

> Kenneth Lareau <elessar@numenor.org> writes:
>
>> In message <21723.1106868138@sss.pgh.pa.us>, Tom Lane writes:
>>
>>> I suppose that manually creating the data directory before running
>>> initdb would also avoid this issue, since the mkdir(2) loop is only
>>> entered if we don't find the directory in existence.
>>
>> Actually, creating the 'data' directory first doesn't work either:
>
> Good point.
>
>> I don't know why the command 'mkdir' doesn't exhibit the
>> same problem as the function 'mkdir', but running:
>>
>> mkdir /software/postgresql-8.0.0
>>
>> produces the correct error "File exists" on my system.
>
> Could you truss that and see what it does? It would be a simple change
> in initdb to make it stat before mkdir instead of after, but I'm not
> totally convinced that would fix the problem. If mkdir returns a funny
> error code then stat might as well ...
>
> regards, tom lane

Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
on my Solaris 9 system:

10832: umask(0) = 077
10832: umask(077) = 0
10832: mkdir("/software/postgresql-8.0.0", 0777) Err#89 ENOSYS
10832: stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
10832: fstat64(2, 0xFFBFEB78) = 0
10832: write(2, " m k d i r", 5) = 5
10832: write(2, " : ", 2) = 2
10832: write(2, " c a n n o t c r e a t".., 24) = 24
10832: write(2, " ` / s o f t w a r e / p".., 28) = 28
10832: write(2, " : ", 2) = 2
10832: write(2, " F i l e e x i s t s", 11) = 11
10832: write(2, "\n", 1) = 1
10832: _exit(1)

It's doing the stat after the mkdir attempt it seems, and coming back
with the correct response. Hmm, maybe I should look at the Solaris 8
code for the mkdir command...

Ken Lareau
elessar@numenor.org

#8 Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#6)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Tom Lane wrote:

>> I don't know why the command 'mkdir' doesn't exhibit the
>> same problem as the function 'mkdir', but running:
>>
>> mkdir /software/postgresql-8.0.0
>>
>> produces the correct error "File exists" on my system.
>
> Could you truss that and see what it does? It would be a simple change
> in initdb to make it stat before mkdir instead of after, but I'm not
> totally convinced that would fix the problem. If mkdir returns a funny
> error code then stat might as well ...

There's also a tiny race condition, which I guess isn't worth worrying
about.

Returning ENOSYS is pretty bogus ...

cheers

andrew

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kenneth Lareau (#7)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Kenneth Lareau <elessar@numenor.org> writes:

> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>
>> Could you truss that and see what it does?
>
> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
> on my Solaris 9 system:
>
> 10832: mkdir("/software/postgresql-8.0.0", 0777) Err#89 ENOSYS
> 10832: stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
>
> It's doing the stat after the mkdir attempt it seems, and coming back
> with the correct response. Hmm, maybe I should look at the Solaris 8
> code for the mkdir command...

Well, the important point is that the stat does succeed. I'm not going
to put in anything as specific as a check for ENOSYS, but it seems
reasonable to try the stat first and mkdir only if stat fails.
I've applied the attached patch.

regards, tom lane

*** src/bin/initdb/initdb.c.orig	Sat Jan  8 17:51:12 2005
--- src/bin/initdb/initdb.c	Thu Jan 27 19:23:49 2005
***************
*** 476,481 ****
--- 476,484 ----
   * this tries to build all the elements of a path to a directory a la mkdir -p
   * we assume the path is in canonical form, i.e. uses / as the separator
   * we also assume it isn't null.
+  *
+  * note that on failure, the path arg has been modified to show the particular
+  * directory level we had problems with.
   */
  static int
  mkdir_p(char *path, mode_t omode)
***************
*** 544,573 ****
  		}
  		if (last)
  			(void) umask(oumask);
! 		if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
  		{
! 			if (errno == EEXIST || errno == EISDIR)
! 			{
! 				if (stat(path, &sb) < 0)
! 				{
! 					retval = 1;
! 					break;
! 				}
! 				else if (!S_ISDIR(sb.st_mode))
! 				{
! 					if (last)
! 						errno = EEXIST;
! 					else
! 						errno = ENOTDIR;
! 					retval = 1;
! 					break;
! 				}
! 			}
! 			else
  			{
  				retval = 1;
  				break;
  			}
  		}
  		if (!last)
  			*p = '/';
--- 547,570 ----
  		}
  		if (last)
  			(void) umask(oumask);
! 
! 		/* check for pre-existing directory; ok if it's a parent */
! 		if (stat(path, &sb) == 0)
  		{
! 			if (!S_ISDIR(sb.st_mode))
  			{
+ 				if (last)
+ 					errno = EEXIST;
+ 				else
+ 					errno = ENOTDIR;
  				retval = 1;
  				break;
  			}
+ 		}
+ 		else if (mkdir(path, last ? omode : S_IRWXU | S_IRWXG | S_IRWXO) < 0)
+ 		{
+ 			retval = 1;
+ 			break;
  		}
  		if (!last)
  			*p = '/';

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#8)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Andrew Dunstan <andrew@dunslane.net> writes:

> There's also a tiny race condition, which I guess isn't worth worrying
> about.

Considering that we're not checking ownership or permissions of the
parent directories, I'd say not.

regards, tom lane

#11 Kenneth Lareau
elessar@numenor.org
In reply to: Tom Lane (#9)
Re: Strange issue with initdb on 8.0 and Solaris automounts

In message <22687.1106872653@sss.pgh.pa.us>, Tom Lane writes:

> Kenneth Lareau <elessar@numenor.org> writes:
>
>> In message <22095.1106869848@sss.pgh.pa.us>, Tom Lane writes:
>>
>>> Could you truss that and see what it does?
>>
>> Here's the relevant truss output from 'mkdir /software/postgresql-8.0.0'
>> on my Solaris 9 system:
>>
>> 10832: mkdir("/software/postgresql-8.0.0", 0777) Err#89 ENOSYS
>> 10832: stat64("/software/postgresql-8.0.0", 0xFFBFFA38) = 0
>>
>> It's doing the stat after the mkdir attempt it seems, and coming back
>> with the correct response. Hmm, maybe I should look at the Solaris 8
>> code for the mkdir command...
>
> Well, the important point is that the stat does succeed. I'm not going
> to put in anything as specific as a check for ENOSYS, but it seems
> reasonable to try the stat first and mkdir only if stat fails.
> I've applied the attached patch.
>
> regards, tom lane

Tom, thank you very much for the patch, it worked like a charm.

Ken Lareau
elessar@numenor.org

#12 David Parker
dparker@tazznetworks.com
In reply to: Kenneth Lareau (#11)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Yes, thanks very much!

- DAP

#13 David Parker
dparker@tazznetworks.com
In reply to: David Parker (#12)
Re: Strange issue with initdb on 8.0 and Solaris automounts

Will this make it into 8.1?
