fix pg_mkdir_p to tolerate concurrent directory creation
While working on the pytest stuff, I found this issue when making it
work on Windows, but the issue can exist everywhere. pg_mkdir_p can fail
if there is a concurrent directory creation. This patch makes it
tolerant of concurrent directory creation. To make it more robust it
stops using stat() on Windows and instead uses GetFileAttributes().
I think this makes the function actually meet the contract set out in
the file's header comment:
* pg_mkdir_p --- create a directory and, if necessary, parent directories
*
* This is equivalent to "mkdir -p" except we don't complain if the target
* directory already exists.
Thanks to Bryan Green for help with the Windows part.
There might be a few call sites that actually work around this bug that
we can clean up.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments:
0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-directory-c.patchtext/x-patch; charset=UTF-8; name=0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-directory-c.patchDownload+50-3
Andrew Dunstan <andrew@dunslane.net> writes:
While working on the pytest stuff, I found this issue when making it
work on Windows, but the issue can exist everywhere. pg_mkdir_p can fail
if there is a concurrent directory creation.
This bit:
+ if (errno != EEXIST || stat(path, &sb) != 0 || !S_ISDIR(sb.st_mode))
+ {
+ retval = -1;
+ break;
+ }
looks like it could corrupt the reported errno, ie stat() could
overwrite what mkdir() reported.
regards, tom lane
On 2026-06-18 Th 10:24 AM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
While working on the pytest stuff, I found this issue when making it
work on Windows, but the issue can exist everywhere. pg_mkdir_p can fail
if there is a concurrent directory creation.This bit:
+ if (errno != EEXIST || stat(path, &sb) != 0 || !S_ISDIR(sb.st_mode)) + { + retval = -1; + break; + }looks like it could corrupt the reported errno, ie stat() could
overwrite what mkdir() reported.
Oh, good catch, thanks! Here's a revised patch.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Attachments:
v2-0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-director.patchtext/x-patch; charset=UTF-8; name=v2-0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-director.patchDownload+53-3
Andrew Dunstan <andrew@dunslane.net> writes:
Oh, good catch, thanks! Here's a revised patch.
The Unix side of this LGTM. I have no idea about the Windows
half (... but are we sure GetFileAttributes() can't change errno?)
regards, tom lane
On 2026-06-18 Th 11:43 AM, Tom Lane wrote:
Andrew Dunstan<andrew@dunslane.net> writes:
Oh, good catch, thanks! Here's a revised patch.
The Unix side of this LGTM. I have no idea about the Windows
half (... but are we sure GetFileAttributes() can't change errno?)
I'm told it does not. Google says:
GetFileAttributes does not set errno. Because it is a Win32 API function
rather than a standard C library function, it sets the thread-local
Last-Error Code instead.
|
|
|cheers|
|
|
|andrew|
--
Andrew Dunstan
EDB:https://www.enterprisedb.com
On 2026-Jun-18, Andrew Dunstan wrote:
While working on the pytest stuff, I found this issue when making it work on
Windows, but the issue can exist everywhere. pg_mkdir_p can fail if there is
a concurrent directory creation. This patch makes it tolerant of concurrent
directory creation.
Hmm, should the 'break's in the loop be 'continue's? If you're creating
path a/b/c, and a concurrent process beats you but is only attempting to
do a/b, then with the break you'll fail to create the final c.
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Bob [Floyd] used to say that he was planning to get a Ph.D. by the "green
stamp method," namely by saving envelopes addressed to him as 'Dr. Floyd'.
After collecting 500 such letters, he mused, a university somewhere in
Arizona would probably grant him a degree. (Don Knuth)
=?utf-8?Q?=C3=81lvaro?= Herrera <alvherre@kurilemu.de> writes:
Hmm, should the 'break's in the loop be 'continue's? If you're creating
path a/b/c, and a concurrent process beats you but is only attempting to
do a/b, then with the break you'll fail to create the final c.
Maybe I'm confused, but I think it's right as-is. The "break"s are
places where we're giving up and reporting a hard error. If we want
to continue to the next directory level, we just fall through ---
and doing that with "continue" would be wrong because it'd miss the
step at the bottom of the loop:
if (!last)
*p = '/';
regards, tom lane
Actually ... given this change, why don't we drop the initial stat()
call? Just try mkdir(), and move on if it succeeds. If not, but
there's already a directory there, we're good. That eliminates the
race condition without duplicating code.
regards, tom lane
Attachments:
v3-0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-director.patchtext/x-diff; charset=us-ascii; name*0=v3-0001-Make-pg_mkdir_p-tolerant-of-a-concurrent-director.p; name*1=atchDownload+34-13
On 2026-06-19 Fr 11:07 AM, Tom Lane wrote:
Actually ... given this change, why don't we drop the initial stat()
call? Just try mkdir(), and move on if it succeeds. If not, but
there's already a directory there, we're good. That eliminates the
race condition without duplicating code.
LGTM, thanks!
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
Andrew Dunstan <andrew@dunslane.net> writes:
LGTM, thanks!
OK, pushed.
regards, tom lane
On 2026-06-19 Fr 12:52 PM, Tom Lane wrote:
Andrew Dunstan <andrew@dunslane.net> writes:
LGTM, thanks!
OK, pushed.
Thanks!
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com