Performance patch for Win32

Started by Mark Dilgerover 13 years ago7 messages
#1Mark Dilger
markdilger@yahoo.com
1 attachment(s)

This is a patch against src/backend/storage/file/fd.c
taken from 9.2beta1.

This patch is submitted for review and comments, not
for application to the code base.  *WIP*

This patch addresses a performance problem stemming
from the use of FindFirstFile() and FindNextFile() to
iterate over a directory in Windows.  These two functions
are used in the port of readdir for Windows.  Unfortunately,
unlike Linux, these Windows directory iteration functions
return the equivalent of a stat() call for each file iterated.
Hence, if a directory contains many tens of thousands of
files, the iteration can take several minutes to complete.

In RemovePgTempFile(), multiple directories are iterated
and all files found which match the pattern for a temporary
file are unlinked.  The pattern matching is performed
*outside* the directory iteration.  This patch uses a file
pattern like "t*" to match all temporary files, rather than
iterating over all files in the directory, thus pushing the
pattern match *inside* the directory iteration and gaining
significant startup time performance.

This is not theoretical.  The real-world database where I
found this problem is on a Windows 2003 server running
PostgreSQL 9.1.3 and having 56,000 tables.  I was able
to duplicate the problem on a Windows 2008 server.

To reproduce, you will need a database on Windows with
tens of thousands of tables and a recent version of
PostgreSQL.  Reboot the Windows server so that the
filesystem is guaranteed not to be in the filesystem cache.
Start postgres using pg_ctl, and note that it takes several
minutes to start.  After applying the patch and re-running
these steps, the server should not take so long to start.

I have the following reservations about my design, and
solicit comments and suggestions for improvement:

1)  The changes I made in fd.c pass a pattern rather than
a name into ReadDir *knowing the details* of how ReadDir
on Windows will use the port of readdir in src/port/dirent.c
and that in that code FindFirstFile() and FindNextFile() will
be called.  This knowledge about the inner workings of
the port of readdir() is not appropriate inside fd.c, IMHO.

2) I used a fair amount of #ifdef WIN32 to avoid adding
unnecessary variables or branches to the non-windows
code.  Since this code is probably not on the critical path
performancewise, this may be overkill.

3) The pattern passed to ReadDir of the form "t*" should
probably be something closer to (in pcre form):
m/^t\d+_\d+/, rather than m/^t.*/.  I am not sufficiently
familiar with how Windows interprets file patterns, and
whether it interprets them differently from one version of
Windows to another, to be comfortable making a more
precise pattern.

4) Other places in the PostgreSQL sources where directory
iteration is needed should probably use a pattern if possible
when running on Windows.  Thus, it might make more
sense to have a version of ReadDir that explicitly takes a
pattern, and use that version of ReadDir elsewhere in the
codebase.

Attachments:

fd.diffsapplication/octet-stream; name=fd.diffsDownload
*** postgresql-9.2beta1/src/backend/storage/file/fd.c	2012-05-10 15:35:09.000000000 -0700
--- postgresql-9.2beta1/src/backend/storage/file/fd.c.modified	2012-05-29 13:34:30.000000000 -0700
***************
*** 2021,2038 ****
  	struct dirent *temp_de;
  	char		rm_path[MAXPGPATH];
  
  	temp_dir = AllocateDir(tmpdirname);
  	if (temp_dir == NULL)
  	{
  		/* anything except ENOENT is fishy */
  		if (errno != ENOENT)
  			elog(LOG,
  				 "could not open temporary-files directory \"%s\": %m",
! 				 tmpdirname);
  		return;
  	}
  
  	while ((temp_de = ReadDir(temp_dir, tmpdirname)) != NULL)
  	{
  		if (strcmp(temp_de->d_name, ".") == 0 ||
  			strcmp(temp_de->d_name, "..") == 0)
--- 2021,2060 ----
  	struct dirent *temp_de;
  	char		rm_path[MAXPGPATH];
  
+ #ifdef WIN32
+ 	/* On Win32, iterating over all files in a directory that match a pattern
+ 	 * can be much faster than iterating over all files in the directory, if
+ 	 * the file names that match the pattern are a small subset of all files.
+ 	 */
+ 	char		tempdirpattern[MAXPGPATH];
+ 
+ 	snprintf(tmpdirpattern, MAXPGPATH, "%s/%s*", tmpdirname, PG_TEMP_FILE_PREFIX);
+ 	temp_dir = AllocateDir(tmpdirpattern);
+ #else
  	temp_dir = AllocateDir(tmpdirname);
+ #endif
+ 
  	if (temp_dir == NULL)
  	{
  		/* anything except ENOENT is fishy */
  		if (errno != ENOENT)
  			elog(LOG,
+ #ifdef WIN32
+ 				 "could not open temporary-files directory using pattern \"%s\": %m",
+ 				 tmpdirpattern
+ #else
  				 "could not open temporary-files directory \"%s\": %m",
! 				 tmpdirname
! #endif
! 				 );
  		return;
  	}
  
+ #ifdef WIN32
+ 	while ((temp_de = ReadDir(temp_dir, tmpdirpattern)) != NULL)
+ #else
  	while ((temp_de = ReadDir(temp_dir, tmpdirname)) != NULL)
+ #endif
  	{
  		if (strcmp(temp_de->d_name, ".") == 0 ||
  			strcmp(temp_de->d_name, "..") == 0)
***************
*** 2103,2119 ****
  	struct dirent *de;
  	char		rm_path[MAXPGPATH];
  
  	dbspace_dir = AllocateDir(dbspacedirname);
  	if (dbspace_dir == NULL)
  	{
  		/* we just saw this directory, so it really ought to be there */
  		elog(LOG,
  			 "could not open dbspace directory \"%s\": %m",
! 			 dbspacedirname);
  		return;
  	}
  
  	while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
  	{
  		if (!looks_like_temp_rel_name(de->d_name))
  			continue;
--- 2125,2158 ----
  	struct dirent *de;
  	char		rm_path[MAXPGPATH];
  
+ #ifdef WIN32
+ 	char		dbspacedirpattern[MAXPGPATH];
+ 
+ 	snprintf(dbspacedirpattern, MAXPGPATH, "%s/t*", dbspacedirname);
+ 	dbspace_dir = AllocateDir(dbspacedirpattern);
+ #else
  	dbspace_dir = AllocateDir(dbspacedirname);
+ #endif
  	if (dbspace_dir == NULL)
  	{
  		/* we just saw this directory, so it really ought to be there */
  		elog(LOG,
+ #ifdef WIN32
+ 			 "could not open dbspace directory using pattern \"%s\": %m",
+ 			 dbspacedirpattern
+ #else
  			 "could not open dbspace directory \"%s\": %m",
! 			 dbspacedirname
! #endif
! 			 );
  		return;
  	}
  
+ #ifdef WIN32
+ 	while ((de = ReadDir(dbspace_dir, dbspacedirpattern)) != NULL)
+ #else
  	while ((de = ReadDir(dbspace_dir, dbspacedirname)) != NULL)
+ #endif
  	{
  		if (!looks_like_temp_rel_name(de->d_name))
  			continue;
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#1)
Re: Performance patch for Win32

Mark Dilger <markdilger@yahoo.com> writes:

4) Other places in the PostgreSQL sources where directory
iteration is needed should probably use a pattern if possible
when running on Windows.� Thus, it might make more
sense to have a version of ReadDir that explicitly takes a
pattern, and use that version of ReadDir elsewhere in the
codebase.

Yeah, I think a separate argument passed to an AllocateDir variant
would be a less ugly way to deal with this. For example, in place
of your first #ifdef block just write

temp_dir = AllocateDirWithFilePattern(tmpdirname,
PG_TEMP_FILE_PREFIX "*");

What is not immediately obvious to me is whether we should try to make
the pattern argument do something useful on non-Windows platforms
(and thus allow removal of the ad-hoc pattern match code in the loops
where this is used); versus just treating it as a no-op on non-Windows.
If we did that, we'd have to consider that Windows gets to define what
the pattern language is and try to emulate that; which is likely to be
expensive enough to not be a win, not to mention that non-Windows
developers might not be terribly familiar with the finer points of the
pattern language.

I'm kind of inclined to consider that we should just treat the pattern
option as a Windows-specific wart and keep the ad-hoc matching code as
it is.

regards, tom lane

#3Mark Dilger
markdilger@yahoo.com
In reply to: Tom Lane (#2)
Re: Performance patch for Win32

I am hesitant to write a function like AllocateDirWithFilePattern
if the pattern is simply ignored on non-Windows.  In my patch,
the pattern underspecified the files, and the ad-hoc matching code
applied to all the returned files tightened that up.  But a person
could just as well overspecify the pattern and then they would get
different behavior on Windows vs. non-Windows, with fewer
files returned by FindNextFile() than would have matched the
ad-hoc pattern.

In this particular instance, AllocateDirWithFilePrefix would
work, and could be applied on all platforms, using something
like:

    temp_dir = AllocateDirWithFilePrefix(tmpdirname,
                                        PG_TEMP_FILE_PREFIX);

and that could be converted to PG_TEMP_FILE_PREFIX "*"
on Windows, and on non-Windows the function itself could
apply the prefix check easily enough.

Is AllocateDirWithFilePrefix overly specific?  Clearly we
could also take a Suffix argument, but if we go too far down
this road we just reinvent regular expressions....

mark

________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Mark Dilger <markdilger@yahoo.com>
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Sent: Tuesday, May 29, 2012 2:30 PM
Subject: Re: [HACKERS] Performance patch for Win32

Mark Dilger <markdilger@yahoo.com> writes:

4) Other places in the PostgreSQL sources where directory
iteration is needed should probably use a pattern if possible
when running on Windows.  Thus, it might make more
sense to have a version of ReadDir that explicitly takes a
pattern, and use that version of ReadDir elsewhere in the
codebase.

Yeah, I think a separate argument passed to an AllocateDir variant
would be a less ugly way to deal with this.  For example, in place
of your first #ifdef block just write

        temp_dir = AllocateDirWithFilePattern(tmpdirname,
                                              PG_TEMP_FILE_PREFIX "*");

What is not immediately obvious to me is whether we should try to make
the pattern argument do something useful on non-Windows platforms
(and thus allow removal of the ad-hoc pattern match code in the loops
where this is used); versus just treating it as a no-op on non-Windows.
If we did that, we'd have to consider that Windows gets to define what
the pattern language is and try to emulate that; which is likely to be
expensive enough to not be a win, not to mention that non-Windows
developers might not be terribly familiar with the finer points of the
pattern language.

I'm kind of inclined to consider that we should just treat the pattern
option as a Windows-specific wart and keep the ad-hoc matching code as
it is.

            regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mark Dilger (#3)
Re: Performance patch for Win32

Mark Dilger <markdilger@yahoo.com> writes:

I am hesitant to write a function like AllocateDirWithFilePattern
if the pattern is simply ignored on non-Windows.� In my patch,
the pattern underspecified the files, and the ad-hoc matching code
applied to all the returned files tightened that up.� But a person
could just as well overspecify the pattern and then they would get
different behavior on Windows vs. non-Windows, with fewer
files returned by FindNextFile() than would have matched the
ad-hoc pattern.

Well, if you're imagining that we wouldn't need to test carefully on
both Windows and non-Windows, I think that's a pipe dream. As an
example, your proposal of AllocateDirWithFilePrefix would only work
consistently across platforms if the prefix didn't contain anything
that Windows thought was a pattern metacharacter. (This might never
come up, but I'm not too sure what the metacharacters are on Windows.)

Having said that, I have nothing particularly against the idea of
specifying a prefix rather than an arbitrary pattern. I'm just
saying it'll still need testing. Also, I wonder how many of the
potential stat-equivalent operations we'll be unable to optimize
away with the more restricted definition. Using a tighter pattern
on Windows seems basically free (modulo testing) if we accept that
it's Windows-only.

regards, tom lane

#5Mark Dilger
markdilger@yahoo.com
In reply to: Tom Lane (#4)
Re: Performance patch for Win32

I was imagining that this would be a trap for linux developers
who saw nothing wrong with their code until it made it to the
build/test farm.  That's pretty far down the development
process.  Of course, it is also a trap in the other direction, for
Windows developers who use the pattern but do not include
anything equivalent for the non-Windows execution path.

On the whole, however, your argument in favor of tighter
patterns might be more convincing than my argument in favor
of catching bugs sooner.

I will start implementing your suggestion for patch v2.

________________________________
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Mark Dilger <markdilger@yahoo.com>
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Sent: Tuesday, May 29, 2012 3:42 PM
Subject: Re: [HACKERS] Performance patch for Win32

Mark Dilger <markdilger@yahoo.com> writes:

I am hesitant to write a function like AllocateDirWithFilePattern
if the pattern is simply ignored on non-Windows.  In my patch,
the pattern underspecified the files, and the ad-hoc matching code
applied to all the returned files tightened that up.  But a person
could just as well overspecify the pattern and then they would get
different behavior on Windows vs. non-Windows, with fewer
files returned by FindNextFile() than would have matched the
ad-hoc pattern.

Well, if you're imagining that we wouldn't need to test carefully on
both Windows and non-Windows, I think that's a pipe dream.  As an
example, your proposal of AllocateDirWithFilePrefix would only work
consistently across platforms if the prefix didn't contain anything
that Windows thought was a pattern metacharacter.  (This might never
come up, but I'm not too sure what the metacharacters are on Windows.)

Having said that, I have nothing particularly against the idea of
specifying a prefix rather than an arbitrary pattern.  I'm just
saying it'll still need testing.  Also, I wonder how many of the
potential stat-equivalent operations we'll be unable to optimize
away with the more restricted definition.  Using a tighter pattern
on Windows seems basically free (modulo testing) if we accept that
it's Windows-only.

            regards, tom lane

#6Bruce Momjian
bruce@momjian.us
In reply to: Mark Dilger (#5)
Re: Performance patch for Win32

On Tue, May 29, 2012 at 03:54:59PM -0700, Mark Dilger wrote:

I was imagining that this would be a trap for linux developers
who saw nothing wrong with their code until it made it to the
build/test farm. That's pretty far down the development
process. Of course, it is also a trap in the other direction, for
Windows developers who use the pattern but do not include
anything equivalent for the non-Windows execution path.

On the whole, however, your argument in favor of tighter
patterns might be more convincing than my argument in favor
of catching bugs sooner.

I will start implementing your suggestion for patch v2.

Any progress on this?

---------------------------------------------------------------------------

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
From: Tom Lane <tgl@sss.pgh.pa.us>
To: Mark Dilger <markdilger@yahoo.com>
Cc: "pgsql-hackers@postgresql.org" <pgsql-hackers@postgresql.org>
Sent: Tuesday, May 29, 2012 3:42 PM
Subject: Re: [HACKERS] Performance patch for Win32

Mark Dilger <markdilger@yahoo.com> writes:

I am hesitant to write a function like AllocateDirWithFilePattern
if the pattern is simply ignored on non-Windows. In my patch,
the pattern underspecified the files, and the ad-hoc matching code
applied to all the returned files tightened that up. But a person
could just as well overspecify the pattern and then they would get
different behavior on Windows vs. non-Windows, with fewer
files returned by FindNextFile() than would have matched the
ad-hoc pattern.

Well, if you're imagining that we wouldn't need to test carefully on
both Windows and non-Windows, I think that's a pipe dream. As an
example, your proposal of AllocateDirWithFilePrefix would only work
consistently across platforms if the prefix didn't contain anything
that Windows thought was a pattern metacharacter. (This might never
come up, but I'm not too sure what the metacharacters are on Windows.)

Having said that, I have nothing particularly against the idea of
specifying a prefix rather than an arbitrary pattern. I'm just
saying it'll still need testing. Also, I wonder how many of the
potential stat-equivalent operations we'll be unable to optimize
away with the more restricted definition. Using a tighter pattern
on Windows seems basically free (modulo testing) if we accept that
it's Windows-only.

regards, tom lane

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

#7Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#6)
Re: Performance patch for Win32

On Thu, Aug 30, 2012 at 04:37:37PM -0400, Bruce Momjian wrote:

On Tue, May 29, 2012 at 03:54:59PM -0700, Mark Dilger wrote:

I was imagining that this would be a trap for linux developers
who saw nothing wrong with their code until it made it to the
build/test farm. That's pretty far down the development
process. Of course, it is also a trap in the other direction, for
Windows developers who use the pattern but do not include
anything equivalent for the non-Windows execution path.

On the whole, however, your argument in favor of tighter
patterns might be more convincing than my argument in favor
of catching bugs sooner.

I will start implementing your suggestion for patch v2.

Any progress on this?

I have added this to the TODO list:

Reduce file statistics overhead on directory reads

/messages/by-id/1338325561.82125.YahooMailNeo@web39304.mail.mud.yahoo.com

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ It's impossible for everything to be true. +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers