Win32 CHECK_FOR_INTERRUPTS() performance tweak

Started by Qingqing Zhouover 20 years ago79 messageshackers
Jump to latest
#1Qingqing Zhou
zhouqq@cs.toronto.edu

This patch improves the win32 CHECK_FOR_INTERRUPTS() performance by
testing if any unblocked signals are queued before check
pgwin32_signal_event. This avoids an unnecessary system call.

Regards,
Qingqing

---

Index: backend/port/win32/signal.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/port/win32/signal.c,v
retrieving revision 1.12
diff -u -r1.12 signal.c
--- backend/port/win32/signal.c	15 Oct 2005 02:49:23 -0000	1.12
+++ backend/port/win32/signal.c	21 Oct 2005 05:32:52 -0000
@@ -19,11 +19,12 @@
 /* pg_signal_crit_sec is used to protect only pg_signal_queue. That is the only
  * variable that can be accessed from the signal sending threads! */
 static CRITICAL_SECTION pg_signal_crit_sec;
-static int	pg_signal_queue;
 static pqsigfunc pg_signal_array[PG_SIGNAL_COUNT];
 static pqsigfunc pg_signal_defaults[PG_SIGNAL_COUNT];
-static int	pg_signal_mask;
+
+int	pg_signal_queue;
+int	pg_signal_mask;

DLLIMPORT HANDLE pgwin32_signal_event;
HANDLE pgwin32_initial_signal_pipe = INVALID_HANDLE_VALUE;
@@ -91,11 +92,10 @@
int i;

 	EnterCriticalSection(&pg_signal_crit_sec);
-	while (pg_signal_queue & ~pg_signal_mask)
+	while (UNBLOCKED_SIGNAL_QUEUE())
 	{
 		/* One or more unblocked signals queued for execution */
-
-		int			exec_mask = pg_signal_queue & ~pg_signal_mask;
+		int			exec_mask = UNBLOCKED_SIGNAL_QUEUE();
 		for (i = 0; i < PG_SIGNAL_COUNT; i++)
 		{
Index: include/port/win32.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/port/win32.h,v
retrieving revision 1.47
diff -u -r1.47 win32.h
--- include/port/win32.h	15 Oct 2005 02:49:45 -0000	1.47
+++ include/port/win32.h	21 Oct 2005 05:33:26 -0000
@@ -214,6 +214,12 @@
 /* In backend/port/win32/signal.c */
+extern int	pg_signal_queue;
+extern int	pg_signal_mask;
+
+#define UNBLOCKED_SIGNAL_QUEUE()	\
+		(pg_signal_queue & ~pg_signal_mask)
+
 extern DLLIMPORT HANDLE pgwin32_signal_event;
 extern HANDLE pgwin32_initial_signal_pipe;
Index: include/miscadmin.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/miscadmin.h,v
retrieving revision 1.180
diff -u -r1.180 miscadmin.h
--- include/miscadmin.h	15 Oct 2005 02:49:41 -0000	1.180
+++ include/miscadmin.h	21 Oct 2005 05:33:56 -0000
@@ -87,8 +87,10 @@
 #define CHECK_FOR_INTERRUPTS() \
 do { \
-	if (WaitForSingleObjectEx(pgwin32_signal_event,0,TRUE) == WAIT_OBJECT_0) \
-		pgwin32_dispatch_queued_signals(); \
+	if (UNBLOCKED_SIGNAL_QUEUE())	{ \
+		if (WaitForSingleObjectEx(pgwin32_signal_event,0,TRUE) == WAIT_OBJECT_0) \
+			pgwin32_dispatch_queued_signals(); \
+	}	\
 	if (InterruptPending) \
 		ProcessInterrupts(); \
 } while(0)
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Qingqing Zhou (#1)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

Qingqing Zhou <zhouqq@cs.toronto.edu> writes:

This patch improves the win32 CHECK_FOR_INTERRUPTS() performance by
testing if any unblocked signals are queued before check
pgwin32_signal_event. This avoids an unnecessary system call.

http://archives.postgresql.org/pgsql-patches/2005-10/msg00191.php

This looks to me like a pretty important performance tweak for Windows.
Can any of the people who worked on the Windows signal implementation
look it over and confirm it's OK?

regards, tom lane

#3Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#2)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

This patch improves the win32 CHECK_FOR_INTERRUPTS() performance by
testing if any unblocked signals are queued before check
pgwin32_signal_event. This avoids an unnecessary system call.

http://archives.postgresql.org/pgsql-patches/2005-10/msg00191.php

This looks to me like a pretty important performance tweak
for Windows.
Can any of the people who worked on the Windows signal
implementation look it over and confirm it's OK?

I think it looks good. (Haven't tested it yet, but from reading it.)

It seems it's safe to skip the interlocking that we do. If we miss one
signal for some reason, we will find it on the next hit to
CHECK_FOR_INTERRUPTS(). And if we get an "extra hit", we still
re-examine the stuff when we're locked, so it shouldn't be a big
problem.

Do you have any numbers on how much performance increases with it? I
agree that it looks like it could be a significant help in some cases,
but it'd be great to have numbers...

I'm a little bit concerned about doing it this late in beta, but it does
look safe to me. When it's that late, it'd be good to have one more
person review it though :)

//Magnus

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#3)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Magnus Hagander" <mha@sollentuna.net> writes:

Do you have any numbers on how much performance increases with it?

A rough estimate is that it would cost more than half as much as EXPLAIN
ANALYZE does: that imposes two extra syscalls per ExecProcNode, while
this adds one. There are other high-frequency CHECK_FOR_INTERRUPTS
calls that I can't quantify as easily. My guess is that it's costing us
less than a factor of 2, but well more than 10%, on typical queries
... so definitely worth fixing for 8.1 if we can convince ourselves
it's correct.

regards, tom lane

#5Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#4)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

I can't get the postgres to link with the patch...
Am I missing something?
Merlin

Info: resolving _pg_signal_mask by linking to __imp__pg_signal_mask
(auto-import)
Info: resolving _pg_signal_queue by linking to __imp__pg_signal_queue
(auto-import)
fu000001.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname'
fu000003.o(.idata$3+0xc): undefined reference to `libpostgres_a_iname'
nmth000000.o(.idata$4+0x0): undefined reference to `_nm__pg_signal_mask'
nmth000002.o(.idata$4+0x0): undefined reference to
`_nm__pg_signal_queue'

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#5)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

I can't get the postgres to link with the patch...
Am I missing something?

Probably those variables need "extern DLLIMPORT".

regards, tom lane

#7Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#3)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Tom Lane" <tgl@sss.pgh.pa.us> wrote

... so definitely worth fixing for 8.1 if we can convince ourselves
it's correct.

Despite the performance, there is one thing I am not exactly sure. Shall we
add "volatile" quanlifier to at least pg_signal_queue? The dangerous place
is PGSemaphoreLock(). If the compiler cache this value somehow, then we are
in trouble, but the original way (check event directly) does not have this
problem.

Regards,
Qingqing

#8Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Qingqing Zhou (#7)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

I can't get the postgres to link with the patch...
Am I missing something?
Merlin

False alarm. I had to rerun configure which copies win32.h in various
places, as Qingqing noted.

Merlin

#9Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Merlin Moncure (#8)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

I can't get the postgres to link with the patch...
Am I missing something?
Merlin

False alarm. I had to rerun configure which copies win32.h in various
places, as Qingqing noted.

Not false alarm :) Only with DLLIMPORT can I link all the libraries.
Will have performance #s up in a bit.

Merlin

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Qingqing Zhou (#7)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Qingqing Zhou" <zhouqq@cs.toronto.edu> writes:

Shall we add "volatile" quanlifier to at least pg_signal_queue?

If that's changed by a separate thread, "volatile" seems essential.
What about the mask variable?

regards, tom lane

#11Andrew Dunstan
andrew@dunslane.net
In reply to: Qingqing Zhou (#7)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance

Qingqing Zhou wrote:

"Tom Lane" <tgl@sss.pgh.pa.us> wrote

... so definitely worth fixing for 8.1 if we can convince ourselves
it's correct.

Despite the performance, there is one thing I am not exactly sure. Shall we
add "volatile" quanlifier to at least pg_signal_queue? The dangerous place
is PGSemaphoreLock(). If the compiler cache this value somehow, then we are
in trouble, but the original way (check event directly) does not have this
problem.

The fact this question is asked worries me a bit.

Also, I have a small style question - why use a nested if instead of
just saying

if (UNBLOCKED_SIGNAL_QUEUE() && WaitForSingleObjectEx(pgwin32_signal_event,0,TRUE) == WAIT_OBJECT_0)

?

cheers

andrew

#12Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#10)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

Shall we add "volatile" quanlifier to at least pg_signal_queue?

If that's changed by a separate thread, "volatile" seems essential.
What about the mask variable?

Yes, that does seem right. Previously it would never be concurrently
modified, because it was always locked by the critical section, but now
we read it without locking, and we certainly don't want that optimized
away.

The mask is only ever written by the main thread, never by the signal
dispatching thread. So I think that one could do without.

//Magnus

#13Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Magnus Hagander (#12)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

I can't get the postgres to link with the patch...
Am I missing something?
Merlin

False alarm. I had to rerun configure which copies win32.h in

various

places, as Qingqing noted.

Not false alarm :) Only with DLLIMPORT can I link all the libraries.
Will have performance #s up in a bit.

I have a couple of cpu-bound performance tests that I just ran with and
without the patch. Everything is ran with n=1 until volatile issue is
sorted out but so far I am not seeing any performance
improvement...believe me I would like nothing more than to report
otherwise. Also my tests maybe don't stress the signal code much. I
observed both the running time measured from the client and the weighted
average cpu load on the server.

Maybe it make a difference with high user load.

Merlin

#14Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#12)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

On Fri, 21 Oct 2005, Magnus Hagander wrote:

Shall we add "volatile" quanlifier to at least pg_signal_queue?

If that's changed by a separate thread, "volatile" seems essential.
What about the mask variable?

Yes, that does seem right. Previously it would never be concurrently
modified, because it was always locked by the critical section, but now
we read it without locking, and we certainly don't want that optimized
away.

The mask is only ever written by the main thread, never by the signal
dispatching thread. So I think that one could do without.

Agreed.

Regards,
Qingqing

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#13)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

Will have performance #s up in a bit.

I have a couple of cpu-bound performance tests that I just ran with and
without the patch. Everything is ran with n=1 until volatile issue is
sorted out but so far I am not seeing any performance
improvement...

Hm, what were the tests exactly? Offhand I'd expect something like a
SELECT COUNT(*) on a large but not-too-wide table to show noticeable
improvement.

regards, tom lane

#16Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#15)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
Hm, what were the tests exactly? Offhand I'd expect something like a
SELECT COUNT(*) on a large but not-too-wide table to show noticeable
improvement.

regards, tom lane

I STAND CORRECTED! My tests were high volume record by record
iterators, etc. Read and drool, gentlemen.

Merlin

=============stock============

esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 844.000 ms
esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 843.000 ms
esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 844.000 ms
esp=# \q
=============patched============
esp=# \timing
Timing is on.
esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 453.000 ms
esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 468.000 ms
esp=# select count(*) from data1.line_file;
count
--------
321306
(1 row)

Time: 469.000 ms

#17Magnus Hagander
magnus@hagander.net
In reply to: Merlin Moncure (#16)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

Hm, what were the tests exactly? Offhand I'd expect

something like a

SELECT COUNT(*) on a large but not-too-wide table to show

noticeable

improvement.

regards, tom lane

I STAND CORRECTED! My tests were high volume record by
record iterators, etc. Read and drool, gentlemen.

Wow, that's just great!

Was that with the volatile attribute or not?

//Magnus

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#16)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:

I STAND CORRECTED! My tests were high volume record by record
iterators, etc. Read and drool, gentlemen.

Looks good to me ;-) ...

If I recall the bidding correctly, the original patch needs DLLIMPORT
qualifiers attached to both of the variables, plus volatile attached to
pg_signal_queue. Do you want to send along the modified patch, or do
you think a non-Windows-hacker can get it right the first time?

regards, tom lane

#19Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Merlin Moncure (#16)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance

On Fri, 21 Oct 2005, Merlin Moncure wrote:

"Merlin Moncure" <merlin.moncure@rcsonline.com> writes:
Hm, what were the tests exactly? Offhand I'd expect something like a
SELECT COUNT(*) on a large but not-too-wide table to show noticeable
improvement.

regards, tom lane

I STAND CORRECTED! My tests were high volume record by record
iterators, etc. Read and drool, gentlemen.

Great! Did you patch the "volatile" thing?

Regards,
Qingqing

#20Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#18)
Re: [PATCHES] Win32 CHECK_FOR_INTERRUPTS() performance tweak

Wow, that's just great!

Was that with the volatile attribute or not?

//Magnus

not. However (I'm assuming) this should not greatly impact things.
Good work. QQ: Can you fix the patch? I'm done till Monday.

Merlin

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#17)
#22Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Merlin Moncure (#20)
#23Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Andrew Dunstan (#11)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#20)
#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#24)
#26Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#25)
#27Andrew Dunstan
andrew@dunslane.net
In reply to: Andrew Dunstan (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#26)
#29Dave Page
dpage@pgadmin.org
In reply to: Tom Lane (#28)
#30Magnus Hagander
magnus@hagander.net
In reply to: Dave Page (#29)
#31Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#30)
#32Andrew Dunstan
andrew@dunslane.net
In reply to: Magnus Hagander (#31)
#33Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#32)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#30)
#35Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#34)
#36Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#34)
#37Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#33)
#38Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#37)
#39Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#32)
#40Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#39)
#41Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#40)
#42Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#38)
#43Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#42)
#44Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#43)
#45Andrew Dunstan
andrew@dunslane.net
In reply to: Qingqing Zhou (#44)
#46Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Andrew Dunstan (#45)
#47Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Qingqing Zhou (#46)
#48Andrew Dunstan
andrew@dunslane.net
In reply to: Qingqing Zhou (#46)
#49Magnus Hagander
magnus@hagander.net
In reply to: Andrew Dunstan (#48)
#50Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#49)
#51Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#49)
#52Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#51)
#53Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#52)
#54Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#52)
#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#49)
#56Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#55)
#57Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#55)
#58Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#56)
#59Magnus Hagander
magnus@hagander.net
In reply to: Qingqing Zhou (#58)
#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#56)
#61Tom Lane
tgl@sss.pgh.pa.us
In reply to: Qingqing Zhou (#57)
#62Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#61)
#63Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#55)
#64Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#49)
#65Magnus Hagander
magnus@hagander.net
In reply to: Qingqing Zhou (#64)
#66Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Magnus Hagander (#65)
#67Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#66)
#68Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Tom Lane (#67)
#69Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#67)
#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#68)
#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Dunstan (#69)
#72Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Magnus Hagander (#65)
#73Tom Lane
tgl@sss.pgh.pa.us
In reply to: Merlin Moncure (#66)
#74Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#73)
#75Tom Lane
tgl@sss.pgh.pa.us
In reply to: Qingqing Zhou (#74)
#76Qingqing Zhou
zhouqq@cs.toronto.edu
In reply to: Tom Lane (#75)
#77Merlin Moncure
merlin.moncure@rcsonline.com
In reply to: Qingqing Zhou (#76)
#78Magnus Hagander
magnus@hagander.net
In reply to: Merlin Moncure (#77)
#79Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#78)