Keepalives win32

Started by Magnus Haganderover 15 years ago36 messages
#1Magnus Hagander
magnus@hagander.net

Hi!

I'm looking at adding win32 support for keepalives in libpq (well,
also backend, but libpq for now), per the request from Robert Haas.
I've come up with one issue though - in Windows, you can only set the
"idle" and "interval" parameter together in a single syscall (in Unix,
you have one for each). There is no support for setting the counter at
all.

However, there is no API for *reading* the current value, nor the
default value. There is no way to specify "set the default". If we set
one of them to zero, it really means zero - no interval at all (so
it'll flood out the packets - really fun when you enable it for
keepalive_idle).

The default value for these are available in the registry only.

The way I see it, we have two options:
1) Read the default value from the registry. That's some fairly ugly code, imho.
2) Ignore the registry value and use the default value of 2 hours/1
second. That will override any changes the user made in the registry,
which seems pretty ugly.
3) Require that these two parameters are always specified together (on
windows). Which is annoying.

Not sure which one to pick - opinions?

The API used is documented at:
http://msdn.microsoft.com/en-us/library/dd877220(v=VS.85).aspx
Patch as it looks now (libpq only, and with obvious problems with this
issue): http://github.com/mhagander/postgres/compare/master...win32keepalive

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#1)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

[ can't read system's keepalive values in windows ]

The way I see it, we have two options:
1) Read the default value from the registry. That's some fairly ugly code, imho.
2) Ignore the registry value and use the default value of 2 hours/1
second. That will override any changes the user made in the registry,
which seems pretty ugly.
3) Require that these two parameters are always specified together (on
windows). Which is annoying.

I vote for #2. It's the least inconsistent --- we don't pay attention
to the registry for much of anything else, do we?

In practice I think people who were setting either would set both, so
it's not worth a huge amount of effort to have an unsurprising behavior
when only one is set.

regards, tom lane

#3Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#2)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 20:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

[ can't read system's keepalive values in windows ]

The way I see it, we have two options:
1) Read the default value from the registry. That's some fairly ugly code, imho.
2) Ignore the registry value and use the default value of 2 hours/1
second. That will override any changes the user made in the registry,
which seems pretty ugly.
3) Require that these two parameters are always specified together (on
windows). Which is annoying.

I vote for #2.  It's the least inconsistent --- we don't pay attention
to the registry for much of anything else, do we?

Directly, no? Indirectly, we do. For every other TCP parameter
(because the registry controls what we'll get as the default when we
"just use things")

In practice I think people who were setting either would set both, so
it's not worth a huge amount of effort to have an unsurprising behavior
when only one is set.

There's unsurprising, and downright hostile (the way we get by default
is if you don't set keepalive_time, it'll spew keepalive packages
continuously, which is certainly not good). In tha tcase, it's
probably better to throw an error (which would be trivial to do, of
course)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#3)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jun 28, 2010 at 20:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I vote for #2. �It's the least inconsistent --- we don't pay attention
to the registry for much of anything else, do we?

Directly, no? Indirectly, we do. For every other TCP parameter
(because the registry controls what we'll get as the default when we
"just use things")

Not if we make the code use the RFC values as the defaults. I'm
envisioning the GUC assign hooks doing something like

#ifdef WIN32
if (newval == 0)
newval = RFC-specified-default;
#endif

so that the main GUC logic can still think that zero means "use the
default". We're just redefining where the default comes from.

This would be a change from previous behavior, but so what?
Implementing any functionality at all here is a change from previous
behavior on Windows. I don't have the slightest problem with saying
"as of 9.0, set these values via postgresql.conf, not the registry".

regards, tom lane

#5Andrew Chernow
ac@esilo.com
In reply to: Magnus Hagander (#1)
Re: Keepalives win32

The way I see it, we have two options:
1) Read the default value from the registry. That's some fairly ugly code, imho.

It seems faily simple to yank these values out, no? Even easier if you
use the all-in-wonder shell function SHGetValue().

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
Values: KeepAliveTime, KeepAliveInterval
Type: DWORD

The only annoying thing is that the values may not exist. Well, it is
also rather annoying there is no way to set the counter.

The API used is documented at:
http://msdn.microsoft.com/en-us/library/dd877220(v=VS.85).aspx
Patch as it looks now (libpq only, and with obvious problems with this
issue): http://github.com/mhagander/postgres/compare/master...win32keepalive

and here :)

http://archives.postgresql.org/pgsql-hackers/2009-05/msg01099.php

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#6Magnus Hagander
magnus@hagander.net
In reply to: Andrew Chernow (#5)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 21:03, Andrew Chernow <ac@esilo.com> wrote:

The way I see it, we have two options:
1) Read the default value from the registry. That's some fairly ugly code,
imho.

It seems faily simple to yank these values out, no?  Even easier if you use
the all-in-wonder shell function SHGetValue().

We don't want to use that function, because it brings in a bunch of
extra dependencies. This makes libpq.dll more heavyweight and more
importantly, decreases the number of parallell connections we can deal
with on the server side (on win32 at least, not sure about win64).

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Tcpip\Parameters
Values: KeepAliveTime, KeepAliveInterval
Type: DWORD

The only annoying thing is that the values may not exist.  Well, it is also

Right, we'd need an fallback in case they don't exist as well.

rather annoying there is no way to set the counter.

Yeah, but that's at least well documented how it behaves. In fact,
there used to be a way to set that (via registry key), but they
removed it in Vista.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#7Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#4)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jun 28, 2010 at 20:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I vote for #2.  It's the least inconsistent --- we don't pay attention
to the registry for much of anything else, do we?

Directly, no? Indirectly, we do. For every other TCP parameter
(because the registry controls what we'll get as the default when we
"just use things")

Not if we make the code use the RFC values as the defaults.  I'm
envisioning the GUC assign hooks doing something like

#ifdef WIN32
       if (newval == 0)
               newval = RFC-specified-default;
#endif

Right. (I've only looked at the libpq side so far)

Also, we could avoid caling it *at all* if neither one of those
parameters is set. That'll take a bit more code (using the
unix-codepath of setsockopt() to enable keepalives at all), but it
shouldn't amount to many lines..

so that the main GUC logic can still think that zero means "use the
default".  We're just redefining where the default comes from.

Yeah.

This would be a change from previous behavior, but so what?
Implementing any functionality at all here is a change from previous
behavior on Windows.  I don't have the slightest problem with saying
"as of 9.0, set these values via postgresql.conf, not the registry".

Works for me.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#8Andrew Chernow
ac@esilo.com
In reply to: Magnus Hagander (#6)
Re: Keepalives win32

It seems faily simple to yank these values out, no? Even easier if you use
the all-in-wonder shell function SHGetValue().

We don't want to use that function, because it brings in a bunch of
extra dependencies. This makes libpq.dll more heavyweight and more
importantly, decreases the number of parallell connections we can deal
with on the server side (on win32 at least, not sure about win64).

Oh, didn't know that. Are the standard reg functions, open/query/close
really that bad? Can't be any worse than the security api or MAPI hell ;)

--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/

#9Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#7)
1 attachment(s)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 21:10, Magnus Hagander <magnus@hagander.net> wrote:

On Mon, Jun 28, 2010 at 21:03, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jun 28, 2010 at 20:45, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I vote for #2.  It's the least inconsistent --- we don't pay attention
to the registry for much of anything else, do we?

Directly, no? Indirectly, we do. For every other TCP parameter
(because the registry controls what we'll get as the default when we
"just use things")

Not if we make the code use the RFC values as the defaults.  I'm
envisioning the GUC assign hooks doing something like

#ifdef WIN32
       if (newval == 0)
               newval = RFC-specified-default;
#endif

Right. (I've only looked at the libpq side so far)

Also, we could avoid caling it *at all* if neither one of those
parameters is set. That'll take a bit more code (using the
unix-codepath of setsockopt() to enable keepalives at all), but it
shouldn't amount to many lines..

Here's what I'm thinking, for the libpq side. Similar change on the
server side. Seems ok?

(still http://github.com/mhagander/postgres/compare/master...win32keepalive
for those that prefer that interface)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

libpq_keepalives_win32.patchapplication/octet-stream; name=libpq_keepalives_win32.patchDownload
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***************
*** 38,43 ****
--- 38,44 ----
  #endif
  #define near
  #include <shlobj.h>
+ #include <mstcpip.h>
  #else
  #include <sys/socket.h>
  #include <netdb.h>
***************
*** 982,987 **** useKeepalives(PGconn *conn)
--- 983,989 ----
  	return val != 0 ? 1 : 0;
  }
  
+ #ifndef WIN32
  /*
   * Set the keepalive idle timer.
   */
***************
*** 1076,1081 **** setKeepalivesCount(PGconn *conn)
--- 1078,1150 ----
  	return 1;
  }
  
+ #else /* Win32 */
+ /*
+  * Enable keepalives and set the keepalive values on Win32,
+  * where they are always set in one batch.
+  */
+ static int
+ setKeepalivesWin32(PGconn *conn)
+ {
+ 	struct tcp_keepalive 	ka;
+ 	DWORD					retsize;
+ 	int						idle = 0;
+ 	int						interval = 0;
+ 
+ 	if (conn->keepalives_idle == NULL &&
+ 		conn->keepalives_interval == NULL)
+ 	{
+ 		/*
+ 		 * When neither is specified, use the setsockopt() API to avoid
+ 		 * overwriting the system default values.
+ 		 */
+ 		int		on = 1;
+ 		char	sebuf[256];
+ 
+ 		if (setsockopt(conn->sock,
+ 					   SOL_SOCKET, SO_KEEPALIVE,
+ 					   (char *) &on, sizeof(on)) < 0)
+ 		{
+ 			appendPQExpBuffer(&conn->errorMessage,
+ 							  libpq_gettext("setsockopt(SO_KEEPALIVE) failed: %s\n"),
+ 							  SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)));
+ 			return 0;
+ 		}
+ 	}
+ 
+ 	if (conn->keepalives_idle)
+ 		idle = atoi(conn->keepalives_idle);
+ 	if (idle <= 0)
+ 		idle = 2 * 60 * 60; /* 2 hours = default */
+ 
+ 	if (conn->keepalives_interval)
+ 		interval = atoi(conn->keepalives_interval);
+ 	if (interval <= 0)
+ 		interval = 1; /* 1 second = default */
+ 
+ 	ka.onoff = 1;
+ 	ka.keepalivetime = idle * 1000;
+ 	ka.keepaliveinterval = interval * 1000;
+ 
+ 	if (WSAIoctl(conn->sock,
+ 				 SIO_KEEPALIVE_VALS,
+ 				 (LPVOID) &ka,
+ 				 sizeof(ka),
+ 				 NULL,
+ 				 0,
+ 				 &retsize,
+ 				 NULL,
+ 				 NULL)
+ 		!= 0)
+ 	{
+ 		appendPQExpBuffer(&conn->errorMessage,
+ 						  libpq_gettext("WSAIoctl(SIO_KEEPALIVE_VALS) failed: %ui\n"),
+ 						  WSAGetLastError());
+ 		return 0;
+ 	}
+ 	return 1;
+ }
+ #endif /* WIN32 */
  
  /* ----------
   * connectDBStart -
***************
*** 1478,1483 **** keep_going:						/* We will come back to here until there is
--- 1547,1553 ----
  						{
  							/* Do nothing */
  						}
+ #ifndef WIN32
  						else if (setsockopt(conn->sock,
  											SOL_SOCKET, SO_KEEPALIVE,
  											(char *) &on, sizeof(on)) < 0)
***************
*** 1491,1496 **** keep_going:						/* We will come back to here until there is
--- 1561,1570 ----
  								 || !setKeepalivesInterval(conn)
  								 || !setKeepalivesCount(conn))
  							err = 1;
+ #else /* WIN32 */
+ 						else if (!setKeepalivesWin32(conn))
+ 							err = 1;
+ #endif /* WIN32 */
  
  						if (err)
  						{
#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#9)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

Here's what I'm thinking, for the libpq side. Similar change on the
server side. Seems ok?

I had in mind just legislating that the defaults are the RFC values,
none of this "try to use the registry values in one case" business.
I don't believe that you can make the server side act that way without
much more ugliness than is warranted. Also, at least on the libpq side
there is no backwards compatibility argument to be made, because we
never turned on keepalives at all on that side in previous releases.

regards, tom lane

#11Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#10)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

Here's what I'm thinking, for the libpq side. Similar change on the
server side. Seems ok?

I had in mind just legislating that the defaults are the RFC values,
none of this "try to use the registry values in one case" business.

Um, if you look at that patch, it doesn't try to use the registry. It
falls back directly to the system default, ignoring the registry. The
only special case is where the user doesn't specify any of the
parameters.

I don't believe that you can make the server side act that way without
much more ugliness than is warranted.  Also, at least on the libpq side
there is no backwards compatibility argument to be made, because we
never turned on keepalives at all on that side in previous releases.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#11)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jun 28, 2010 at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I had in mind just legislating that the defaults are the RFC values,
none of this "try to use the registry values in one case" business.

Um, if you look at that patch, it doesn't try to use the registry. It
falls back directly to the system default, ignoring the registry. The
only special case is where the user doesn't specify any of the
parameters.

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on. I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable. We might as well just consult the RFCs and be done.

regards, tom lane

#13Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#12)
Re: Keepalives win32

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Mon, Jun 28, 2010 at 22:39, Tom Lane <tgl@sss.pgh.pa.us> wrote:

I had in mind just legislating that the defaults are the RFC values,
none of this "try to use the registry values in one case" business.

Um, if you look at that patch, it doesn't try to use the registry. It
falls back directly to the system default, ignoring the registry. The
only special case is where the user doesn't specify any of the
parameters.

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on.  I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable.  We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#13)
Re: Keepalives win32

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on. �I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable. �We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one. (Or, if it does not change the default, the extra code is
useless anyway.) If we just always go through the WSAIoctl() path then
we can clearly document "the default for this on Windows is so-and-so".
How is the documentation going to explain the behavior of the proposed
code?

regards, tom lane

#15Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#14)
Re: Keepalives win32

Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on. ���I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable. ���We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one. (Or, if it does not change the default, the extra code is
useless anyway.) If we just always go through the WSAIoctl() path then
we can clearly document "the default for this on Windows is so-and-so".
How is the documentation going to explain the behavior of the proposed
code?

Let's look at the usage probabilities. 99% of Win32 users will not use
any of these settings. I would hate to come up with a solution that
changes the default behavior for that 99%.

Therefore, I think using hard-coded defaults only for cases where
someone sets one but not all settings is appropriate. The documentation
text would be:

On Windows, if a keepalive settings is set, then defaults will be
used for any unset values, specifically, keepalives_idle (200) and
keepalives_interval(4). Windows does not allow control of
keepalives_count.

Seems simple enough.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#16Pavel Golub
pavel@microolap.com
In reply to: Bruce Momjian (#15)
Re: Keepalives win32

Hello, Bruce.

You wrote:

BM> Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on. пїЅI don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable. пїЅWe might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one. (Or, if it does not change the default, the extra code is
useless anyway.) If we just always go through the WSAIoctl() path then
we can clearly document "the default for this on Windows is so-and-so".
How is the documentation going to explain the behavior of the proposed
code?

BM> Let's look at the usage probabilities. 99% of Win32 users will not use
BM> any of these settings.

Let me disagree with this statement. As DAC developer I'm faced with
opposite reality. There are a lot of users demanding this
functionality.

BM> I would hate to come up with a solution that
BM> changes the default behavior for that 99%.

BM> Therefore, I think using hard-coded defaults only for cases where
BM> someone sets one but not all settings is appropriate. The documentation
BM> text would be:

BM> On Windows, if a keepalive settings is set, then defaults will be
BM> used for any unset values, specifically, keepalives_idle (200) and
BM> keepalives_interval(4). Windows does not allow control of
BM> keepalives_count.

BM> Seems simple enough.

BM> --
BM> Bruce Momjian <bruce@momjian.us> http://momjian.us
BM> EnterpriseDB http://enterprisedb.com

BM> + None of us is going to be here forever. +

--
With best wishes,
Pavel mailto:pavel@gf.microolap.com

#17Magnus Hagander
magnus@hagander.net
In reply to: Pavel Golub (#16)
Re: Keepalives win32

2010/6/30 Pavel Golub <pavel@microolap.com>:

Hello, Bruce.

You wrote:

BM> Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on.  I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable.  We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one.  (Or, if it does not change the default, the extra code is
useless anyway.)  If we just always go through the WSAIoctl() path then
we can clearly document "the default for this on Windows is so-and-so".
How is the documentation going to explain the behavior of the proposed
code?

BM> Let's look at the usage probabilities.  99% of Win32 users will not use
BM> any of these settings.

Let me disagree with this statement. As DAC developer I'm faced with
opposite reality. There are a lot of users demanding this
functionality.

It's very intersting to hear from somebody who expects to actually use
this. But are you aware that we're only talking about *adjusting* the
keepalive values, not enabling them? Because we will, as the code
stands now, enable keepalive by defaults - just use the system default
values for timeout intervals. (Meaning this is how we do it on Unix as
of HEAD, irregardless of my patch)

Do you have an opinion on the two choices for handling keepalives_idle
and keepalives_interval? They basically are:

1) When not configured, use system defaults. When only one of the two
parameters configured, use RFC default for the other one (overwrite
system default).

2) When not configured, use RFC defaults (overwrite system defaults).
When only one of the two parameters configured, use RFC default for
the other one (overwrite system default)

3) When not configured, use system defaults. When only one of the two
parameters configured, throw error.

I can see pros and cons with both. Given that I still think *most*
people will not configure the intervals at all, I think #1 is the one
that follows "principle of least surprise". Perhaps option *3* is the
one that does this for partial configuration?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#18Pavel Golub
pavel@microolap.com
In reply to: Magnus Hagander (#17)
Re: Keepalives win32

Hello, Magnus.

You wrote:

MH> 2010/6/30 Pavel Golub <pavel@microolap.com>:

Hello, Bruce.

You wrote:

BM> Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Mon, Jun 28, 2010 at 8:24 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

What I was trying to say is I think we could dispense with the
setsockopt() code path, and just always use the WSAIoctl() path anytime
keepalives are turned on. �I don't know what "system default values"
you're speaking of, if they're not the registry entries; and I
definitely don't see the point of consulting such values if they aren't
user-settable. �We might as well just consult the RFCs and be done.

FWIW, I think I prefer Magnus's approach, but I'm not 100% sure I can
defend that preference...

Well, basically what I don't like about Magnus' proposal is that setting
one of the two values changes the default that will be used for the
other one. �(Or, if it does not change the default, the extra code is
useless anyway.) �If we just always go through the WSAIoctl() path then
we can clearly document "the default for this on Windows is so-and-so".
How is the documentation going to explain the behavior of the proposed
code?

BM> Let's look at the usage probabilities. �99% of Win32 users will not use
BM> any of these settings.

Let me disagree with this statement. As DAC developer I'm faced with
opposite reality. There are a lot of users demanding this
functionality.

MH> It's very intersting to hear from somebody who expects to actually use
MH> this. But are you aware that we're only talking about *adjusting* the
MH> keepalive values, not enabling them? Because we will, as the code
MH> stands now, enable keepalive by defaults - just use the system default
MH> values for timeout intervals. (Meaning this is how we do it on Unix as
MH> of HEAD, irregardless of my patch)

MH> Do you have an opinion on the two choices for handling keepalives_idle
MH> and keepalives_interval? They basically are:

MH> 1) When not configured, use system defaults. When only one of the two
MH> parameters configured, use RFC default for the other one (overwrite
MH> system default).

MH> 2) When not configured, use RFC defaults (overwrite system defaults).
MH> When only one of the two parameters configured, use RFC default for
MH> the other one (overwrite system default)

MH> 3) When not configured, use system defaults. When only one of the two
MH> parameters configured, throw error.

MH> I can see pros and cons with both. Given that I still think *most*
MH> people will not configure the intervals at all, I think #1 is the one
MH> that follows "principle of least surprise". Perhaps option *3* is the
MH> one that does this for partial configuration?

Frankly speaking I cannot decide what is the best approach. :) It's up
to you guys.

--
With best wishes,
Pavel mailto:pavel@gf.microolap.com

#19Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#17)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

Do you have an opinion on the two choices for handling keepalives_idle
and keepalives_interval? They basically are:

1) When not configured, use system defaults. When only one of the two
parameters configured, use RFC default for the other one (overwrite
system default).

2) When not configured, use RFC defaults (overwrite system defaults).
When only one of the two parameters configured, use RFC default for
the other one (overwrite system default)

3) When not configured, use system defaults. When only one of the two
parameters configured, throw error.

It's hard to argue about this when most of us have no idea what these
"system defaults" are, or whether they really are any different from the
RFC values in the first place, or whether ordinary users know how to
alter them or even find out their values. Please provide some
background if you want intelligent comments.

regards, tom lane

#20Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#19)
Re: Keepalives win32

On Wed, Jun 30, 2010 at 16:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

Do you have an opinion on the two choices for handling keepalives_idle
and keepalives_interval? They basically are:

1) When not configured, use system defaults. When only one of the two
parameters configured, use RFC default for the other one (overwrite
system default).

2) When not configured, use RFC defaults (overwrite system defaults).
When only one of the two parameters configured, use RFC default for
the other one (overwrite system default)

3) When not configured, use system defaults. When only one of the two
parameters configured, throw error.

It's hard to argue about this when most of us have no idea what these
"system defaults" are, or whether they really are any different from the
RFC values in the first place, or whether ordinary users know how to
alter them or even find out their values.  Please provide some
background if you want intelligent comments.

The system defaults are whatever the user has configured at a machine
level (by editing the registry, by hand or by tool (including
policies)). I doubt many users have configured them by hand. There may
well be tools that do it for them.

Anyway, after some checking i realized #3 can't be implemented anyway
in the backend, since guc won't let us know early enough. So that's
out.

Thus, let's go with #2. Which was your suggestion :)

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#20)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

On Wed, Jun 30, 2010 at 16:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's hard to argue about this when most of us have no idea what these
"system defaults" are, or whether they really are any different from the
RFC values in the first place, or whether ordinary users know how to
alter them or even find out their values. �Please provide some
background if you want intelligent comments.

The system defaults are whatever the user has configured at a machine
level (by editing the registry, by hand or by tool (including
policies)). I doubt many users have configured them by hand. There may
well be tools that do it for them.

But you previously stated that this code was ignoring the registry
values. So doesn't "system defaults" boil down to whatever Windows'
wired-in defaults are?

regards, tom lane

#22Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#21)
Re: Keepalives win32

On Wed, Jun 30, 2010 at 16:48, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Wed, Jun 30, 2010 at 16:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's hard to argue about this when most of us have no idea what these
"system defaults" are, or whether they really are any different from the
RFC values in the first place, or whether ordinary users know how to
alter them or even find out their values.  Please provide some
background if you want intelligent comments.

The system defaults are whatever the user has configured at a machine
level (by editing the registry, by hand or by tool (including
policies)). I doubt many users have configured them by hand. There may
well be tools that do it for them.

But you previously stated that this code was ignoring the registry
values.  So doesn't "system defaults" boil down to whatever Windows'
wired-in defaults are?

The order is Windows wired-in-defaults -> registry values -> what app chooses.

And yes, we *are* ignoring whatever the user has put in the registry,
making our path Windows documented-wired-in-defaults -> what app
chooses if we do this.

Windows default for idle is 2 hours, for interval 1 second.

Assume the user had reconfigured his default in the registry to 1 hour.

If the user makes no config change at all, that means it will run with
1 hour for idle and 1 second for interval.

If we now set tcp_interval to 10 seconds (to change the default), we
will now also change his idle value back to the system default, so he
will get 2 hours for idle and 10 seconds for interval. Thus, we are
ignoring the changes he made globally on his system.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#23Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#21)
Re: Keepalives win32

Tom Lane wrote:

Magnus Hagander <magnus@hagander.net> writes:

On Wed, Jun 30, 2010 at 16:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

It's hard to argue about this when most of us have no idea what these
"system defaults" are, or whether they really are any different from the
RFC values in the first place, or whether ordinary users know how to
alter them or even find out their values. ���Please provide some
background if you want intelligent comments.

The system defaults are whatever the user has configured at a machine
level (by editing the registry, by hand or by tool (including
policies)). I doubt many users have configured them by hand. There may
well be tools that do it for them.

But you previously stated that this code was ignoring the registry
values. So doesn't "system defaults" boil down to whatever Windows'
wired-in defaults are?

For Magnus, #2 was to use the RFC defaults. The OS defaults might be
different for different versions of Windows. We could use the OS
defaults for _some_ version of Windows, but I am not sure that is an
improvement.

I still like #1 because it affects the fewest people, and that option
uses the RFC defaults only for unset values when others are set. I
still think we can do #3 (error), but we have to add a check in an
unrelated place to check for unset values, and the code is likely to be
ugly.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#22)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> writes:

But you previously stated that this code was ignoring the registry
values. �So doesn't "system defaults" boil down to whatever Windows'
wired-in defaults are?

The order is Windows wired-in-defaults -> registry values -> what app chooses.

And yes, we *are* ignoring whatever the user has put in the registry,

How does that statement square with your follow-on example?

Assume the user had reconfigured his default in the registry to 1 hour.

If the user makes no config change at all, that means it will run with
1 hour for idle and 1 second for interval.

If we now set tcp_interval to 10 seconds (to change the default), we
will now also change his idle value back to the system default, so he
will get 2 hours for idle and 10 seconds for interval. Thus, we are
ignoring the changes he made globally on his system.

With the code as you have it, yes, but if we do it as I'm suggesting,
that doesn't happen --- the effective value of the other parameter
doesn't change.

regards, tom lane

#25Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#23)
Re: Keepalives win32

Bruce Momjian <bruce@momjian.us> writes:

I still like #1 because it affects the fewest people, and that option
uses the RFC defaults only for unset values when others are set.

What's your idea of "affecting the fewest people"? There is no previous
history to be backward-compatible with, because we never supported
keepalive on Windows before.

regards, tom lane

#26Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#25)
Re: Keepalives win32

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

I still like #1 because it affects the fewest people, and that option
uses the RFC defaults only for unset values when others are set.

What's your idea of "affecting the fewest people"? There is no previous
history to be backward-compatible with, because we never supported
keepalive on Windows before.

Well, starting in 9.0, keepalives in libpq will default to 'on':

Controls whether client-side TCP keepalives are used. The default
value is 1, meaning on, but you can change this to 0, meaning off,
if keepalives are not wanted. This parameter is ignored for
connections made via a Unix-domain socket.

My definition is whether we should affect keepalive behavior for the 99%
of people who do not change the libpq defaults, meaning the other
keepalive settings. #2 would cause these people to use
non-registry-controlled keepalive behavior by using RFC defaults, and
even if we use Windows defaults, those defaults might be different for
different Windows versions.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#26)
Re: Keepalives win32

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

What's your idea of "affecting the fewest people"? There is no previous
history to be backward-compatible with, because we never supported
keepalive on Windows before.

Well, starting in 9.0, keepalives in libpq will default to 'on':

Yes, which is already a change in behavior. I don't understand why you
are worrying about "backwards compatibility" to parameter values that
weren't in use before. I think self-consistency of the new version is
far more important than that.

even if we use Windows defaults, those defaults might be different for
different Windows versions.

I'm not sure if that's an issue or not, but if it is, that seems to me
to argue for #2 not #1.

regards, tom lane

#28Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Magnus Hagander (#22)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> wrote:

Windows default for idle is 2 hours, for interval 1 second.

And it defaults to five retries. With these settings, you could
have a TCP connection break with as little as a five second network
outage, if it happened to come after two hours of silence on the
connection; although an outage of up to two hours could go totally
unnoticed. The RFC values have a total of nine tries at 75 second
intervals, so for a single network outage to break a connection, it
would have to last at least ten minutes; but again, an outage of up
to two hours could occur before it started to check for problems.

I'm inclined toward option 2 (previously described on this thread),
because the Windows defaults are dumb. Wait two hours and then test
for five seconds???

I also think we may want to suggest that for most environments,
people may want to change these settings to something more
aggressive, like a 30 to 120 second initial delay, with a 10 or 20
second retry interval. The RFC defaults seem approximately right
for a TCP connection to a colony on the surface of the moon, where
besides the round trip latency of 2.5 seconds they might have to pay
by the byte. In other words, it is *so* conservative that I have
trouble seeing it ever causing a problem compared to not having
keepalive enabled, but it will eventually clean things up. In
practice people usually want something more aggressive.

-Kevin

#29Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#27)
Re: Keepalives win32

Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

What's your idea of "affecting the fewest people"? There is no previous
history to be backward-compatible with, because we never supported
keepalive on Windows before.

Well, starting in 9.0, keepalives in libpq will default to 'on':

Yes, which is already a change in behavior. I don't understand why you
are worrying about "backwards compatibility" to parameter values that
weren't in use before. I think self-consistency of the new version is
far more important than that.

I am worried about compatibility/consistency with other Windows
processes.

even if we use Windows defaults, those defaults might be different for
different Windows versions.

I'm not sure if that's an issue or not, but if it is, that seems to me
to argue for #2 not #1.

I assume if someone modified the registry, they want it to be used for
all applications that use keepalives on their system. Also, keep in
mind that, unlike the backend, which has postgresql.conf, it is
burdensome to set a libpq setting for all applications (without using
pg_service.conf).

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ None of us is going to be here forever. +

#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Kevin Grittner (#28)
Re: Keepalives win32

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

I also think we may want to suggest that for most environments,
people may want to change these settings to something more
aggressive, like a 30 to 120 second initial delay, with a 10 or 20
second retry interval. The RFC defaults seem approximately right
for a TCP connection to a colony on the surface of the moon, where
besides the round trip latency of 2.5 seconds they might have to pay
by the byte.

Well, the RFCs were definitely written at a time when bandwidth was a
lot more expensive than it is today.

In other words, it is *so* conservative that I have
trouble seeing it ever causing a problem compared to not having
keepalive enabled, but it will eventually clean things up.

Yes. This is a large part of the reason why I think it's okay for us to
turn libpq keepalive on by default in 9.0 --- the default parameters for
it are so conservative as to be unlikely to cause trouble. If Windows
isn't using RFC-equivalent default parameters, that seems like a good
reason to disregard the system settings and force use of the RFC values
as defaults.

regards, tom lane

#31Pavel Golub
pavel@microolap.com
In reply to: Tom Lane (#27)
Re: Keepalives win32

Hello, Tom.

You wrote:

TL> Bruce Momjian <bruce@momjian.us> writes:

Tom Lane wrote:

What's your idea of "affecting the fewest people"? There is no previous
history to be backward-compatible with, because we never supported
keepalive on Windows before.

Well, starting in 9.0, keepalives in libpq will default to 'on':

TL> Yes, which is already a change in behavior. I don't understand why you
TL> are worrying about "backwards compatibility" to parameter values that
TL> weren't in use before. I think self-consistency of the new version is
TL> far more important than that.

Absolutely agree.

even if we use Windows defaults, those defaults might be different for
different Windows versions.

TL> I'm not sure if that's an issue or not, but if it is, that seems to me
TL> to argue for #2 not #1.

TL> regards, tom lane

--
With best wishes,
Pavel mailto:pavel@gf.microolap.com

#32Magnus Hagander
magnus@hagander.net
In reply to: Tom Lane (#30)
1 attachment(s)
Re: Keepalives win32

On Wed, Jun 30, 2010 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

I also think we may want to suggest that for most environments,
people may want to change these settings to something more
aggressive, like a 30 to 120 second initial delay, with a 10 or 20
second retry interval.  The RFC defaults seem approximately right
for a TCP connection to a colony on the surface of the moon, where
besides the round trip latency of 2.5 seconds they might have to pay
by the byte.

Well, the RFCs were definitely written at a time when bandwidth was a
lot more expensive than it is today.

In other words, it is *so* conservative that I have
trouble seeing it ever causing a problem compared to not having
keepalive enabled, but it will eventually clean things up.

Yes.  This is a large part of the reason why I think it's okay for us to
turn libpq keepalive on by default in 9.0 --- the default parameters for
it are so conservative as to be unlikely to cause trouble.  If Windows
isn't using RFC-equivalent default parameters, that seems like a good
reason to disregard the system settings and force use of the RFC values
as defaults.

Here's an updated version of the patch, which includes server side
functionality. I took out the code that tried to"be smart". It'll now
set them to 2 hours/1 second by default. I looked quickly at the RFC
and didn't find the exact values there, so those values are the
documented out-of-the-box defaults on Windows. I can easily change
them to RFC values if someone can find them for me :)

It's also merged with roberts macos patch, since they were conflicting.

Doc changes not included, but I'll get those in before commit.

Comments?

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

win32keepalive.patchapplication/octet-stream; name=win32keepalive.patchDownload
*** a/src/backend/libpq/pqcomm.c
--- b/src/backend/libpq/pqcomm.c
***************
*** 83,88 ****
--- 83,91 ----
  #ifdef HAVE_UTIME_H
  #include <utime.h>
  #endif
+ #ifdef WIN32
+ #include <mstcpip.h>
+ #endif
  
  #include "libpq/ip.h"
  #include "libpq/libpq.h"
***************
*** 1314,1323 **** pq_endcopyout(bool errorAbort)
   * Support for TCP Keepalive parameters
   */
  
  int
  pq_getkeepalivesidle(Port *port)
  {
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE)
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return 0;
  
--- 1317,1371 ----
   * Support for TCP Keepalive parameters
   */
  
+ /*
+  * On Windows, we need to set both idle and interval at the same time.
+  * We also cannot reset them to the default (setting to zero will
+  * actually set them to zero, not default), therefor we fallback to
+  * the out-of-the-box default instead.
+  */
+ #ifdef WIN32
+ static int
+ pq_setkeepaliveswin32(Port *port, int idle, int interval)
+ {
+ 	struct tcp_keepalive	ka;
+ 	DWORD					retsize;
+ 
+ 	if (idle <= 0)
+ 		idle = 2 * 60 * 60; /* default = 2 hours */
+ 	if (interval <= 0)
+ 		interval = 1;       /* default = 1 second */
+ 
+ 	ka.onoff = 1;
+ 	ka.keepalivetime = idle * 1000;
+ 	ka.keepaliveinterval = interval * 1000;
+ 
+ 	if (WSAIoctl(port->sock,
+ 				 SIO_KEEPALIVE_VALS,
+ 				 (LPVOID) &ka,
+ 				 sizeof(ka),
+ 				 NULL,
+ 				 0,
+ 				 &retsize,
+ 				 NULL,
+ 				 NULL)
+ 		!= 0)
+ 	{
+ 		elog(LOG, "WSAIoctl(SIO_KEEPALIVE_VALS) failed: %ui",
+ 			 WSAGetLastError());
+ 		return STATUS_ERROR;
+ 	}
+ 	if (port->keepalives_idle != idle)
+ 		port->keepalives_idle = idle;
+ 	if (port->keepalives_interval != interval)
+ 		port->keepalives_interval = interval;
+ 	return STATUS_OK;
+ }
+ #endif
+ 
  int
  pq_getkeepalivesidle(Port *port)
  {
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE) || defined(WIN32)
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return 0;
  
***************
*** 1326,1331 **** pq_getkeepalivesidle(Port *port)
--- 1374,1380 ----
  
  	if (port->default_keepalives_idle == 0)
  	{
+ #ifndef WIN32
  		ACCEPT_TYPE_ARG3 size = sizeof(port->default_keepalives_idle);
  
  #ifdef TCP_KEEPIDLE
***************
*** 1344,1350 **** pq_getkeepalivesidle(Port *port)
  			elog(LOG, "getsockopt(TCP_KEEPALIVE) failed: %m");
  			port->default_keepalives_idle = -1; /* don't know */
  		}
! #endif
  	}
  
  	return port->default_keepalives_idle;
--- 1393,1403 ----
  			elog(LOG, "getsockopt(TCP_KEEPALIVE) failed: %m");
  			port->default_keepalives_idle = -1; /* don't know */
  		}
! #endif /* TCP_KEEPIDLE */
! #else /* WIN32 */
! 		/* We can't get the defaults on Windows, so return "don't know" */
! 		port->default_keepalives_idle = -1;
! #endif /* WIN32 */
  	}
  
  	return port->default_keepalives_idle;
***************
*** 1359,1368 **** pq_setkeepalivesidle(int idle, Port *port)
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return STATUS_OK;
  
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE)
  	if (idle == port->keepalives_idle)
  		return STATUS_OK;
  
  	if (port->default_keepalives_idle <= 0)
  	{
  		if (pq_getkeepalivesidle(port) < 0)
--- 1412,1422 ----
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return STATUS_OK;
  
! #if defined(TCP_KEEPIDLE) || defined(TCP_KEEPALIVE) || defined(WIN32)
  	if (idle == port->keepalives_idle)
  		return STATUS_OK;
  
+ #ifndef WIN32
  	if (port->default_keepalives_idle <= 0)
  	{
  		if (pq_getkeepalivesidle(port) < 0)
***************
*** 1394,1414 **** pq_setkeepalivesidle(int idle, Port *port)
  #endif
  
  	port->keepalives_idle = idle;
! #else
  	if (idle != 0)
  	{
  		elog(LOG, "setting the keepalive idle time is not supported");
  		return STATUS_ERROR;
  	}
  #endif
- 
  	return STATUS_OK;
  }
  
  int
  pq_getkeepalivesinterval(Port *port)
  {
! #ifdef TCP_KEEPINTVL
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return 0;
  
--- 1448,1470 ----
  #endif
  
  	port->keepalives_idle = idle;
! #else /* WIN32 */
! 	return pq_setkeepaliveswin32(port, idle, port->keepalives_interval);
! #endif
! #else /* TCP_KEEPIDLE || WIN32 */
  	if (idle != 0)
  	{
  		elog(LOG, "setting the keepalive idle time is not supported");
  		return STATUS_ERROR;
  	}
  #endif
  	return STATUS_OK;
  }
  
  int
  pq_getkeepalivesinterval(Port *port)
  {
! #if defined(TCP_KEEPIDLE) || defined(WIN32)
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return 0;
  
***************
*** 1417,1422 **** pq_getkeepalivesinterval(Port *port)
--- 1473,1479 ----
  
  	if (port->default_keepalives_interval == 0)
  	{
+ #ifndef WIN32
  		ACCEPT_TYPE_ARG3 size = sizeof(port->default_keepalives_interval);
  
  		if (getsockopt(port->sock, IPPROTO_TCP, TCP_KEEPINTVL,
***************
*** 1426,1431 **** pq_getkeepalivesinterval(Port *port)
--- 1483,1492 ----
  			elog(LOG, "getsockopt(TCP_KEEPINTVL) failed: %m");
  			port->default_keepalives_interval = -1;		/* don't know */
  		}
+ #else
+ 		/* We can't get the defaults on Windows, so return "don't know" */
+ 		port->default_keepalives_interval = -1;
+ #endif /* WIN32 */
  	}
  
  	return port->default_keepalives_interval;
***************
*** 1440,1449 **** pq_setkeepalivesinterval(int interval, Port *port)
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return STATUS_OK;
  
! #ifdef TCP_KEEPINTVL
  	if (interval == port->keepalives_interval)
  		return STATUS_OK;
  
  	if (port->default_keepalives_interval <= 0)
  	{
  		if (pq_getkeepalivesinterval(port) < 0)
--- 1501,1511 ----
  	if (port == NULL || IS_AF_UNIX(port->laddr.addr.ss_family))
  		return STATUS_OK;
  
! #if defined(TCP_KEEPINTVL) || defined (WIN32)
  	if (interval == port->keepalives_interval)
  		return STATUS_OK;
  
+ #ifndef WIN32
  	if (port->default_keepalives_interval <= 0)
  	{
  		if (pq_getkeepalivesinterval(port) < 0)
***************
*** 1466,1471 **** pq_setkeepalivesinterval(int interval, Port *port)
--- 1528,1536 ----
  	}
  
  	port->keepalives_interval = interval;
+ #else /* WIN32 */
+ 	return pq_setkeepaliveswin32(port, port->keepalives_idle, interval);
+ #endif
  #else
  	if (interval != 0)
  	{
*** a/src/interfaces/libpq/fe-connect.c
--- b/src/interfaces/libpq/fe-connect.c
***************
*** 38,43 ****
--- 38,44 ----
  #endif
  #define near
  #include <shlobj.h>
+ #include <mstcpip.h>
  #else
  #include <sys/socket.h>
  #include <netdb.h>
***************
*** 982,987 **** useKeepalives(PGconn *conn)
--- 983,989 ----
  	return val != 0 ? 1 : 0;
  }
  
+ #ifndef WIN32
  /*
   * Set the keepalive idle timer.
   */
***************
*** 1090,1095 **** setKeepalivesCount(PGconn *conn)
--- 1092,1143 ----
  	return 1;
  }
  
+ #else /* Win32 */
+ /*
+  * Enable keepalives and set the keepalive values on Win32,
+  * where they are always set in one batch.
+  */
+ static int
+ setKeepalivesWin32(PGconn *conn)
+ {
+ 	struct tcp_keepalive 	ka;
+ 	DWORD					retsize;
+ 	int						idle = 0;
+ 	int						interval = 0;
+ 
+ 	if (conn->keepalives_idle)
+ 		idle = atoi(conn->keepalives_idle);
+ 	if (idle <= 0)
+ 		idle = 2 * 60 * 60; /* 2 hours = default */
+ 
+ 	if (conn->keepalives_interval)
+ 		interval = atoi(conn->keepalives_interval);
+ 	if (interval <= 0)
+ 		interval = 1; /* 1 second = default */
+ 
+ 	ka.onoff = 1;
+ 	ka.keepalivetime = idle * 1000;
+ 	ka.keepaliveinterval = interval * 1000;
+ 
+ 	if (WSAIoctl(conn->sock,
+ 				 SIO_KEEPALIVE_VALS,
+ 				 (LPVOID) &ka,
+ 				 sizeof(ka),
+ 				 NULL,
+ 				 0,
+ 				 &retsize,
+ 				 NULL,
+ 				 NULL)
+ 		!= 0)
+ 	{
+ 		appendPQExpBuffer(&conn->errorMessage,
+ 						  libpq_gettext("WSAIoctl(SIO_KEEPALIVE_VALS) failed: %ui\n"),
+ 						  WSAGetLastError());
+ 		return 0;
+ 	}
+ 	return 1;
+ }
+ #endif /* WIN32 */
  
  /* ----------
   * connectDBStart -
***************
*** 1492,1497 **** keep_going:						/* We will come back to here until there is
--- 1540,1546 ----
  						{
  							/* Do nothing */
  						}
+ #ifndef WIN32
  						else if (setsockopt(conn->sock,
  											SOL_SOCKET, SO_KEEPALIVE,
  											(char *) &on, sizeof(on)) < 0)
***************
*** 1505,1510 **** keep_going:						/* We will come back to here until there is
--- 1554,1563 ----
  								 || !setKeepalivesInterval(conn)
  								 || !setKeepalivesCount(conn))
  							err = 1;
+ #else /* WIN32 */
+ 						else if (!setKeepalivesWin32(conn))
+ 							err = 1;
+ #endif /* WIN32 */
  
  						if (err)
  						{
#33Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#32)
Re: Keepalives win32

On Wed, Jul 7, 2010 at 9:20 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Jun 30, 2010 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

I also think we may want to suggest that for most environments,
people may want to change these settings to something more
aggressive, like a 30 to 120 second initial delay, with a 10 or 20
second retry interval.  The RFC defaults seem approximately right
for a TCP connection to a colony on the surface of the moon, where
besides the round trip latency of 2.5 seconds they might have to pay
by the byte.

Well, the RFCs were definitely written at a time when bandwidth was a
lot more expensive than it is today.

In other words, it is *so* conservative that I have
trouble seeing it ever causing a problem compared to not having
keepalive enabled, but it will eventually clean things up.

Yes.  This is a large part of the reason why I think it's okay for us to
turn libpq keepalive on by default in 9.0 --- the default parameters for
it are so conservative as to be unlikely to cause trouble.  If Windows
isn't using RFC-equivalent default parameters, that seems like a good
reason to disregard the system settings and force use of the RFC values
as defaults.

Here's an updated version of the patch, which includes server side
functionality. I took out the code that tried to"be smart". It'll now
set them to 2 hours/1 second by default. I looked quickly at the RFC
and didn't find the exact values there, so those values are the
documented out-of-the-box defaults on Windows. I can easily change
them to RFC values if someone can find them for me :)

It's also merged with roberts macos patch, since they were conflicting.

Doc changes not included, but I'll get those in before commit.

Comments?

Looks generally OK, though my knowledge of Windows is pretty limited.
We'd better get this committed PDQ if it's going into beta3, else
there won't be a full buildfarm cycle before we wrap.

(BTW, there are two buildfarm machines - wigeon and orangutan - that
are consistently failing with rather bizarre error messages. Are
these real failures or are those machines just messed up?)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

#34Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Magnus Hagander (#32)
Re: Keepalives win32

Magnus Hagander <magnus@hagander.net> wrote:

It'll now set them to 2 hours/1 second by default. I looked
quickly at the RFC and didn't find the exact values there, so those
values are the documented out-of-the-box defaults on Windows. I
can easily change them to RFC values if someone can find them for
me :)

The RFC specifies 2 hours/75 seconds/9 tries. Even though we can't
reasonably adjust the number of tries up from 5 in Windows, I'd be
inclined to keep the 75 interval, rather than doubling it.

-Kevin

#35Andrew Dunstan
andrew@dunslane.net
In reply to: Robert Haas (#33)
Re: Keepalives win32

Robert Haas wrote:

(BTW, there are two buildfarm machines - wigeon and orangutan - that
are consistently failing with rather bizarre error messages. Are
these real failures or are those machines just messed up?)

Dave and Scott,

please investigate these errors in your buildfarm members.

cheers

andrew

#36Magnus Hagander
magnus@hagander.net
In reply to: Robert Haas (#33)
Re: Keepalives win32

On Wed, Jul 7, 2010 at 15:32, Robert Haas <robertmhaas@gmail.com> wrote:

On Wed, Jul 7, 2010 at 9:20 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Wed, Jun 30, 2010 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:

"Kevin Grittner" <Kevin.Grittner@wicourts.gov> writes:

I also think we may want to suggest that for most environments,
people may want to change these settings to something more
aggressive, like a 30 to 120 second initial delay, with a 10 or 20
second retry interval.  The RFC defaults seem approximately right
for a TCP connection to a colony on the surface of the moon, where
besides the round trip latency of 2.5 seconds they might have to pay
by the byte.

Well, the RFCs were definitely written at a time when bandwidth was a
lot more expensive than it is today.

In other words, it is *so* conservative that I have
trouble seeing it ever causing a problem compared to not having
keepalive enabled, but it will eventually clean things up.

Yes.  This is a large part of the reason why I think it's okay for us to
turn libpq keepalive on by default in 9.0 --- the default parameters for
it are so conservative as to be unlikely to cause trouble.  If Windows
isn't using RFC-equivalent default parameters, that seems like a good
reason to disregard the system settings and force use of the RFC values
as defaults.

Here's an updated version of the patch, which includes server side
functionality. I took out the code that tried to"be smart". It'll now
set them to 2 hours/1 second by default. I looked quickly at the RFC
and didn't find the exact values there, so those values are the
documented out-of-the-box defaults on Windows. I can easily change
them to RFC values if someone can find them for me :)

It's also merged with roberts macos patch, since they were conflicting.

Doc changes not included, but I'll get those in before commit.

Comments?

Looks generally OK, though my knowledge of Windows is pretty limited.
We'd better get this committed PDQ if it's going into beta3, else
there won't be a full buildfarm cycle before we wrap.

Committed.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/