PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
The new connection made by PQcancel does not use the tcp_user_timeout, connect_timeout or any of the keepalive settings that are provided in the connection string. This means that a call to PQcancel can block for a much longer time than intended if there are network issues. This can be especially impactful, because PQcancel is a blocking function an there is no non blocking version of it.
I attached a proposed patch to use the tcp_user_timeout from the connection string when connecting to Postgres in PQcancel. This resolves the issue for me, since this will make connecting timeout after a configurable time. So the other options are not strictly needed. It might still be nice for completeness to support them too though. I didn't do this yet, because I first wanted some feedback and also because implementing connect_timeout would require using non blocking TCP to connect and then use select to have a timeout.
Attachments:
0001-Use-tcp_user_timeout-in-PQcancel.patchapplication/octet-stream; name=0001-Use-tcp_user_timeout-in-PQcancel.patchDownload
From 0b8e21375ce3b55048503c51948c979cd443613c Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Thu, 30 Sep 2021 16:29:53 +0200
Subject: [PATCH] Use tcp_user_timeout in PQcancel
PGcancel would not adhere to the timeout specified in the
tcp_user_timeout connection option. Which means a call to PGcancel could
take much longer than expected. This can especially be an issue because
there's no non blocking version of PGcancel.
---
src/interfaces/libpq/fe-connect.c | 26 +++++++++++++++++++++++---
src/interfaces/libpq/libpq-int.h | 1 +
2 files changed, 24 insertions(+), 3 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 56755f0796..0c1ec9d64d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4363,13 +4363,19 @@ PQgetCancel(PGconn *conn)
if (conn->sock == PGINVALID_SOCKET)
return NULL;
- cancel = malloc(sizeof(PGcancel));
+ cancel = calloc(1, sizeof(PGcancel));
if (cancel == NULL)
return NULL;
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ if (conn->pgtcp_user_timeout != NULL) {
+ if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
+ "tcp_user_timeout")) {
+ return NULL;
+ }
+ }
return cancel;
}
@@ -4404,7 +4410,7 @@ PQfreeCancel(PGcancel *cancel)
* between the two versions of the cancel function possible.
*/
static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
+internal_cancel(SockAddr *raddr, int be_pid, int be_key, int pgtcp_user_timeout,
char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
@@ -4426,6 +4432,19 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+#ifdef TCP_USER_TIMEOUT
+ if (!IS_AF_UNIX(raddr->addr.ss_family)) {
+ if (pgtcp_user_timeout < 0) {
+ pgtcp_user_timeout = 0;
+ }
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ (char *) &pgtcp_user_timeout, sizeof(pgtcp_user_timeout)) < 0) {
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
retry3:
if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
raddr->salen) < 0)
@@ -4517,6 +4536,7 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
}
return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
+ cancel->pgtcp_user_timeout,
errbuf, errbufsize);
}
@@ -4551,7 +4571,7 @@ PQrequestCancel(PGconn *conn)
return false;
}
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
+ r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key, 0,
conn->errorMessage.data, conn->errorMessage.maxlen);
if (!r)
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 334aea4b6e..ef27a0bf23 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -581,6 +581,7 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
};
--
2.17.1
On Thu, Sep 30, 2021 at 7:45 AM Jelte Fennema <Jelte.Fennema@microsoft.com>
wrote:
The new connection made by PQcancel does not use the tcp_user_timeout,
connect_timeout or any of the keepalive settings that are provided in the
connection string. This means that a call to PQcancel can block for a much
longer time than intended if there are network issues. This can be
especially impactful, because PQcancel is a blocking function an there is
no non blocking version of it.I attached a proposed patch to use the tcp_user_timeout from the
connection string when connecting to Postgres in PQcancel. This resolves
the issue for me, since this will make connecting timeout after a
configurable time. So the other options are not strictly needed. It might
still be nice for completeness to support them too though. I didn't do this
yet, because I first wanted some feedback and also because implementing
connect_timeout would require using non blocking TCP to connect and then
use select to have a timeout.
Hi,
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
The other field names are quite short. How about naming the field
tcp_timeout ?
Cheers
We actually ran into an issue caused by this in production, where a PQcancel connection was open on the client for a 2+ days because the server had restarted at the wrong moment in the cancel handshake. The client was now indefinitely waiting for the server to send an EOF back, and because keepalives were not enabled on this socket it was never closed.
I attached an updated patch which also uses the keepalive settings in PQ. The connect_timeout is a bit harder to get it to work. As far as I can tell it would require something like this. https://stackoverflow.com/a/2597774/2570866
The other field names are quite short. How about naming the field tcp_timeout ?
I kept the same names as in the pg_conn struct for consistency sake.
Ugh forgot to attach the patch. Here it is.
________________________________
From: Jelte Fennema <Jelte.Fennema@microsoft.com>
Sent: Wednesday, October 6, 2021 21:56
To: Zhihong Yu <zyu@yugabyte.com>
Cc: pgsql-hackers@postgresql.org <pgsql-hackers@postgresql.org>
Subject: Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings
We actually ran into an issue caused by this in production, where a PQcancel connection was open on the client for a 2+ days because the server had restarted at the wrong moment in the cancel handshake. The client was now indefinitely waiting for the server to send an EOF back, and because keepalives were not enabled on this socket it was never closed.
I attached an updated patch which also uses the keepalive settings in PQ. The connect_timeout is a bit harder to get it to work. As far as I can tell it would require something like this. https://stackoverflow.com/a/2597774/2570866
The other field names are quite short. How about naming the field tcp_timeout ?
I kept the same names as in the pg_conn struct for consistency sake.
Attachments:
0001-Use-tcp_user_timeout-and-keepalives-in-PQcancel.patchapplication/octet-stream; name=0001-Use-tcp_user_timeout-and-keepalives-in-PQcancel.patchDownload
From 92f8fb427fe8b9dc6c5fcfffe51421e6814b9750 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Thu, 30 Sep 2021 16:29:53 +0200
Subject: [PATCH] Use tcp_user_timeout and keepalives in PQcancel
PGcancel would not adhere to the timeout specified in the
tcp_user_timeout connection option. It would also not enable keepalives
on the socket. So no keepalives would be used by the on these
connections at all, not even the system configured ones.
This means a call to PGcancel could take much longer than expected. This
can especially be an issue because there's no non blocking version of
PGcancel.
---
src/interfaces/libpq/fe-connect.c | 177 +++++++++++++++++++++++++++---
src/interfaces/libpq/libpq-int.h | 8 ++
2 files changed, 171 insertions(+), 14 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 56755f0796..5f0f0a5945 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4363,13 +4363,57 @@ PQgetCancel(PGconn *conn)
if (conn->sock == PGINVALID_SOCKET)
return NULL;
- cancel = malloc(sizeof(PGcancel));
+ cancel = calloc(1, sizeof(PGcancel));
if (cancel == NULL)
return NULL;
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ // Set all the socket options to -1, so we can use that to see if the
+ // socket options are set or not.
+ cancel->connect_timeout = -1;
+ cancel->pgtcp_user_timeout = -1;
+ cancel->keepalives = -1;
+ cancel->keepalives_idle = -1;
+ cancel->keepalives_interval = -1;
+ cancel->keepalives_count = -1;
+ if (conn->connect_timeout != NULL) {
+ if (!parse_int_param(conn->connect_timeout, &cancel->connect_timeout, conn,
+ "connect_timeout")) {
+ return NULL;
+ }
+ }
+ if (conn->pgtcp_user_timeout != NULL) {
+ if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
+ "tcp_user_timeout")) {
+ return NULL;
+ }
+ }
+ if (conn->keepalives != NULL) {
+ if (!parse_int_param(conn->keepalives, &cancel->keepalives, conn,
+ "keepalives")) {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_idle != NULL) {
+ if (!parse_int_param(conn->keepalives_idle, &cancel->keepalives_idle, conn,
+ "keepalives_idle")) {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_interval != NULL) {
+ if (!parse_int_param(conn->keepalives_interval, &cancel->keepalives_interval, conn,
+ "keepalives_interval")) {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_count != NULL) {
+ if (!parse_int_param(conn->keepalives_count, &cancel->keepalives_count, conn,
+ "keepalives_count")) {
+ return NULL;
+ }
+ }
return cancel;
}
@@ -4404,8 +4448,7 @@ PQfreeCancel(PGcancel *cancel)
* between the two versions of the cancel function possible.
*/
static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
- char *errbuf, int errbufsize)
+internal_cancel(PGcancel *cancel, char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
@@ -4421,14 +4464,110 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
- if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
+ if ((tmpsock = socket(cancel->raddr.addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+ if (!IS_AF_UNIX(cancel->raddr.addr.ss_family)) {
+#ifndef WIN32
+#ifdef TCP_USER_TIMEOUT
+ if (cancel->pgtcp_user_timeout >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ (char *) &cancel->pgtcp_user_timeout,
+ sizeof(cancel->pgtcp_user_timeout)) < 0) {
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+ if (cancel->keepalives != 0) {
+ int on = 1;
+ if (setsockopt(tmpsock,
+ SOL_SOCKET, SO_KEEPALIVE,
+ (char *) &on, sizeof(on)) < 0)
+ {
+ goto cancel_errReturn;
+ }
+ }
+
+#ifdef PG_TCP_KEEPALIVE_IDLE
+ if (cancel->keepalives_idle >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, PG_TCP_KEEPALIVE_IDLE,
+ (char *) &cancel->keepalives_idle,
+ sizeof(cancel->keepalives_idle)) < 0)
+ {
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+#ifdef TCP_KEEPINTVL
+ if (cancel->keepalives_interval >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPINTVL,
+ (char *) &cancel->keepalives_interval,
+ sizeof(cancel->keepalives_interval)) < 0)
+ {
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+#ifdef TCP_KEEPCNT
+ if (cancel->keepalives_count >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPCNT,
+ (char *) &cancel->keepalives_count,
+ sizeof(cancel->keepalives_count)) < 0)
+ {
+ goto cancel_errReturn;
+ }
+ }
+#endif
+#else
+ if (cancel->keepalives != 0) {
+ int idle = cancel->keepalives_idle;
+ int interval = cancel->keepalives_interval;
+ if (idle <= 0)
+ idle = 2 * 60 * 60; /* 2 hours = default */
+
+ if (conn->keepalives_interval &&
+ !parse_int_param(conn->keepalives_interval, &interval, conn,
+ "keepalives_interval"))
+ return 0;
+ if (interval <= 0)
+ interval = 1; /* 1 second = default */
+
+ ka.onoff = 1;
+ ka.keepalivetime = idle * 1000;
+ ka.keepaliveinterval = interval * 1000;
+
+ if (WSAIoctl(conn->sock,
+ SIO_KEEPALIVE_VALS,
+ (LPVOID) &ka,
+ sizeof(ka),
+ NULL,
+ 0,
+ &retsize,
+ NULL,
+ NULL)
+ != 0)
+ {
+ appendPQExpBuffer(&conn->errorMessage,
+ libpq_gettext("%s(%s) failed: error code %d\n"),
+ "WSAIoctl", "SIO_KEEPALIVE_VALS",
+ WSAGetLastError());
+ return 0;
+ }
+ }
+
+#endif
+ }
+
+
retry3:
- if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
- raddr->salen) < 0)
+ if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
+ cancel->raddr.salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
@@ -4445,8 +4584,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
- crp.cp.backendPID = pg_hton32(be_pid);
- crp.cp.cancelAuthCode = pg_hton32(be_key);
+ crp.cp.backendPID = pg_hton32(cancel->be_pid);
+ crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
@@ -4516,8 +4655,7 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
return false;
}
- return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
- errbuf, errbufsize);
+ return internal_cancel(cancel, errbuf, errbufsize);
}
/*
@@ -4550,10 +4688,21 @@ PQrequestCancel(PGconn *conn)
return false;
}
-
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
- conn->errorMessage.data, conn->errorMessage.maxlen);
-
+ PGcancel cancel = {0};
+
+ memcpy(&cancel.raddr, &conn->raddr, sizeof(SockAddr));
+ cancel.be_pid = conn->be_pid;
+ cancel.be_key = conn->be_key;
+ // Set all the socket options to -1, so we can use that to see if the
+ // socket options are set or not.
+ cancel.connect_timeout = -1;
+ cancel.pgtcp_user_timeout = -1;
+ cancel.keepalives = -1;
+ cancel.keepalives_idle = -1;
+ cancel.keepalives_interval = -1;
+ cancel.keepalives_count = -1;
+
+ r = internal_cancel(&cancel, conn->errorMessage.data, conn->errorMessage.maxlen);
if (!r)
conn->errorMessage.len = strlen(conn->errorMessage.data);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 334aea4b6e..661c175f14 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -581,6 +581,14 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int connect_timeout; /* connection timeout */
+ int pgtcp_user_timeout; /* tcp user timeout */
+ int keepalives; /* use TCP keepalives? */
+ int keepalives_idle; /* time between TCP keepalives */
+ int keepalives_interval; /* time between TCP keepalive
+ * retransmits */
+ int keepalives_count; /* maximum number of TCP keepalive
+ * retransmits */
};
--
2.17.1
On 2021/10/07 4:58, Jelte Fennema wrote:
Ugh forgot to attach the patch. Here it is.
Thanks for working on this patch!
@@ -4546,10 +4684,21 @@ PQrequestCancel(PGconn *conn)
return false;
}
-
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
Since PQrequestCancel() is marked as deprecated, I don't think that
we need to add the feature into it.
+ if (cancel->pgtcp_user_timeout >= 0) {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ (char *) &cancel->pgtcp_user_timeout,
+ sizeof(cancel->pgtcp_user_timeout)) < 0) {
+ goto cancel_errReturn;
+ }
+ }
libpq has already setKeepalivesXXX() functions to do the almost same thing.
Isn't it better to modify and reuse them instead of adding the almost
same code, to simplify the code?
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
Fujii Masao <masao.fujii@oss.nttdata.com> writes:
Since PQrequestCancel() is marked as deprecated, I don't think that
we need to add the feature into it.
Not absolutely necessary, agreed, but it looks like it's pretty
easy to make that happen, so why not? I'd suggest dropping the
separate implementation and turning PQrequestCancel() into a
thin wrapper around PQgetCancel, PQcancel, PQfreeCancel.
libpq has already setKeepalivesXXX() functions to do the almost same thing.
Isn't it better to modify and reuse them instead of adding the almost
same code, to simplify the code?
I find this patch fairly scary, because it's apparently been coded
with little regard for the expectation that PQcancel can be called
within a signal handler.
I see that setsockopt(2) is specified to be async signal safe by
POSIX, so at least in principle it is okay to add those calls to
PQcancel. But you've got to be really careful what else you do in
there. You can NOT use appendPQExpBuffer. You can NOT access the
PGconn (I don't think the Windows part of this even compiles ...
nope, it doesn't, per the cfbot). I'm not sure that WSAIoctl
is safe to use in a signal handler, so on the whole I think
I'd drop the Windows-specific chunk altogether. But in any case,
I'm very strongly against calling out to other libpq code from here,
because then the signal-safety restrictions start applying to that
other code too, and that's a recipe for trouble in future.
The patch could use a little attention to conforming to PG coding
conventions (comment style, brace style, C99 declarations are all
wrong --- pgindent would fix much of that, but maybe not in a way
you like). The lack of comments about why it's doing what it's doing
needs to be rectified, too. Why are these specific options important
and not any others?
regards, tom lane
I'd suggest dropping the separate implementation and turning
PQrequestCancel() into a thin wrapper around PQgetCancel,
PQcancel, PQfreeCancel.
Fine by me. I didn't want to change behavior of a deprecated
function.
I find this patch fairly scary, because it's apparently been coded
with little regard for the expectation that PQcancel can be called
within a signal handler.
You can NOT use appendPQExpBuffer. You can NOT access the
PGconn (I don't think the Windows part of this even compiles ...
nope, it doesn't, per the cfbot).
I guess I was tired or at least inattentive when I added the windows part at the end
of my coding session. It really was the Linux part that I cared about. For that part
I definitely took care to make the code signal safe. Which is also why I did not call out to
any of the existing functions, like setKeepalivesXXX(). I don't think I'm the right person
to write the windows code for this (I have zero C windows experience). So, if it's not
required for this patch to be accepted I'll happily remove it.
The patch could use a little attention to conforming to PG coding
conventions (comment style, brace style, C99 declarations are all
wrong --- pgindent would fix much of that, but maybe not in a way
you like).
Sure, I'll run pgindent for my next version of the patch.
The lack of comments about why it's doing what it's doing
needs to be rectified, too. Why are these specific options important
and not any others?
I'll make sure to add comments before the final version of this patch. This
patch was more meant as a draft to gauge if this was even the correct way of fixing
this problem.
To be honest I think it would make sense to add a new PQcancel function that is not
required to be signal safe and reuses regular connection setup code. This would make sure
options like this are supported automatically in the future. Another advantage is that it would
allow for sending cancel messages in a non-blocking way. So, you would be able to easily
send multiple cancels in a concurrent way. It looks to me like PQcancel is mostly designed
the way it is to keep it easy for psql to send cancelations. I think many other uses of PQcancel
don't require it to be signal safe at all (at least for Citus its usage signal safety is not required).
I was able to spend some time on this again. I attached two patches to this email:
The first patch is a cleaned up version of my previous patch. I think I addressed
all feedback on the previous version in that patch (e.g. removed windows code,
fixed formatting).
The second patch is a new one, it implements honouring of the connect_timeout
connection option in PQcancel. This patch requires the first patch to also be applied,
but since it seemed fairly separate and the code is not trivial I didn't want the first
patch to be blocked on this.
Finally, I would love it if once these fixes are merged the would also be backpatched to
previous versions of libpq. Does that seem possible? As far as I can tell it would be fine,
since it doesn't really change any of the public APIs. The only change is that the pg_cancel
struct now has a few additional fields. But since that struct is defined in libpq-int.h, so that
struct should not be used by users of libpq directly, right?.
Attachments:
0001-Use-timeout-socket-options-for-cancel-connections.patchapplication/octet-stream; name=0001-Use-timeout-socket-options-for-cancel-connections.patchDownload
From afd1d9ec2fceda6fdac8a4c5e9885a6db115df97 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Tue, 28 Dec 2021 12:46:35 +0100
Subject: [PATCH 1/2] Use timeout socket options for cancel connections
PQcancel would not adhere to the timeout specified in the
tcp_user_timeout connection option. It would also not enable keepalives
on the socket. So no keepalives would be used by the on these
connections at all, not even the system configured ones.
This means a call to PQcancel could take much longer than expected,
possibly indefinitely. This can especially be an issue because there's
no non-blocking version of PQcancel.
This patch correctly sets both tcp_user_timeout and keepalive related
socket options on the connection that's used to cancel a query. It only
does so for non Windows systems, because the signal safety of the
related windows functions is not clear to me.
---
src/interfaces/libpq/fe-connect.c | 159 +++++++++++++++++++++++++++---
src/interfaces/libpq/libpq-int.h | 7 ++
2 files changed, 154 insertions(+), 12 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9b6a6939f0..ed29c64484 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4385,13 +4385,60 @@ PQgetCancel(PGconn *conn)
if (conn->sock == PGINVALID_SOCKET)
return NULL;
- cancel = malloc(sizeof(PGcancel));
+ cancel = calloc(1, sizeof(PGcancel));
if (cancel == NULL)
return NULL;
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ /* Set all the socket options to -1, so we can use that to see if the */
+ /* socket options are set or not. */
+ cancel->pgtcp_user_timeout = -1;
+ cancel->keepalives = -1;
+ cancel->keepalives_idle = -1;
+ cancel->keepalives_interval = -1;
+ cancel->keepalives_count = -1;
+ if (conn->pgtcp_user_timeout != NULL)
+ {
+ if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
+ "tcp_user_timeout"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives != NULL)
+ {
+ if (!parse_int_param(conn->keepalives, &cancel->keepalives, conn,
+ "keepalives"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_idle != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_idle, &cancel->keepalives_idle, conn,
+ "keepalives_idle"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_interval != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_interval, &cancel->keepalives_interval, conn,
+ "keepalives_interval"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_count != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_count, &cancel->keepalives_count, conn,
+ "keepalives_count"))
+ {
+ return NULL;
+ }
+ }
return cancel;
}
@@ -4426,8 +4473,7 @@ PQfreeCancel(PGcancel *cancel)
* between the two versions of the cancel function possible.
*/
static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
- char *errbuf, int errbufsize)
+internal_cancel(PGcancel *cancel, char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
@@ -4443,14 +4489,102 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
- if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
+ if ((tmpsock = socket(cancel->raddr.addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+ /*
+ * Since this connection will only be used to send a tiny bit of data, we
+ * don't need NODELAY. We also don't set the socket to nonblocking mode,
+ * because the API definition of PQcancel currently requires the cancel to
+ * be sent in a blocking way.
+ *
+ * We do set various other socket options related to keepalives and other
+ * TCP timeouts. This is necessary to make sure that the call to this
+ * function does not block indefinitly, when reasonable keepalive and
+ * settings have been provided.
+ *
+ * NOTE: These socket options are currently not set for Windows. The
+ * reason is that signal safety in this function is very important, and it
+ * was not clear to if the functions required to set the socket options on
+ * Windows were signal-safe.
+ */
+#ifndef WIN32
+ if (!IS_AF_UNIX(cancel->raddr.addr.ss_family))
+ {
+#ifdef TCP_USER_TIMEOUT
+ if (cancel->pgtcp_user_timeout >= 0)
+ {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ (char *) &cancel->pgtcp_user_timeout,
+ sizeof(cancel->pgtcp_user_timeout)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+ if (cancel->keepalives != 0)
+ {
+ int on = 1;
+
+ if (setsockopt(tmpsock,
+ SOL_SOCKET, SO_KEEPALIVE,
+ (char *) &on, sizeof(on)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+
+#ifdef PG_TCP_KEEPALIVE_IDLE
+ if (cancel->keepalives_idle >= 0)
+ {
+ if (setsockopt(tmpsock, IPPROTO_TCP, PG_TCP_KEEPALIVE_IDLE,
+ (char *) &cancel->keepalives_idle,
+ sizeof(cancel->keepalives_idle)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(PG_TPC_KEEPALIVE_IDLE) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+#ifdef TCP_KEEPINTVL
+ if (cancel->keepalives_interval >= 0)
+ {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPINTVL,
+ (char *) &cancel->keepalives_interval,
+ sizeof(cancel->keepalives_interval)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TPC_KEEPINTVL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
+
+#ifdef TCP_KEEPCNT
+ if (cancel->keepalives_count >= 0)
+ {
+ if (setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPCNT,
+ (char *) &cancel->keepalives_count,
+ sizeof(cancel->keepalives_count)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TPC_KEEPCNT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
+ }
+#endif
+
+
retry3:
- if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
- raddr->salen) < 0)
+ if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
+ cancel->raddr.salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
@@ -4467,8 +4601,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
- crp.cp.backendPID = pg_hton32(be_pid);
- crp.cp.cancelAuthCode = pg_hton32(be_key);
+ crp.cp.backendPID = pg_hton32(cancel->be_pid);
+ crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
@@ -4538,8 +4672,7 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
return false;
}
- return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
- errbuf, errbufsize);
+ return internal_cancel(cancel, errbuf, errbufsize);
}
/*
@@ -4558,6 +4691,7 @@ int
PQrequestCancel(PGconn *conn)
{
int r;
+ PGcancel *cancel;
/* Check we have an open connection */
if (!conn)
@@ -4573,8 +4707,9 @@ PQrequestCancel(PGconn *conn)
return false;
}
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
- conn->errorMessage.data, conn->errorMessage.maxlen);
+ cancel = PQgetCancel(conn);
+ r = PQcancel(cancel, conn->errorMessage.data, conn->errorMessage.maxlen);
+ PQfreeCancel(cancel);
if (!r)
conn->errorMessage.len = strlen(conn->errorMessage.data);
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 490458adef..6dabb14451 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,13 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
+ int keepalives; /* use TCP keepalives? */
+ int keepalives_idle; /* time between TCP keepalives */
+ int keepalives_interval; /* time between TCP keepalive
+ * retransmits */
+ int keepalives_count; /* maximum number of TCP keepalive
+ * retransmits */
};
--
2.17.1
0002-Honor-connect_timeout-when-connecting-with-PQcancel.patchapplication/octet-stream; name=0002-Honor-connect_timeout-when-connecting-with-PQcancel.patchDownload
From 78b1e7fc2bcef3d9b1b85b88387e24918411b2e9 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Tue, 28 Dec 2021 14:29:51 +0100
Subject: [PATCH 2/2] Honor connect_timeout when connecting with PQcancel
PQcancel uses its own connection mechanism, because it needs to be
signal safe. This meant that it was not honoring connect_timeout
connection option. This change fixes that for non windows environments,
by using the timeout option of select() to ensure a connect_timeout.
---
src/interfaces/libpq/fe-connect.c | 128 +++++++++++++++++++++++++++---
src/interfaces/libpq/libpq-int.h | 1 +
2 files changed, 118 insertions(+), 11 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index ed29c64484..61185db329 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4394,11 +4394,20 @@ PQgetCancel(PGconn *conn)
cancel->be_key = conn->be_key;
/* Set all the socket options to -1, so we can use that to see if the */
/* socket options are set or not. */
+ cancel->connect_timeout = -1;
cancel->pgtcp_user_timeout = -1;
cancel->keepalives = -1;
cancel->keepalives_idle = -1;
cancel->keepalives_interval = -1;
cancel->keepalives_count = -1;
+ if (conn->connect_timeout != NULL)
+ {
+ if (!parse_int_param(conn->connect_timeout, &cancel->connect_timeout, conn,
+ "connect_timeout"))
+ {
+ return NULL;
+ }
+ }
if (conn->pgtcp_user_timeout != NULL)
{
if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
@@ -4479,6 +4488,14 @@ internal_cancel(PGcancel *cancel, char *errbuf, int errbufsize)
pgsocket tmpsock = PGINVALID_SOCKET;
char sebuf[PG_STRERROR_R_BUFLEN];
int maxlen;
+#ifndef WIN32
+ int tmpsockflags = 0;
+ fd_set fdset;
+ bool nonblocking_connect = cancel->connect_timeout > 0;
+ struct timeval connect_timeout;
+#else
+ bool nonblocking_connect = false;
+#endif
struct
{
uint32 packetlen;
@@ -4581,21 +4598,110 @@ internal_cancel(PGcancel *cancel, char *errbuf, int errbufsize)
}
#endif
+#ifndef WIN32
+ if (cancel->connect_timeout > 0)
+ {
+ /*
+ * The connect() system call does not allow a timeout to be specified.
+ * So to honor the connect_timeout that the user specified, we connect
+ * in non blocking mode. Then we provide the connecti select() to
+ */
+ nonblocking_connect = true;
+ tmpsockflags = fcntl(tmpsock, F_GETFL);
+ if (tmpsockflags < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_GETFL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ if (fcntl(tmpsock, F_SETFL, (tmpsockflags | O_NONBLOCK)) == -1)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_SETFL, O_NONBLOCK) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
retry3:
if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
cancel->raddr.salen) < 0)
{
- if (SOCK_ERRNO == EINTR)
- /* Interrupted system call - we'll just try again */
+ if (nonblocking_connect && (
+ SOCK_ERRNO == EINPROGRESS ||
+#ifdef WIN32
+ SOCK_ERRNO == EWOULDBLOCK ||
+#endif
+ SOCK_ERRNO == EINTR))
+ {
+ /*
+ * This is fine - we're in non-blocking mode, and the connection
+ * is in progress.
+ */
+ }
+ else if (SOCK_ERRNO == EINTR)
+ {
+ /*
+ * Interrupted system call while in blocking mode - we'll just try
+ * again
+ */
goto retry3;
- strlcpy(errbuf, "PQcancel() -- connect() failed: ", errbufsize);
- goto cancel_errReturn;
+ }
+ else
+ {
+ strlcpy(errbuf, "PQcancel() -- connect() failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
}
- /*
- * We needn't set nonblocking I/O or NODELAY options here.
- */
+
+#ifndef WIN32
+ if (cancel->connect_timeout > 0)
+ {
+ connect_timeout.tv_sec = cancel->connect_timeout;
+ connect_timeout.tv_usec = 0;
+
+ FD_ZERO(&fdset);
+ FD_SET(tmpsock, &fdset);
+retry4:
+ if (select(tmpsock + 1, NULL, &fdset, NULL, &connect_timeout) == 1)
+ {
+ int so_error;
+ socklen_t len = sizeof so_error;
+
+ if (SOCK_ERRNO == EINTR)
+ /* Interrupted system call - we'll just try again */
+ goto retry4;
+ if (getsockopt(tmpsock, SOL_SOCKET, SO_ERROR, &so_error, &len))
+ {
+ strlcpy(errbuf, "PQcancel() -- getsockopt(SO_ERROR) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ if (so_error != 0)
+ {
+ SOCK_ERRNO = so_error;
+ strlcpy(errbuf, "PQcancel() -- select() failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+ else
+ {
+ strlcpy(errbuf, "PQcancel() -- Connection timed out\n", errbufsize);
+ closesocket(tmpsock);
+ SOCK_ERRNO_SET(save_errno);
+ return false;
+ }
+
+ /*
+ * Now that we're connected we want a blocking socket again. So we
+ * reset the flags to the value they had before changing to non
+ * blocking mode.
+ */
+ if (fcntl(tmpsock, F_SETFL, tmpsockflags) == -1)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_SETFL, !O_NONBLOCK) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
/* Create and send the cancel request packet. */
@@ -4604,12 +4710,12 @@ retry3:
crp.cp.backendPID = pg_hton32(cancel->be_pid);
crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
-retry4:
+retry5:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
- goto retry4;
+ goto retry5;
strlcpy(errbuf, "PQcancel() -- send() failed: ", errbufsize);
goto cancel_errReturn;
}
@@ -4621,12 +4727,12 @@ retry4:
* one we thought we were canceling. Note we don't actually expect this
* read to obtain any data, we are just waiting for EOF to be signaled.
*/
-retry5:
+retry6:
if (recv(tmpsock, (char *) &crp, 1, 0) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
- goto retry5;
+ goto retry6;
/* we ignore other error conditions */
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 6dabb14451..c08ecbd06f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,7 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int connect_timeout; /* connection timeout */
int pgtcp_user_timeout; /* tcp user timeout */
int keepalives; /* use TCP keepalives? */
int keepalives_idle; /* time between TCP keepalives */
--
2.17.1
Hi,
On 2021-12-28 15:49:00 +0000, Jelte Fennema wrote:
The first patch is a cleaned up version of my previous patch. I think I addressed
all feedback on the previous version in that patch (e.g. removed windows code,
fixed formatting).
To me it seems a bit problematic to introduce a divergence between windows /
everything else here. Isn't that just going to lead to other complaints just
like this thread, where somebody discovered the hard way that there's platform
dependent behaviour here?
The second patch is a new one, it implements honouring of the connect_timeout
connection option in PQcancel. This patch requires the first patch to also be applied,
but since it seemed fairly separate and the code is not trivial I didn't want the first
patch to be blocked on this.Finally, I would love it if once these fixes are merged the would also be backpatched to
previous versions of libpq. Does that seem possible? As far as I can tell it would be fine,
since it doesn't really change any of the public APIs. The only change is that the pg_cancel
struct now has a few additional fields. But since that struct is defined in libpq-int.h, so that
struct should not be used by users of libpq directly, right?.
I'm not really convinced this is a good patch to backpatch. There does seem to
be some potential for subtle breakage - code in signal handlers is notoriously
finnicky, it's a rarely exercised code path, etc. It's also not fixing
something that previously worked.
+ * NOTE: These socket options are currently not set for Windows. The + * reason is that signal safety in this function is very important, and it + * was not clear to if the functions required to set the socket options on + * Windows were signal-safe. + */ +#ifndef WIN32 + if (!IS_AF_UNIX(cancel->raddr.addr.ss_family)) + { +#ifdef TCP_USER_TIMEOUT + if (cancel->pgtcp_user_timeout >= 0) + { + if (setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT, + (char *) &cancel->pgtcp_user_timeout, + sizeof(cancel->pgtcp_user_timeout)) < 0) + { + strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize); + goto cancel_errReturn; + } + } +#endif + + if (cancel->keepalives != 0) + { + int on = 1; + + if (setsockopt(tmpsock, + SOL_SOCKET, SO_KEEPALIVE, + (char *) &on, sizeof(on)) < 0) + { + strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize); + goto cancel_errReturn; + } + }
This is very repetitive - how about introducing a helper function for this?
@@ -4467,8 +4601,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp)); crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE); - crp.cp.backendPID = pg_hton32(be_pid); - crp.cp.cancelAuthCode = pg_hton32(be_key); + crp.cp.backendPID = pg_hton32(cancel->be_pid); + crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
Others might differ, but I'd separate changing the type passed to
internal_cancel() into its own commit.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2021-12-28 15:49:00 +0000, Jelte Fennema wrote:
Finally, I would love it if once these fixes are merged the would also be backpatched to
previous versions of libpq.
I'm not really convinced this is a good patch to backpatch. There does seem to
be some potential for subtle breakage - code in signal handlers is notoriously
finnicky, it's a rarely exercised code path, etc. It's also not fixing
something that previously worked.
IMO, this is a new feature not a bug fix, and as such it's not backpatch
material ... especially if we don't have 100.00% confidence in it.
regards, tom lane
Attached are 3 patches that address the feedback from Andres about code duplication
and splitting up commits. I completely removed internal_cancel now, since it only had
one caller at this point.
IMO, this is a new feature not a bug fix
IMO this is definitely a bugfix. Nowhere in the libpq docs it stated that the connection
options in question do not apply to connections that are opened for cancellations. So
as a user I definitely expect that any connections that libpq opens would use these options.
Which is why I definitely consider it a bug that they are currently not honoured for cancel
requests.
However, even though I think it's a bugfix, I can understand the being hesitant to
backport this. IMHO in that case at least the docs should be updated to explain this
discrepancy. I attached a patch to do so against the docs on the REL_14_STABLE branch.
To me it seems a bit problematic to introduce a divergence between windows /
everything else here. Isn't that just going to lead to other complaints just
like this thread, where somebody discovered the hard way that there's platform
dependent behaviour here?
Of course, fixing this also for windows would be much better. There's two problems:
1. I cannot find any clear documentation on which functions are signal safe in Windows
and which are not. The only reference I can find is this: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/signal?view=msvc-170
However, this says that you should not use any function that generates a system call.
PQcancel is obviously already violating that when calling "connect", so this is not very helpful.
2. My Windows C experience is non existent, so I don't think I would be the right person to write this code.
IMO blocking this bugfix, because it does not fix it for Windows, would be an example of perfect
becoming the enemy of good. One thing I could do is add a note to the docs that these options
are not supported on Windows for cancellation requests (similar to my proposed doc change
for PG14 and below).
Attachments:
0001-Refactor-to-remove-internal_cancel-helper.patchapplication/octet-stream; name=0001-Refactor-to-remove-internal_cancel-helper.patchDownload
From c7eee6f7c3f5426964c07c6bd9a44247c2017d4e Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Thu, 6 Jan 2022 15:11:22 +0100
Subject: [PATCH 1/3] Refactor to remove internal_cancel helper
To simplify future changes to PQcancel this removes the internal_cancel
helper function. Now PQrequestCancel calls PQcancel internally, instead
of having both PQcancel and PQrequestCancel call the shared
internal_cancel helper function.
---
src/interfaces/libpq/fe-connect.c | 58 ++++++++++++-------------------
1 file changed, 22 insertions(+), 36 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9b6a6939f0..85fce856a5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4406,14 +4406,17 @@ PQfreeCancel(PGcancel *cancel)
/*
- * PQcancel and PQrequestCancel: attempt to request cancellation of the
- * current operation.
+ * PQcancel: request query cancel
*
* The return value is true if the cancel request was successfully
* dispatched, false if not (in which case an error message is available).
* Note: successful dispatch is no guarantee that there will be any effect at
* the backend. The application must read the operation result as usual.
*
+ * On failure, an error message is stored in *errbuf, which must be of size
+ * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
+ * success return.
+ *
* CAUTION: we want this routine to be safely callable from a signal handler
* (for example, an application might want to call it in a SIGINT handler).
* This means we cannot use any C library routine that might be non-reentrant.
@@ -4421,13 +4424,9 @@ PQfreeCancel(PGcancel *cancel)
* just as dangerous. We avoid sprintf here for that reason. Building up
* error messages with strcpy/strcat is tedious but should be quite safe.
* We also save/restore errno in case the signal handler support doesn't.
- *
- * internal_cancel() is an internal helper function to make code-sharing
- * between the two versions of the cancel function possible.
*/
-static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
- char *errbuf, int errbufsize)
+int
+PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
@@ -4439,18 +4438,24 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
CancelRequestPacket cp;
} crp;
+ if (!cancel)
+ {
+ strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
+ return false;
+ }
+
/*
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
- if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
+ if ((tmpsock = socket(cancel->raddr.addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
retry3:
- if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
- raddr->salen) < 0)
+ if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
+ cancel->raddr.salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
@@ -4467,8 +4472,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
- crp.cp.backendPID = pg_hton32(be_pid);
- crp.cp.cancelAuthCode = pg_hton32(be_key);
+ crp.cp.backendPID = pg_hton32(cancel->be_pid);
+ crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
@@ -4520,27 +4525,6 @@ cancel_errReturn:
return false;
}
-/*
- * PQcancel: request query cancel
- *
- * Returns true if able to send the cancel request, false if not.
- *
- * On failure, an error message is stored in *errbuf, which must be of size
- * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
- * success return.
- */
-int
-PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
-{
- if (!cancel)
- {
- strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
- return false;
- }
-
- return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
- errbuf, errbufsize);
-}
/*
* PQrequestCancel: old, not thread-safe function for requesting query cancel
@@ -4558,6 +4542,7 @@ int
PQrequestCancel(PGconn *conn)
{
int r;
+ PGcancel *cancel;
/* Check we have an open connection */
if (!conn)
@@ -4573,8 +4558,9 @@ PQrequestCancel(PGconn *conn)
return false;
}
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
- conn->errorMessage.data, conn->errorMessage.maxlen);
+ cancel = PQgetCancel(conn);
+ r = PQcancel(cancel, conn->errorMessage.data, conn->errorMessage.maxlen);
+ PQfreeCancel(cancel);
if (!r)
conn->errorMessage.len = strlen(conn->errorMessage.data);
--
2.17.1
0002-Use-timeout-socket-options-for-cancel-connections.patchapplication/octet-stream; name=0002-Use-timeout-socket-options-for-cancel-connections.patchDownload
From f53bb6b006ddd92798b3176a876e0cca6d238a69 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Thu, 6 Jan 2022 15:39:27 +0100
Subject: [PATCH 2/3] Use timeout socket options for cancel connections
PQcancel would not adhere to the timeout specified in the
tcp_user_timeout connection option. It would also not enable keepalives
on the socket. So no keepalives would be used by the on these
connections at all, not even the system configured ones.
This means a call to PQcancel could take much longer than expected,
possibly indefinitely. This can especially be an issue because there's
no non-blocking version of PQcancel.
This patch correctly sets both tcp_user_timeout and keepalive related
socket options on the connection that's used to cancel a query. It only
does so for non Windows systems, because the signal safety of the
related windows functions is not clear to me.
---
src/interfaces/libpq/fe-connect.c | 135 +++++++++++++++++++++++++++++-
src/interfaces/libpq/libpq-int.h | 7 ++
2 files changed, 141 insertions(+), 1 deletion(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 85fce856a5..49d183d13c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4385,13 +4385,60 @@ PQgetCancel(PGconn *conn)
if (conn->sock == PGINVALID_SOCKET)
return NULL;
- cancel = malloc(sizeof(PGcancel));
+ cancel = calloc(1, sizeof(PGcancel));
if (cancel == NULL)
return NULL;
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ /* Set all the socket options to -1, so we can use that to see if the */
+ /* socket options are set or not. */
+ cancel->pgtcp_user_timeout = -1;
+ cancel->keepalives = -1;
+ cancel->keepalives_idle = -1;
+ cancel->keepalives_interval = -1;
+ cancel->keepalives_count = -1;
+ if (conn->pgtcp_user_timeout != NULL)
+ {
+ if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
+ "tcp_user_timeout"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives != NULL)
+ {
+ if (!parse_int_param(conn->keepalives, &cancel->keepalives, conn,
+ "keepalives"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_idle != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_idle, &cancel->keepalives_idle, conn,
+ "keepalives_idle"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_interval != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_interval, &cancel->keepalives_interval, conn,
+ "keepalives_interval"))
+ {
+ return NULL;
+ }
+ }
+ if (conn->keepalives_count != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_count, &cancel->keepalives_count, conn,
+ "keepalives_count"))
+ {
+ return NULL;
+ }
+ }
return cancel;
}
@@ -4405,6 +4452,28 @@ PQfreeCancel(PGcancel *cancel)
}
+/*
+ * Sets an integer based socket options on TCP socket if the provided value is
+ * not negative. Returns false if setsockopt fails for some reason.
+ *
+ * CAUTION: This needs to be signal safe, since it's used by PQcancel.
+ */
+static bool
+set_optional_int_tcp_sockopt(int fd, int optname, int value) {
+ if (value < 0)
+ {
+ return true;
+ }
+ if (setsockopt(fd, IPPROTO_TCP, optname,
+ (char *) &value,
+ sizeof(value)) < 0)
+ {
+ return false;
+ }
+ return true;
+}
+
+
/*
* PQcancel: request query cancel
*
@@ -4453,6 +4522,70 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+ /*
+ * Since this connection will only be used to send a tiny bit of data, we
+ * don't need NODELAY. We also don't set the socket to nonblocking mode,
+ * because the API definition of PQcancel currently requires the cancel to
+ * be sent in a blocking way.
+ *
+ * We do set various other socket options related to keepalives and other
+ * TCP timeouts. This is necessary to make sure that the call to this
+ * function does not block indefinitly, when reasonable keepalive and
+ * settings have been provided.
+ *
+ * NOTE: These socket options are currently not set for Windows. The
+ * reason is that signal safety in this function is very important, and it
+ * was not clear to if the functions required to set the socket options on
+ * Windows were signal-safe.
+ */
+#ifndef WIN32
+ if (!IS_AF_UNIX(cancel->raddr.addr.ss_family))
+ {
+#ifdef TCP_USER_TIMEOUT
+ if (!set_optional_int_tcp_sockopt(tmpsock, TCP_USER_TIMEOUT, cancel->pgtcp_user_timeout)) {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+ if (cancel->keepalives != 0)
+ {
+ int on = 1;
+
+ if (setsockopt(tmpsock,
+ SOL_SOCKET, SO_KEEPALIVE,
+ (char *) &on, sizeof(on)) < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+
+#ifdef PG_TCP_KEEPALIVE_IDLE
+ if (!set_optional_int_tcp_sockopt(tmpsock, PG_TCP_KEEPALIVE_IDLE, cancel->keepalives_idle)) {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(" PG_TCP_KEEPALIVE_IDLE_STR ") failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPINTVL
+ if (!set_optional_int_tcp_sockopt(tmpsock, TCP_KEEPINTVL, cancel->keepalives_interval)) {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPINTVL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPCNT
+ if (!set_optional_int_tcp_sockopt(tmpsock, TCP_KEEPCNT, cancel->keepalives_count)) {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPCNT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+ }
+#endif
+
+
retry3:
if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
cancel->raddr.salen) < 0)
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 490458adef..6dabb14451 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,13 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
+ int keepalives; /* use TCP keepalives? */
+ int keepalives_idle; /* time between TCP keepalives */
+ int keepalives_interval; /* time between TCP keepalive
+ * retransmits */
+ int keepalives_count; /* maximum number of TCP keepalive
+ * retransmits */
};
--
2.17.1
0003-Honor-connect_timeout-when-connecting-with-PQcancel.patchapplication/octet-stream; name=0003-Honor-connect_timeout-when-connecting-with-PQcancel.patchDownload
From f3e52a2541836c396bf5d8c3822bff9cfca59e9a Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Tue, 28 Dec 2021 14:29:51 +0100
Subject: [PATCH 3/3] Honor connect_timeout when connecting with PQcancel
PQcancel uses its own connection mechanism, because it needs to be
signal safe. This meant that it was not honoring connect_timeout
connection option. This change fixes that for non windows environments,
by using the timeout option of select() to ensure a connect_timeout.
---
src/interfaces/libpq/fe-connect.c | 128 +++++++++++++++++++++++++++---
src/interfaces/libpq/libpq-int.h | 1 +
2 files changed, 118 insertions(+), 11 deletions(-)
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 49d183d13c..d23d544cad 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4394,11 +4394,20 @@ PQgetCancel(PGconn *conn)
cancel->be_key = conn->be_key;
/* Set all the socket options to -1, so we can use that to see if the */
/* socket options are set or not. */
+ cancel->connect_timeout = -1;
cancel->pgtcp_user_timeout = -1;
cancel->keepalives = -1;
cancel->keepalives_idle = -1;
cancel->keepalives_interval = -1;
cancel->keepalives_count = -1;
+ if (conn->connect_timeout != NULL)
+ {
+ if (!parse_int_param(conn->connect_timeout, &cancel->connect_timeout, conn,
+ "connect_timeout"))
+ {
+ return NULL;
+ }
+ }
if (conn->pgtcp_user_timeout != NULL)
{
if (!parse_int_param(conn->pgtcp_user_timeout, &cancel->pgtcp_user_timeout, conn,
@@ -4501,6 +4510,14 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
pgsocket tmpsock = PGINVALID_SOCKET;
char sebuf[PG_STRERROR_R_BUFLEN];
int maxlen;
+#ifndef WIN32
+ int tmpsockflags = 0;
+ fd_set fdset;
+ bool nonblocking_connect = cancel->connect_timeout > 0;
+ struct timeval connect_timeout;
+#else
+ bool nonblocking_connect = false;
+#endif
struct
{
uint32 packetlen;
@@ -4585,21 +4602,110 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
}
#endif
+#ifndef WIN32
+ if (cancel->connect_timeout > 0)
+ {
+ /*
+ * The connect() system call does not allow a timeout to be specified.
+ * So to honor the connect_timeout that the user specified, we connect
+ * in non blocking mode. Then we provide the connecti select() to
+ */
+ nonblocking_connect = true;
+ tmpsockflags = fcntl(tmpsock, F_GETFL);
+ if (tmpsockflags < 0)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_GETFL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ if (fcntl(tmpsock, F_SETFL, (tmpsockflags | O_NONBLOCK)) == -1)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_SETFL, O_NONBLOCK) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
retry3:
if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
cancel->raddr.salen) < 0)
{
- if (SOCK_ERRNO == EINTR)
- /* Interrupted system call - we'll just try again */
+ if (nonblocking_connect && (
+ SOCK_ERRNO == EINPROGRESS ||
+#ifdef WIN32
+ SOCK_ERRNO == EWOULDBLOCK ||
+#endif
+ SOCK_ERRNO == EINTR))
+ {
+ /*
+ * This is fine - we're in non-blocking mode, and the connection
+ * is in progress.
+ */
+ }
+ else if (SOCK_ERRNO == EINTR)
+ {
+ /*
+ * Interrupted system call while in blocking mode - we'll just try
+ * again
+ */
goto retry3;
- strlcpy(errbuf, "PQcancel() -- connect() failed: ", errbufsize);
- goto cancel_errReturn;
+ }
+ else
+ {
+ strlcpy(errbuf, "PQcancel() -- connect() failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
}
- /*
- * We needn't set nonblocking I/O or NODELAY options here.
- */
+
+#ifndef WIN32
+ if (cancel->connect_timeout > 0)
+ {
+ connect_timeout.tv_sec = cancel->connect_timeout;
+ connect_timeout.tv_usec = 0;
+
+ FD_ZERO(&fdset);
+ FD_SET(tmpsock, &fdset);
+retry4:
+ if (select(tmpsock + 1, NULL, &fdset, NULL, &connect_timeout) == 1)
+ {
+ int so_error;
+ socklen_t len = sizeof so_error;
+
+ if (SOCK_ERRNO == EINTR)
+ /* Interrupted system call - we'll just try again */
+ goto retry4;
+ if (getsockopt(tmpsock, SOL_SOCKET, SO_ERROR, &so_error, &len))
+ {
+ strlcpy(errbuf, "PQcancel() -- getsockopt(SO_ERROR) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ if (so_error != 0)
+ {
+ SOCK_ERRNO = so_error;
+ strlcpy(errbuf, "PQcancel() -- select() failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+ else
+ {
+ strlcpy(errbuf, "PQcancel() -- Connection timed out\n", errbufsize);
+ closesocket(tmpsock);
+ SOCK_ERRNO_SET(save_errno);
+ return false;
+ }
+
+ /*
+ * Now that we're connected we want a blocking socket again. So we
+ * reset the flags to the value they had before changing to non
+ * blocking mode.
+ */
+ if (fcntl(tmpsock, F_SETFL, tmpsockflags) == -1)
+ {
+ strlcpy(errbuf, "PQcancel() -- fcntl(F_SETFL, !O_NONBLOCK) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+ }
+#endif
/* Create and send the cancel request packet. */
@@ -4608,12 +4714,12 @@ retry3:
crp.cp.backendPID = pg_hton32(cancel->be_pid);
crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
-retry4:
+retry5:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
- goto retry4;
+ goto retry5;
strlcpy(errbuf, "PQcancel() -- send() failed: ", errbufsize);
goto cancel_errReturn;
}
@@ -4625,12 +4731,12 @@ retry4:
* one we thought we were canceling. Note we don't actually expect this
* read to obtain any data, we are just waiting for EOF to be signaled.
*/
-retry5:
+retry6:
if (recv(tmpsock, (char *) &crp, 1, 0) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
- goto retry5;
+ goto retry6;
/* we ignore other error conditions */
}
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 6dabb14451..c08ecbd06f 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,7 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int connect_timeout; /* connection timeout */
int pgtcp_user_timeout; /* tcp user timeout */
int keepalives; /* use TCP keepalives? */
int keepalives_idle; /* time between TCP keepalives */
--
2.17.1
0001-Doc-explain-that-connection-options-don-t-apply-to-c.patchapplication/octet-stream; name=0001-Doc-explain-that-connection-options-don-t-apply-to-c.patchDownload
From e474b8ba1effa93754bbe8bfdcbd61ab7e512e27 Mon Sep 17 00:00:00 2001
From: Jelte Fennema <jelte.fennema@microsoft.com>
Date: Thu, 6 Jan 2022 16:45:25 +0100
Subject: [PATCH] Doc: explain that connection options don't apply to
cancellations
---
doc/src/sgml/libpq.sgml | 18 ++++++++++++++----
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 14f35d37f6..f3a860aaa3 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1255,7 +1255,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
For example, if you specify two hosts and <literal>connect_timeout</literal>
is 5, each host will time out if no connection is made within 5
seconds, so the total time spent waiting for a connection might be
- up to 10 seconds.
+ up to 10 seconds. This paramater is ignored for connections that are
+ used to send cancellation requests.
</para>
</listitem>
</varlistentry>
@@ -1324,7 +1325,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
Controls whether client-side TCP keepalives are used. The default
value is 1, meaning on, but you can change this to 0, meaning off,
if keepalives are not wanted. This parameter is ignored for
- connections made via a Unix-domain socket.
+ connections made via a Unix-domain socket. This paramater is also
+ ignored for connections that are used to send cancellation requests.
</para>
</listitem>
</varlistentry>
@@ -1336,7 +1338,9 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
Controls the number of seconds of inactivity after which TCP should
send a keepalive message to the server. A value of zero uses the
system default. This parameter is ignored for connections made via a
- Unix-domain socket, or if keepalives are disabled.
+ Unix-domain socket, or if keepalives are disabled. This paramater is
+ also ignored for connections that are used to send cancellation
+ requests.
It is only supported on systems where <symbol>TCP_KEEPIDLE</symbol> or
an equivalent socket option is available, and on Windows; on other
systems, it has no effect.
@@ -1352,6 +1356,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
that is not acknowledged by the server should be retransmitted. A
value of zero uses the system default. This parameter is ignored for
connections made via a Unix-domain socket, or if keepalives are disabled.
+ This paramater is also ignored for connections that are used to send
+ cancellation requests.
It is only supported on systems where <symbol>TCP_KEEPINTVL</symbol> or
an equivalent socket option is available, and on Windows; on other
systems, it has no effect.
@@ -1367,6 +1373,8 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
client's connection to the server is considered dead. A value of
zero uses the system default. This parameter is ignored for
connections made via a Unix-domain socket, or if keepalives are disabled.
+ This paramater is also ignored for connections that are used to send
+ cancellation requests.
It is only supported on systems where <symbol>TCP_KEEPCNT</symbol> or
an equivalent socket option is available; on other systems, it has no
effect.
@@ -1381,7 +1389,9 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
Controls the number of milliseconds that transmitted data may
remain unacknowledged before a connection is forcibly closed.
A value of zero uses the system default. This parameter is
- ignored for connections made via a Unix-domain socket.
+ ignored for connections made via a Unix-domain socket. This paramater
+ is also ignored for connections that are used to send cancellation
+ requests.
It is only supported on systems where <symbol>TCP_USER_TIMEOUT</symbol>
is available; on other systems, it has no effect.
</para>
--
2.17.1
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
Attached are 3 patches that address the feedback from Andres about code duplication
and splitting up commits. I completely removed internal_cancel now, since it only had
one caller at this point.
Here's some cleaned-up versions of 0001 and 0002. I have not bothered
with 0003 because I for one will not be convinced to commit it. The
risk-vs-reward ratio looks far too high on that, and it's not obvious
why 0002 doesn't already solve whatever problem there is.
A couple of notes:
0001 introduces malloc/free into PQrequestCancel, which promotes it
from being "probably unsafe in a signal handler" to "definitely unsafe
in a signal handler". Before, maybe you could have gotten away with
it if you were sure the PGconn object was idle, but now it's no-go for
sure. I don't have a big problem with that, given that it's been
deprecated for decades, but I made the warning text in libpq.sgml a
little stronger.
As for 0002, I don't see a really good reason why we shouldn't try
to do it on Windows too. If connect() will work, then it seems
likely that setsockopt() and WSAIOCtl() will too. Moreover, note
that at least in our own programs, PQcancel doesn't *really* run in a
signal handler on Windows: see commentary in src/fe_utils/cancel.c.
(The fact that we now have test coverage for PQcancel makes me a lot
more willing to try this than I might otherwise be. Will be
interested to see the cfbot's results.)
Also, I was somewhat surprised while working on this to realize
that PQconnectPoll doesn't call setTCPUserTimeout if keepalives
are disabled per useKeepalives(). I made PQcancel act the same,
but I wonder if that was intentional or a bug. I'd personally
have thought that those settings were independent.
regards, tom lane
Attachments:
v2-0001-Refactor-to-remove-internal_cancel-helper.patchtext/x-diff; charset=us-ascii; name=v2-0001-Refactor-to-remove-internal_cancel-helper.patchDownload
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 14f35d37f6..e0ab7cd555 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -5707,8 +5707,8 @@ int PQrequestCancel(PGconn *conn);
<structname>PGconn</structname> object, and in case of failure stores the
error message in the <structname>PGconn</structname> object (whence it can
be retrieved by <xref linkend="libpq-PQerrorMessage"/>). Although
- the functionality is the same, this approach creates hazards for
- multiple-thread programs and signal handlers, since it is possible
+ the functionality is the same, this approach is not safe within
+ multiple-thread programs or signal handlers, since it is possible
that overwriting the <structname>PGconn</structname>'s error message will
mess up the operation currently in progress on the connection.
</para>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 5fc16be849..9e75abd304 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4394,14 +4394,17 @@ PQfreeCancel(PGcancel *cancel)
/*
- * PQcancel and PQrequestCancel: attempt to request cancellation of the
- * current operation.
+ * PQcancel: request query cancel
*
* The return value is true if the cancel request was successfully
* dispatched, false if not (in which case an error message is available).
* Note: successful dispatch is no guarantee that there will be any effect at
* the backend. The application must read the operation result as usual.
*
+ * On failure, an error message is stored in *errbuf, which must be of size
+ * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
+ * success return.
+ *
* CAUTION: we want this routine to be safely callable from a signal handler
* (for example, an application might want to call it in a SIGINT handler).
* This means we cannot use any C library routine that might be non-reentrant.
@@ -4409,13 +4412,9 @@ PQfreeCancel(PGcancel *cancel)
* just as dangerous. We avoid sprintf here for that reason. Building up
* error messages with strcpy/strcat is tedious but should be quite safe.
* We also save/restore errno in case the signal handler support doesn't.
- *
- * internal_cancel() is an internal helper function to make code-sharing
- * between the two versions of the cancel function possible.
*/
-static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
- char *errbuf, int errbufsize)
+int
+PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
@@ -4427,18 +4426,26 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
CancelRequestPacket cp;
} crp;
+ if (!cancel)
+ {
+ strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
+ /* strlcpy probably doesn't change errno, but be paranoid */
+ SOCK_ERRNO_SET(save_errno);
+ return false;
+ }
+
/*
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
- if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
+ if ((tmpsock = socket(cancel->raddr.addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
retry3:
- if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
- raddr->salen) < 0)
+ if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
+ cancel->raddr.salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
@@ -4455,8 +4462,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
- crp.cp.backendPID = pg_hton32(be_pid);
- crp.cp.cancelAuthCode = pg_hton32(be_key);
+ crp.cp.backendPID = pg_hton32(cancel->be_pid);
+ crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
@@ -4508,27 +4515,6 @@ cancel_errReturn:
return false;
}
-/*
- * PQcancel: request query cancel
- *
- * Returns true if able to send the cancel request, false if not.
- *
- * On failure, an error message is stored in *errbuf, which must be of size
- * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
- * success return.
- */
-int
-PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
-{
- if (!cancel)
- {
- strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
- return false;
- }
-
- return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
- errbuf, errbufsize);
-}
/*
* PQrequestCancel: old, not thread-safe function for requesting query cancel
@@ -4546,6 +4532,7 @@ int
PQrequestCancel(PGconn *conn)
{
int r;
+ PGcancel *cancel;
/* Check we have an open connection */
if (!conn)
@@ -4561,8 +4548,19 @@ PQrequestCancel(PGconn *conn)
return false;
}
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
- conn->errorMessage.data, conn->errorMessage.maxlen);
+ cancel = PQgetCancel(conn);
+ if (cancel)
+ {
+ r = PQcancel(cancel, conn->errorMessage.data,
+ conn->errorMessage.maxlen);
+ PQfreeCancel(cancel);
+ }
+ else
+ {
+ strlcpy(conn->errorMessage.data, "out of memory",
+ conn->errorMessage.maxlen);
+ r = false;
+ }
if (!r)
conn->errorMessage.len = strlen(conn->errorMessage.data);
v2-0002-Use-timeout-socket-options-for-cancel-connections.patchtext/x-diff; charset=us-ascii; name*0=v2-0002-Use-timeout-socket-options-for-cancel-connections.p; name*1=atchDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9e75abd304..a4a559904d 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1944,26 +1944,17 @@ setKeepalivesCount(PGconn *conn)
/*
* Enable keepalives and set the keepalive values on Win32,
* where they are always set in one batch.
+ *
+ * setKeepalivesWin32 is used by PQcancel, and had better be signal safe.
*/
static int
-setKeepalivesWin32(PGconn *conn)
+setKeepalivesWin32(pgsocket sock, int idle, int interval)
{
struct tcp_keepalive ka;
DWORD retsize;
- int idle = 0;
- int interval = 0;
- if (conn->keepalives_idle &&
- !parse_int_param(conn->keepalives_idle, &idle, conn,
- "keepalives_idle"))
- return 0;
if (idle <= 0)
idle = 2 * 60 * 60; /* 2 hours = default */
-
- if (conn->keepalives_interval &&
- !parse_int_param(conn->keepalives_interval, &interval, conn,
- "keepalives_interval"))
- return 0;
if (interval <= 0)
interval = 1; /* 1 second = default */
@@ -1971,7 +1962,7 @@ setKeepalivesWin32(PGconn *conn)
ka.keepalivetime = idle * 1000;
ka.keepaliveinterval = interval * 1000;
- if (WSAIoctl(conn->sock,
+ if (WSAIoctl(sock,
SIO_KEEPALIVE_VALS,
(LPVOID) &ka,
sizeof(ka),
@@ -1981,6 +1972,26 @@ setKeepalivesWin32(PGconn *conn)
NULL,
NULL)
!= 0)
+ return 0;
+ return 1;
+}
+
+static int
+prepKeepalivesWin32(PGconn *conn)
+{
+ int idle = -1;
+ int interval = -1;
+
+ if (conn->keepalives_idle &&
+ !parse_int_param(conn->keepalives_idle, &idle, conn,
+ "keepalives_idle"))
+ return 0;
+ if (conn->keepalives_interval &&
+ !parse_int_param(conn->keepalives_interval, &interval, conn,
+ "keepalives_interval"))
+ return 0;
+
+ if (!setKeepalivesWin32(conn->sock, idle, interval))
{
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("%s(%s) failed: error code %d\n"),
@@ -2644,7 +2655,7 @@ keep_going: /* We will come back to here until there is
err = 1;
#else /* WIN32 */
#ifdef SIO_KEEPALIVE_VALS
- else if (!setKeepalivesWin32(conn))
+ else if (!prepKeepalivesWin32(conn))
err = 1;
#endif /* SIO_KEEPALIVE_VALS */
#endif /* WIN32 */
@@ -4380,8 +4391,53 @@ PQgetCancel(PGconn *conn)
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ /* We use -1 to indicate an unset connection option */
+ cancel->pgtcp_user_timeout = -1;
+ cancel->keepalives = -1;
+ cancel->keepalives_idle = -1;
+ cancel->keepalives_interval = -1;
+ cancel->keepalives_count = -1;
+ if (conn->pgtcp_user_timeout != NULL)
+ {
+ if (!parse_int_param(conn->pgtcp_user_timeout,
+ &cancel->pgtcp_user_timeout,
+ conn, "tcp_user_timeout"))
+ goto fail;
+ }
+ if (conn->keepalives != NULL)
+ {
+ if (!parse_int_param(conn->keepalives,
+ &cancel->keepalives,
+ conn, "keepalives"))
+ goto fail;
+ }
+ if (conn->keepalives_idle != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_idle,
+ &cancel->keepalives_idle,
+ conn, "keepalives_idle"))
+ goto fail;
+ }
+ if (conn->keepalives_interval != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_interval,
+ &cancel->keepalives_interval,
+ conn, "keepalives_interval"))
+ goto fail;
+ }
+ if (conn->keepalives_count != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_count,
+ &cancel->keepalives_count,
+ conn, "keepalives_count"))
+ goto fail;
+ }
return cancel;
+
+fail:
+ free(cancel);
+ return NULL;
}
/* PQfreeCancel: free a cancel structure */
@@ -4393,6 +4449,23 @@ PQfreeCancel(PGcancel *cancel)
}
+/*
+ * Sets an integer socket option on a TCP socket, if the provided value is
+ * not negative. Returns false if setsockopt fails for some reason.
+ *
+ * CAUTION: This needs to be signal safe, since it's used by PQcancel.
+ */
+static bool
+optional_setsockopt(int fd, int protoid, int optid, int value)
+{
+ if (value < 0)
+ return true;
+ if (setsockopt(fd, protoid, optid, (char *) &value, sizeof(value)) < 0)
+ return false;
+ return true;
+}
+
+
/*
* PQcancel: request query cancel
*
@@ -4443,6 +4516,78 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+ /*
+ * Since this connection will only be used to send a tiny bit of data, we
+ * don't need NODELAY. We also don't set the socket to nonblocking mode,
+ * because the API definition of PQcancel requires the cancel to be sent
+ * in a blocking way.
+ *
+ * We do set socket options related to keepalives and other TCP timeouts.
+ * This ensures that this function does not block indefinitely when
+ * reasonable keepalive and timeout settings have been provided.
+ */
+ if (!IS_AF_UNIX(cancel->raddr.addr.ss_family) &&
+ cancel->keepalives != 0)
+ {
+#ifndef WIN32
+ if (!optional_setsockopt(tmpsock, SOL_SOCKET, SO_KEEPALIVE, 1))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+
+#ifdef PG_TCP_KEEPALIVE_IDLE
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, PG_TCP_KEEPALIVE_IDLE,
+ cancel->keepalives_idle))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(" PG_TCP_KEEPALIVE_IDLE_STR ") failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPINTVL
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPINTVL,
+ cancel->keepalives_interval))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPINTVL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPCNT
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPCNT,
+ cancel->keepalives_count))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPCNT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#else /* WIN32 */
+
+#ifdef SIO_KEEPALIVE_VALS
+ if (!setKeepalivesWin32(tmpsock,
+ cancel->keepalives_idle,
+ cancel->keepalives_interval))
+ {
+ strlcpy(errbuf, "PQcancel() -- WSAIoctl(SIO_KEEPALIVE_VALS) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif /* SIO_KEEPALIVE_VALS */
+#endif /* WIN32 */
+
+ /* TCP_USER_TIMEOUT works the same way on Unix and Windows */
+#ifdef TCP_USER_TIMEOUT
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ cancel->pgtcp_user_timeout))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+ }
+
retry3:
if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
cancel->raddr.salen) < 0)
@@ -4454,10 +4599,6 @@ retry3:
goto cancel_errReturn;
}
- /*
- * We needn't set nonblocking I/O or NODELAY options here.
- */
-
/* Create and send the cancel request packet. */
crp.packetlen = pg_hton32((uint32) sizeof(crp));
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index fcce13843e..4290553482 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,13 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
+ int keepalives; /* use TCP keepalives? */
+ int keepalives_idle; /* time between TCP keepalives */
+ int keepalives_interval; /* time between TCP keepalive
+ * retransmits */
+ int keepalives_count; /* maximum number of TCP keepalive
+ * retransmits */
};
I wrote:
(The fact that we now have test coverage for PQcancel makes me a lot
more willing to try this than I might otherwise be. Will be
interested to see the cfbot's results.)
On closer look, I'm not sure that psql/t/020_cancel.pl is actually doing
anything on Windows; the cfbot's test transcript says it ran without
skipping, but looking at the test itself, it seems like it would skip on
Windows.
Meanwhile, the warnings build noticed that we need to #ifdef
out the optional_setsockopt function in some cases. Revised
0002 attached.
regards, tom lane
Attachments:
v2-0001-Refactor-to-remove-internal_cancel-helper.patchtext/x-diff; charset=us-ascii; name=v2-0001-Refactor-to-remove-internal_cancel-helper.patchDownload
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 14f35d37f6..e0ab7cd555 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -5707,8 +5707,8 @@ int PQrequestCancel(PGconn *conn);
<structname>PGconn</structname> object, and in case of failure stores the
error message in the <structname>PGconn</structname> object (whence it can
be retrieved by <xref linkend="libpq-PQerrorMessage"/>). Although
- the functionality is the same, this approach creates hazards for
- multiple-thread programs and signal handlers, since it is possible
+ the functionality is the same, this approach is not safe within
+ multiple-thread programs or signal handlers, since it is possible
that overwriting the <structname>PGconn</structname>'s error message will
mess up the operation currently in progress on the connection.
</para>
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 5fc16be849..9e75abd304 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4394,14 +4394,17 @@ PQfreeCancel(PGcancel *cancel)
/*
- * PQcancel and PQrequestCancel: attempt to request cancellation of the
- * current operation.
+ * PQcancel: request query cancel
*
* The return value is true if the cancel request was successfully
* dispatched, false if not (in which case an error message is available).
* Note: successful dispatch is no guarantee that there will be any effect at
* the backend. The application must read the operation result as usual.
*
+ * On failure, an error message is stored in *errbuf, which must be of size
+ * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
+ * success return.
+ *
* CAUTION: we want this routine to be safely callable from a signal handler
* (for example, an application might want to call it in a SIGINT handler).
* This means we cannot use any C library routine that might be non-reentrant.
@@ -4409,13 +4412,9 @@ PQfreeCancel(PGcancel *cancel)
* just as dangerous. We avoid sprintf here for that reason. Building up
* error messages with strcpy/strcat is tedious but should be quite safe.
* We also save/restore errno in case the signal handler support doesn't.
- *
- * internal_cancel() is an internal helper function to make code-sharing
- * between the two versions of the cancel function possible.
*/
-static int
-internal_cancel(SockAddr *raddr, int be_pid, int be_key,
- char *errbuf, int errbufsize)
+int
+PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
{
int save_errno = SOCK_ERRNO;
pgsocket tmpsock = PGINVALID_SOCKET;
@@ -4427,18 +4426,26 @@ internal_cancel(SockAddr *raddr, int be_pid, int be_key,
CancelRequestPacket cp;
} crp;
+ if (!cancel)
+ {
+ strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
+ /* strlcpy probably doesn't change errno, but be paranoid */
+ SOCK_ERRNO_SET(save_errno);
+ return false;
+ }
+
/*
* We need to open a temporary connection to the postmaster. Do this with
* only kernel calls.
*/
- if ((tmpsock = socket(raddr->addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
+ if ((tmpsock = socket(cancel->raddr.addr.ss_family, SOCK_STREAM, 0)) == PGINVALID_SOCKET)
{
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
retry3:
- if (connect(tmpsock, (struct sockaddr *) &raddr->addr,
- raddr->salen) < 0)
+ if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
+ cancel->raddr.salen) < 0)
{
if (SOCK_ERRNO == EINTR)
/* Interrupted system call - we'll just try again */
@@ -4455,8 +4462,8 @@ retry3:
crp.packetlen = pg_hton32((uint32) sizeof(crp));
crp.cp.cancelRequestCode = (MsgType) pg_hton32(CANCEL_REQUEST_CODE);
- crp.cp.backendPID = pg_hton32(be_pid);
- crp.cp.cancelAuthCode = pg_hton32(be_key);
+ crp.cp.backendPID = pg_hton32(cancel->be_pid);
+ crp.cp.cancelAuthCode = pg_hton32(cancel->be_key);
retry4:
if (send(tmpsock, (char *) &crp, sizeof(crp), 0) != (int) sizeof(crp))
@@ -4508,27 +4515,6 @@ cancel_errReturn:
return false;
}
-/*
- * PQcancel: request query cancel
- *
- * Returns true if able to send the cancel request, false if not.
- *
- * On failure, an error message is stored in *errbuf, which must be of size
- * errbufsize (recommended size is 256 bytes). *errbuf is not changed on
- * success return.
- */
-int
-PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
-{
- if (!cancel)
- {
- strlcpy(errbuf, "PQcancel() -- no cancel object supplied", errbufsize);
- return false;
- }
-
- return internal_cancel(&cancel->raddr, cancel->be_pid, cancel->be_key,
- errbuf, errbufsize);
-}
/*
* PQrequestCancel: old, not thread-safe function for requesting query cancel
@@ -4546,6 +4532,7 @@ int
PQrequestCancel(PGconn *conn)
{
int r;
+ PGcancel *cancel;
/* Check we have an open connection */
if (!conn)
@@ -4561,8 +4548,19 @@ PQrequestCancel(PGconn *conn)
return false;
}
- r = internal_cancel(&conn->raddr, conn->be_pid, conn->be_key,
- conn->errorMessage.data, conn->errorMessage.maxlen);
+ cancel = PQgetCancel(conn);
+ if (cancel)
+ {
+ r = PQcancel(cancel, conn->errorMessage.data,
+ conn->errorMessage.maxlen);
+ PQfreeCancel(cancel);
+ }
+ else
+ {
+ strlcpy(conn->errorMessage.data, "out of memory",
+ conn->errorMessage.maxlen);
+ r = false;
+ }
if (!r)
conn->errorMessage.len = strlen(conn->errorMessage.data);
v3-0002-Use-timeout-socket-options-for-cancel-connections.patchtext/x-diff; charset=us-ascii; name*0=v3-0002-Use-timeout-socket-options-for-cancel-connections.p; name*1=atchDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9e75abd304..fb3b1764fa 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1944,26 +1944,17 @@ setKeepalivesCount(PGconn *conn)
/*
* Enable keepalives and set the keepalive values on Win32,
* where they are always set in one batch.
+ *
+ * setKeepalivesWin32 is used by PQcancel, and had better be signal safe.
*/
static int
-setKeepalivesWin32(PGconn *conn)
+setKeepalivesWin32(pgsocket sock, int idle, int interval)
{
struct tcp_keepalive ka;
DWORD retsize;
- int idle = 0;
- int interval = 0;
- if (conn->keepalives_idle &&
- !parse_int_param(conn->keepalives_idle, &idle, conn,
- "keepalives_idle"))
- return 0;
if (idle <= 0)
idle = 2 * 60 * 60; /* 2 hours = default */
-
- if (conn->keepalives_interval &&
- !parse_int_param(conn->keepalives_interval, &interval, conn,
- "keepalives_interval"))
- return 0;
if (interval <= 0)
interval = 1; /* 1 second = default */
@@ -1971,7 +1962,7 @@ setKeepalivesWin32(PGconn *conn)
ka.keepalivetime = idle * 1000;
ka.keepaliveinterval = interval * 1000;
- if (WSAIoctl(conn->sock,
+ if (WSAIoctl(sock,
SIO_KEEPALIVE_VALS,
(LPVOID) &ka,
sizeof(ka),
@@ -1981,6 +1972,26 @@ setKeepalivesWin32(PGconn *conn)
NULL,
NULL)
!= 0)
+ return 0;
+ return 1;
+}
+
+static int
+prepKeepalivesWin32(PGconn *conn)
+{
+ int idle = -1;
+ int interval = -1;
+
+ if (conn->keepalives_idle &&
+ !parse_int_param(conn->keepalives_idle, &idle, conn,
+ "keepalives_idle"))
+ return 0;
+ if (conn->keepalives_interval &&
+ !parse_int_param(conn->keepalives_interval, &interval, conn,
+ "keepalives_interval"))
+ return 0;
+
+ if (!setKeepalivesWin32(conn->sock, idle, interval))
{
appendPQExpBuffer(&conn->errorMessage,
libpq_gettext("%s(%s) failed: error code %d\n"),
@@ -2644,7 +2655,7 @@ keep_going: /* We will come back to here until there is
err = 1;
#else /* WIN32 */
#ifdef SIO_KEEPALIVE_VALS
- else if (!setKeepalivesWin32(conn))
+ else if (!prepKeepalivesWin32(conn))
err = 1;
#endif /* SIO_KEEPALIVE_VALS */
#endif /* WIN32 */
@@ -4380,8 +4391,53 @@ PQgetCancel(PGconn *conn)
memcpy(&cancel->raddr, &conn->raddr, sizeof(SockAddr));
cancel->be_pid = conn->be_pid;
cancel->be_key = conn->be_key;
+ /* We use -1 to indicate an unset connection option */
+ cancel->pgtcp_user_timeout = -1;
+ cancel->keepalives = -1;
+ cancel->keepalives_idle = -1;
+ cancel->keepalives_interval = -1;
+ cancel->keepalives_count = -1;
+ if (conn->pgtcp_user_timeout != NULL)
+ {
+ if (!parse_int_param(conn->pgtcp_user_timeout,
+ &cancel->pgtcp_user_timeout,
+ conn, "tcp_user_timeout"))
+ goto fail;
+ }
+ if (conn->keepalives != NULL)
+ {
+ if (!parse_int_param(conn->keepalives,
+ &cancel->keepalives,
+ conn, "keepalives"))
+ goto fail;
+ }
+ if (conn->keepalives_idle != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_idle,
+ &cancel->keepalives_idle,
+ conn, "keepalives_idle"))
+ goto fail;
+ }
+ if (conn->keepalives_interval != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_interval,
+ &cancel->keepalives_interval,
+ conn, "keepalives_interval"))
+ goto fail;
+ }
+ if (conn->keepalives_count != NULL)
+ {
+ if (!parse_int_param(conn->keepalives_count,
+ &cancel->keepalives_count,
+ conn, "keepalives_count"))
+ goto fail;
+ }
return cancel;
+
+fail:
+ free(cancel);
+ return NULL;
}
/* PQfreeCancel: free a cancel structure */
@@ -4393,6 +4449,25 @@ PQfreeCancel(PGcancel *cancel)
}
+/*
+ * Sets an integer socket option on a TCP socket, if the provided value is
+ * not negative. Returns false if setsockopt fails for some reason.
+ *
+ * CAUTION: This needs to be signal safe, since it's used by PQcancel.
+ */
+#if defined(TCP_USER_TIMEOUT) || !defined(WIN32)
+static bool
+optional_setsockopt(int fd, int protoid, int optid, int value)
+{
+ if (value < 0)
+ return true;
+ if (setsockopt(fd, protoid, optid, (char *) &value, sizeof(value)) < 0)
+ return false;
+ return true;
+}
+#endif
+
+
/*
* PQcancel: request query cancel
*
@@ -4443,6 +4518,78 @@ PQcancel(PGcancel *cancel, char *errbuf, int errbufsize)
strlcpy(errbuf, "PQcancel() -- socket() failed: ", errbufsize);
goto cancel_errReturn;
}
+
+ /*
+ * Since this connection will only be used to send a tiny bit of data, we
+ * don't need NODELAY. We also don't set the socket to nonblocking mode,
+ * because the API definition of PQcancel requires the cancel to be sent
+ * in a blocking way.
+ *
+ * We do set socket options related to keepalives and other TCP timeouts.
+ * This ensures that this function does not block indefinitely when
+ * reasonable keepalive and timeout settings have been provided.
+ */
+ if (!IS_AF_UNIX(cancel->raddr.addr.ss_family) &&
+ cancel->keepalives != 0)
+ {
+#ifndef WIN32
+ if (!optional_setsockopt(tmpsock, SOL_SOCKET, SO_KEEPALIVE, 1))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(SO_KEEPALIVE) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+
+#ifdef PG_TCP_KEEPALIVE_IDLE
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, PG_TCP_KEEPALIVE_IDLE,
+ cancel->keepalives_idle))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(" PG_TCP_KEEPALIVE_IDLE_STR ") failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPINTVL
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPINTVL,
+ cancel->keepalives_interval))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPINTVL) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#ifdef TCP_KEEPCNT
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_KEEPCNT,
+ cancel->keepalives_count))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_KEEPCNT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+
+#else /* WIN32 */
+
+#ifdef SIO_KEEPALIVE_VALS
+ if (!setKeepalivesWin32(tmpsock,
+ cancel->keepalives_idle,
+ cancel->keepalives_interval))
+ {
+ strlcpy(errbuf, "PQcancel() -- WSAIoctl(SIO_KEEPALIVE_VALS) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif /* SIO_KEEPALIVE_VALS */
+#endif /* WIN32 */
+
+ /* TCP_USER_TIMEOUT works the same way on Unix and Windows */
+#ifdef TCP_USER_TIMEOUT
+ if (!optional_setsockopt(tmpsock, IPPROTO_TCP, TCP_USER_TIMEOUT,
+ cancel->pgtcp_user_timeout))
+ {
+ strlcpy(errbuf, "PQcancel() -- setsockopt(TCP_USER_TIMEOUT) failed: ", errbufsize);
+ goto cancel_errReturn;
+ }
+#endif
+ }
+
retry3:
if (connect(tmpsock, (struct sockaddr *) &cancel->raddr.addr,
cancel->raddr.salen) < 0)
@@ -4454,10 +4601,6 @@ retry3:
goto cancel_errReturn;
}
- /*
- * We needn't set nonblocking I/O or NODELAY options here.
- */
-
/* Create and send the cancel request packet. */
crp.packetlen = pg_hton32((uint32) sizeof(crp));
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index fcce13843e..4290553482 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -583,6 +583,13 @@ struct pg_cancel
SockAddr raddr; /* Remote address */
int be_pid; /* PID of backend --- needed for cancels */
int be_key; /* key of backend --- needed for cancels */
+ int pgtcp_user_timeout; /* tcp user timeout */
+ int keepalives; /* use TCP keepalives? */
+ int keepalives_idle; /* time between TCP keepalives */
+ int keepalives_interval; /* time between TCP keepalive
+ * retransmits */
+ int keepalives_count; /* maximum number of TCP keepalive
+ * retransmits */
};
... btw, speaking of signal-safe functions: I am dismayed to
notice that strerror (and strerror_r) are *not* in POSIX's
list of async-signal-safe functions. This is really quite
unsurprising, considering that they are chartered to return
locale-dependent strings. Unless the data has already been
collected in the current process, that'd imply reading something
from the locale definition files, allocating memory to hold it,
etc.
So I'm now thinking this bit in PQcancel is completely unsafe:
strncat(errbuf, SOCK_STRERROR(SOCK_ERRNO, sebuf, sizeof(sebuf)),
maxlen);
Seems we have to give up on providing any details beyond the
name of the function that failed.
regards, tom lane
Thanks for all the cleanup and adding of windows support. To me it now looks good to merge.
Meanwhile I've created another patch that adds, a non-blocking version of PQcancel to libpq.
Which doesn't have this problem by design, because it simply reuses the normal code for
connection establishement. And it also includes tests for PQcancel itself.
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
Thanks for all the cleanup and adding of windows support. To me it now looks good to merge.
I was about to commit this when I started to wonder if it actually does
anything useful. In particular, I read in the Linux tcp(7) man page
TCP_USER_TIMEOUT (since Linux 2.6.37)
...
This option can be set during any state of a TCP connection, but
is effective only during the synchronized states of a connection
(ESTABLISHED, FIN-WAIT-1, FIN-WAIT-2, CLOSE-WAIT, CLOSING, and
LAST-ACK).
ISTM that the case we care about is where the server fails to respond
to the TCP connection request. If it does so, it seems pretty unlikely
that it wouldn't then eat the small amount of data we're going to send.
While the keepalive options aren't so explicitly documented, I'd bet that
they too don't have any effect until the connection is known established.
So I'm unconvinced that setting these values really has much effect
to make PQcancel more robust (or more accurately, make it fail faster).
I would like to know what scenarios you tested to convince you that
this is worth doing.
regards, tom lane
It seems the man page of TCP_USER_TIMEOUT does not align with
reality then. When I use it on my local machine it is effectively used
as a connection timeout too. The second command times out after
two seconds:
sudo iptables -A INPUT -p tcp --destination-port 5432 -j DROP
psql 'host=localhost tcp_user_timeout=2000'
The keepalive settings only apply once you get to the recv however. And yes,
it is pretty unlikely for the connection to break right when it is waiting for data.
But it has happened for us. And when it happens it is really bad, because
the process will be blocked forever. Since it is a blocking call.
After investigation when this happened it seemed to be a combination of a few
things making this happen:
1. The way citus uses cancelation requests: A Citus query on the coordinator creates
multiple connections to a worker and with 2PC for distributed transactions. If one
connection receives an error it sends a cancel request for all others.
2. When a machine is under heavy CPU or memory pressure things don't work
well:
i. errors can occur pretty frequently, causing lots of cancels to be sent by Citus.
ii. postmaster can be slow in handling new cancelation requests.
iii. Our failover system can think the node is down, because health checks are
failing.
3. Our failover system effectively cuts the power and the network of the primary
when it triggers a fail over to the secondary
This all together can result in a cancel request being interrupted right at that
wrong moment. And when it happens a distributed query on the Citus
coordinator, becomes blocked forever. We've had queries stuck in this state
for multiple days. The only way to get out of it at that point is either by restarting
postgres or manually closing the blocked socket (either with ss or gdb).
Jelte
Jelte Fennema <Jelte.Fennema@microsoft.com> writes:
It seems the man page of TCP_USER_TIMEOUT does not align with
reality then. When I use it on my local machine it is effectively used
as a connection timeout too.
Huh, I should have thought to try that. I confirm this behavior
on RHEL8 (kernel 4.18.0). Not the first mistake I've seen in
Linux man pages :-(.
Doesn't seem to help on macOS, but AFAICT that platform doesn't
have TCP_USER_TIMEOUT, so no surprise there.
Anyway, that removes my objection, so I'll proceed with committing.
Thanks for working on this patch!
regards, tom lane