psql client does not handle WSAEWOULDBLOCK on Windows

Started by Ningover 1 year ago6 messages
#1Ning
ning94803@gmail.com
1 attachment(s)

*Description:*The connection fails with a non-blocking socket error when
using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled. The error is
because the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling a
non-blocking socket error.

*Steps to Reproduce:*1. Compile PostgreSQL client (psql) on Windows.
a. Make sure MIT Kerberos is installed. I use the latest version MIT
Kerberos
Version 4.1.
b. Make sure GSSAPI is enabled
2. Attempt to connect to a PostgreSQL server using psql.
a. Set up the Kerberos server and configure the PostgreSQL server by
referring
to https://github.com/50wu/kerberos-docker/blob/main/POSTGRES.README.md
b. change the entry to hostgssenc on PostgreSQL server pg_hba.conf and
restart
hostgssenc all all 0.0.0.0/0 gss include_realm=0
krb_realm=GPDB.KRB
c. Use the following command to connect to the database server
psql -h <hostname> -U "postgres/krb5-service-example-com.example.com" -d
postgres
3. The connection fails with a non-blocking socket error. The error is
something like:
psql: error: connection to server at "xxx", port 5432 failed:

*Environment*:
PostgreSQL version: 16.3
Operating System: Windows 11

*Fix Steps:*In the gss_read function of
src/interfaces/libpq/fe-secure-gssapi.c, change the
check of the error code to use the SOCK_ERRNO to make sure that EAGAIN,
EWOULDBLOCK and EINTR can be properly handled on Windows and other
platforms.

The patch file is attached to this email, please review and consider
merging it to
the main code library.

Thanks,
Ning Wu

Attachments:

fix-windows-non-blocking-socket.patchapplication/octet-stream; name=fix-windows-non-blocking-socket.patchDownload
From 12f0b5b302707e30e7796109eade9bfb1a414212 Mon Sep 17 00:00:00 2001
From: Ning Wu <ning94803@gmail.com>
Date: Tue, 30 Jul 2024 11:27:00 +0800
Subject: [PATCH] Fix error handling for non-blocking socket on Windows

The connection fails with a non-blocking socket error when using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled.

On Windows, the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling with
non-blocking socket error.

By using the SOCK_ERRNO macro to ensure that all platforms can get the error
code correctly.

Authored-by: Ning Wu <ning94803@gmail.com>
---
 src/interfaces/libpq/fe-secure-gssapi.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index 98b314613c..8c81efe47b 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -430,7 +430,7 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 	*ret = pqsecure_raw_read(conn, recv_buffer, length);
 	if (*ret < 0)
 	{
-		if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+		if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == EINTR)
 			return PGRES_POLLING_READING;
 		else
 			return PGRES_POLLING_FAILED;
-- 
2.44.0

#2Umar Hayat
postgresql.wizard@gmail.com
In reply to: Ning (#1)
Re: psql client does not handle WSAEWOULDBLOCK on Windows

I have not reproduce your test scenario, looking at code please see following comments:

If you check the function definition of pqsecure_raw_read() it actually do set errno like bellow

SOCK_ERRNO_SET(result_errno);
where result_errno = SOCK_ERRNO

Means anybody using those function pqsecure_raw_read/write, does not need to take care of portable ERRNO.

Regards
Umar Hayat

The new status of this patch is: Waiting on Author

#3Ning
ning94803@gmail.com
In reply to: Ning (#1)
Re: psql client does not handle WSAEWOULDBLOCK on Windows

Hi Umar,

In the function of gss_read() if print the value of errno and SOCK_ERRNO
separately, I found the values are different:
*ret = pqsecure_raw_read(conn, recv_buffer, length);
if (*ret < 0)
{
printf("errno: %d\n", errno);
printf("result_errno: %d\n", SOCK_ERRNO);
...

errno: 0
result_errno: 10035

Also refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock/nf-winsock-wsagetlasterror
,
It shows that the Windows Sockets function does not set errno, but uses
WSAGetLastError to report errors. And refer to the
https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-recv
,
If the function fails, it should call the WSAGetLastError to get the
expansion
error message. This further shows that the socket operation error will not
be
reported through the errno.

So changing the error code check to use the SOCK_ERRNO instead of errno can
be
properly handled on both Windows and other platforms.

To reproduce the issue, I used the following version of Postgres and MIT
Kerberos:
PostgreSQL version: 16.3
MIT Kerberos Version 4.1
Operating System: Windows 11
Visual Studio 2022

On Tue, Jul 30, 2024 at 4:47 PM Ning <ning94803@gmail.com> wrote:

Show quoted text

*Description:*The connection fails with a non-blocking socket error when
using psql on
Windows to connect to a PostgreSQL server with GSSAPI enabled. The error is
because the socket error code is obtained by WSAGetLastError() instead of
errno. This causes the value of errno to be incorrect when handling a
non-blocking socket error.

*Steps to Reproduce:*1. Compile PostgreSQL client (psql) on Windows.
a. Make sure MIT Kerberos is installed. I use the latest version MIT
Kerberos
Version 4.1.
b. Make sure GSSAPI is enabled
2. Attempt to connect to a PostgreSQL server using psql.
a. Set up the Kerberos server and configure the PostgreSQL server by
referring
to https://github.com/50wu/kerberos-docker/blob/main/POSTGRES.README.md
b. change the entry to hostgssenc on PostgreSQL server pg_hba.conf and
restart
hostgssenc all all 0.0.0.0/0 gss include_realm=0
krb_realm=GPDB.KRB
c. Use the following command to connect to the database server
psql -h <hostname> -U "postgres/krb5-service-example-com.example.com" -d
postgres
3. The connection fails with a non-blocking socket error. The error is
something like:
psql: error: connection to server at "xxx", port 5432 failed:

*Environment*:
PostgreSQL version: 16.3
Operating System: Windows 11

*Fix Steps:*In the gss_read function of
src/interfaces/libpq/fe-secure-gssapi.c, change the
check of the error code to use the SOCK_ERRNO to make sure that EAGAIN,
EWOULDBLOCK and EINTR can be properly handled on Windows and other
platforms.

The patch file is attached to this email, please review and consider
merging it to
the main code library.

Thanks,
Ning Wu

#4Michael Paquier
michael@paquier.xyz
In reply to: Ning (#3)
Re: psql client does not handle WSAEWOULDBLOCK on Windows

On Tue, Aug 06, 2024 at 04:01:42PM +0800, Ning wrote:

In the function of gss_read() if print the value of errno and SOCK_ERRNO
separately, I found the values are different:
*ret = pqsecure_raw_read(conn, recv_buffer, length);
if (*ret < 0)
{
printf("errno: %d\n", errno);
printf("result_errno: %d\n", SOCK_ERRNO);
...

     *ret = pqsecure_raw_read(conn, recv_buffer, length);
     if (*ret < 0)
     {
-        if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+        if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK || SOCK_ERRNO == EINTR)

This is going to require some platform-specific check that I don't
have with me, though I am ready to accept that what you are telling
here is true and that we should apply this macro. Could somebody
check that a bit more in depth? Andrew D. perhaps?

One thing that I don't understand is why does this only apply after
the first call of pqsecure_raw_read() in gss_read()? There is a
second call of pqsecure_raw_read() you are not covering but it would
surely need the same treatment, no?

Also, what about pqsecure_raw_write() in pqsecure_open_gss()?
Shouldn't the same check apply?
--
Michael

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#4)
Re: psql client does not handle WSAEWOULDBLOCK on Windows

Michael Paquier <michael@paquier.xyz> writes:

This is going to require some platform-specific check that I don't
have with me, though I am ready to accept that what you are telling
here is true and that we should apply this macro. Could somebody
check that a bit more in depth? Andrew D. perhaps?

One thing that I don't understand is why does this only apply after
the first call of pqsecure_raw_read() in gss_read()? There is a
second call of pqsecure_raw_read() you are not covering but it would
surely need the same treatment, no?

Also, what about pqsecure_raw_write() in pqsecure_open_gss()?
Shouldn't the same check apply?

Yeah, I think we pretty much need to use SOCK_ERRNO, SOCK_ERRNO_SET,
and SOCK_STRERROR (if relevant) throughout fe-secure-gssapi.c.
Directly using errno is correct for syscalls related to the file
system, but I think everything in this file is dealing with
socket-related errors. Certainly the underlying pqsecure_raw_read
and pqsecure_raw_write layer expects those macros to be used.

Like you, I'm not really in a position to test this on Windows ...

regards, tom lane

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#5)
1 attachment(s)
Re: psql client does not handle WSAEWOULDBLOCK on Windows

I wrote:

Michael Paquier <michael@paquier.xyz> writes:

Also, what about pqsecure_raw_write() in pqsecure_open_gss()?
Shouldn't the same check apply?

Yeah, I think we pretty much need to use SOCK_ERRNO, SOCK_ERRNO_SET,
and SOCK_STRERROR (if relevant) throughout fe-secure-gssapi.c.

Since nothing seems to be happening here, I took another look and
decided that the required changes are really pretty straightforward.
fe-secure-gssapi.c doesn't contain any strerror calls, and its
touches of errno all appear to relate to socket errors, so we can
just change them all. As attached.

Like you, I'm not really in a position to test this on Windows ...

I'm still not, but it's straightforward enough that I'm willing to
push on my own authority. I'll just put this up for long enough
to make sure that the cfbot is happy with it --- though I think
it's not building with GSSAPI on Windows, so that may prove little.

regards, tom lane

Attachments:

v2-0001-Use-SOCK_ERRNO-_SET-in-fe-secure-gssapi.c.patchtext/x-diff; charset=us-ascii; name=v2-0001-Use-SOCK_ERRNO-_SET-in-fe-secure-gssapi.c.patchDownload
From 36f3ee942ef12801e52ce3fc82204455345939a6 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sun, 5 Oct 2025 15:05:38 -0400
Subject: [PATCH v2] Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c.

On Windows, this code did not handle error conditions correctly at
all, since it looked at "errno" which is not used for socket-related
errors on that platform.  This resulted, for example, in failure
to connect to a PostgreSQL server with GSSAPI enabled.

We have a convention for dealing with this within libpq, which is to
use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly;
but the GSSAPI code is a relative latecomer and did not get that memo.
(The equivalent backend code continues to use errno, because the
backend does this differently.  Maybe libpq's approach should be
rethought someday.)

Author: Ning Wu <ning94803@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com
Backpatch-through: 13
---
 src/interfaces/libpq/fe-secure-gssapi.c | 27 ++++++++++++++-----------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/src/interfaces/libpq/fe-secure-gssapi.c b/src/interfaces/libpq/fe-secure-gssapi.c
index bc9e1ce06fa..843b31e175f 100644
--- a/src/interfaces/libpq/fe-secure-gssapi.c
+++ b/src/interfaces/libpq/fe-secure-gssapi.c
@@ -121,7 +121,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 	{
 		appendPQExpBufferStr(&conn->errorMessage,
 							 "GSSAPI caller failed to retransmit all data needing to be retried\n");
-		errno = EINVAL;
+		SOCK_ERRNO_SET(EINVAL);
 		return -1;
 	}
 
@@ -199,14 +199,14 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 		if (major != GSS_S_COMPLETE)
 		{
 			pg_GSS_error(libpq_gettext("GSSAPI wrap error"), conn, major, minor);
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			goto cleanup;
 		}
 
 		if (conf_state == 0)
 		{
 			libpq_append_conn_error(conn, "outgoing GSSAPI message would not use confidentiality");
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			goto cleanup;
 		}
 
@@ -215,7 +215,7 @@ pg_GSS_write(PGconn *conn, const void *ptr, size_t len)
 			libpq_append_conn_error(conn, "client tried to send oversize GSSAPI packet (%zu > %zu)",
 									(size_t) output.length,
 									PQ_GSS_MAX_PACKET_SIZE - sizeof(uint32));
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			goto cleanup;
 		}
 
@@ -341,7 +341,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
 			/* If we still haven't got the length, return to the caller */
 			if (PqGSSRecvLength < sizeof(uint32))
 			{
-				errno = EWOULDBLOCK;
+				SOCK_ERRNO_SET(EWOULDBLOCK);
 				return -1;
 			}
 		}
@@ -354,7 +354,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
 			libpq_append_conn_error(conn, "oversize GSSAPI packet sent by the server (%zu > %zu)",
 									(size_t) input.length,
 									PQ_GSS_MAX_PACKET_SIZE - sizeof(uint32));
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			return -1;
 		}
 
@@ -373,7 +373,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
 		/* If we don't yet have the whole packet, return to the caller */
 		if (PqGSSRecvLength - sizeof(uint32) < input.length)
 		{
-			errno = EWOULDBLOCK;
+			SOCK_ERRNO_SET(EWOULDBLOCK);
 			return -1;
 		}
 
@@ -393,7 +393,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
 			pg_GSS_error(libpq_gettext("GSSAPI unwrap error"), conn,
 						 major, minor);
 			ret = -1;
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			goto cleanup;
 		}
 
@@ -401,7 +401,7 @@ pg_GSS_read(PGconn *conn, void *ptr, size_t len)
 		{
 			libpq_append_conn_error(conn, "incoming GSSAPI message did not use confidentiality");
 			ret = -1;
-			errno = EIO;		/* for lack of a better idea */
+			SOCK_ERRNO_SET(EIO);	/* for lack of a better idea */
 			goto cleanup;
 		}
 
@@ -437,7 +437,8 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 	*ret = pqsecure_raw_read(conn, recv_buffer, length);
 	if (*ret < 0)
 	{
-		if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+		if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK ||
+			SOCK_ERRNO == EINTR)
 			return PGRES_POLLING_READING;
 		else
 			return PGRES_POLLING_FAILED;
@@ -457,7 +458,8 @@ gss_read(PGconn *conn, void *recv_buffer, size_t length, ssize_t *ret)
 		*ret = pqsecure_raw_read(conn, recv_buffer, length);
 		if (*ret < 0)
 		{
-			if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+			if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK ||
+				SOCK_ERRNO == EINTR)
 				return PGRES_POLLING_READING;
 			else
 				return PGRES_POLLING_FAILED;
@@ -520,7 +522,8 @@ pqsecure_open_gss(PGconn *conn)
 		ret = pqsecure_raw_write(conn, PqGSSSendBuffer + PqGSSSendNext, amount);
 		if (ret < 0)
 		{
-			if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
+			if (SOCK_ERRNO == EAGAIN || SOCK_ERRNO == EWOULDBLOCK ||
+				SOCK_ERRNO == EINTR)
 				return PGRES_POLLING_WRITING;
 			else
 				return PGRES_POLLING_FAILED;
-- 
2.43.7