RADIUS tests and improvements

Started by Thomas Munroabout 3 years ago9 messages
#1Thomas Munro
thomas.munro@gmail.com
3 attachment(s)

Hi,

Here's a draft patch to tackle a couple of TODOs in the RADIUS code in auth.c.

The first change is to replace select() with a standard latch loop
that responds to interrupts, postmaster death etc promptly. It's not
really too much of a big deal because the timeout was only 3 seconds
(hardcoded), but it's not good to have places that ignore ProcSignal,
and it's good to move code to our modern pattern for I/O multiplexing.

We know from experience that we have to crank timeouts up to be able
to run tests reliably on slow/valgrind/etc systems, so the second
change is to change the timeout to a GUC, as also requested by a
comment. One good side-effect is that it becomes easy and fast to
test the timed-out code path too, with a small value. While adding
the GUC I couldn't help wondering why RADIUS even needs a timeout
separate from authentication_timeout; another way to go here would be
to remove it completely, but that'd be a policy change (removing the 3
second timeout we always had). Thoughts?

The patch looks bigger than it really is because it changes the
indentation level.

But first, some basic tests to show that it works. We can test before
and after the change and have a non-zero level of confidence about
whacking the code around. Like existing similar tests, you need to
install an extra package (FreeRADIUS) and opt in with
PG_EXTRA_TESTS=radius. I figured out how to do that for our 3 CI
Unixen, so cfbot should run the tests and pass once I add this to the
March commitfest. FreeRADIUS claims to work on Windows too, but I
don't know how to set that up; maybe someday someone will fix that for
all the PG_EXTRA_TESTS tests. I've also seen this work on a Mac with
MacPorts. There's only one pathname in there that's a wild guess:
non-Debianoid Linux systems; if you know the answer there please LMK.

Attachments:

0001-Add-simple-test-for-RADIUS-authentication.patchtext/x-patch; charset=US-ASCII; name=0001-Add-simple-test-for-RADIUS-authentication.patchDownload
From 72a63c87b595a31f3d5e68722ca6bc7460190cc0 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 14:41:57 +1300
Subject: [PATCH 1/3] Add simple test for RADIUS authentication.

This is similar to the existing tests for ldap and kerberos.  It
requires FreeRADIUS to be installed, and opens ports that may be
considered insecure, so users have to opt in with PG_EXTRA_TESTS=radius.
---
 src/test/Makefile             |   2 +-
 src/test/meson.build          |   1 +
 src/test/radius/Makefile      |  23 +++++
 src/test/radius/meson.build   |  12 +++
 src/test/radius/t/001_auth.pl | 187 ++++++++++++++++++++++++++++++++++
 5 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 src/test/radius/Makefile
 create mode 100644 src/test/radius/meson.build
 create mode 100644 src/test/radius/t/001_auth.pl

diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..687164412c 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery radius subscription
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/meson.build b/src/test/meson.build
index 5f3c9c2ba2..b5da17b531 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -5,6 +5,7 @@ subdir('isolation')
 
 subdir('authentication')
 subdir('recovery')
+subdir('radius')
 subdir('subscription')
 subdir('modules')
 
diff --git a/src/test/radius/Makefile b/src/test/radius/Makefile
new file mode 100644
index 0000000000..56768a3ca9
--- /dev/null
+++ b/src/test/radius/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/radius
+#
+# Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/radius/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/radius
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/radius/meson.build b/src/test/radius/meson.build
new file mode 100644
index 0000000000..ea7afc4555
--- /dev/null
+++ b/src/test/radius/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'radius',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_auth.pl',
+    ],
+  },
+}
diff --git a/src/test/radius/t/001_auth.pl b/src/test/radius/t/001_auth.pl
new file mode 100644
index 0000000000..44db62a3d7
--- /dev/null
+++ b/src/test/radius/t/001_auth.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Debian: apt-get install freeradius
+# Homebrew: brew install freeradius-server
+# FreeBSD: pkg install freeradius3
+# MacPorts: port install freeradius
+
+use strict;
+use warnings;
+use File::Copy;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+my $radiusd_dir = "${PostgreSQL::Test::Utils::tmp_check}/radiusd_data";
+my $radiusd_conf = "radiusd.conf";
+my $radiusd_users = "users.txt";
+my $radiusd_prefix;
+my $radiusd;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bradius\b/)
+{
+	plan skip_all => 'Potentially unsafe test RADIUS not enabled in PG_TEST_EXTRA';
+}
+elsif ($^O eq 'freebsd')
+{
+	$radiusd         = '/usr/local/sbin/radiusd';
+}
+elsif ($^O eq 'linux' && -f '/usr/sbin/freeradius')
+{
+	$radiusd         = '/usr/sbin/freeradius';
+}
+elsif ($^O eq 'linux')
+{
+	$radiusd         = '/usr/sbin/radiusd';
+}
+elsif ($^O eq 'darwin' && -d '/opt/local')
+{
+	# typical path for MacPorts
+	$radiusd         = '/opt/local/sbin/radiusd';
+	$radiusd_prefix  = '/opt/local';
+}
+elsif ($^O eq 'darwin' && -d '/opt/homebrew')
+{
+	# typical path for Homebrew on ARM
+	$radiusd         = '/opt/homebrew/bin/radiusd';
+	$radiusd_prefix  = '/opt/homebrew';
+}
+elsif ($^O eq 'darwin' && -d '/usr/local')
+{
+	# typical path for Homebrew on Intel
+	$radiusd         = '/usr/local/bin/radiusd';
+	$radiusd_prefix  = '/usr/local';
+}
+else
+{
+	plan skip_all =>
+	  "radius tests not supported on $^O or dependencies not installed";
+}
+
+my $radius_port     = PostgreSQL::Test::Cluster::get_free_port();
+
+note "setting up radiusd";
+
+mkdir $radiusd_dir or die "cannot create $radiusd_dir";
+
+append_to_file(
+	"$radiusd_dir/$radiusd_conf",
+	qq{client default {
+  ipaddr = "127.0.0.1"
+  secret = "secret"
+}
+
+modules {
+  files {
+    filename = "$radiusd_dir/users.txt"
+  }
+  pap {
+  }
+}
+
+server default {
+  listen {
+    type   = "auth"
+    ipv4addr = "127.0.0.1"
+    port = "$radius_port"
+  }
+  authenticate {
+    Auth-Type PAP {
+      pap
+    }
+  }
+  authorize {
+    files
+    pap
+  }
+}
+
+log {
+  destination = "files"
+  localstatedir = "$radiusd_dir"
+  logdir = "$radiusd_dir"
+  file = "$radiusd_dir/radius.log"
+}
+
+pidfile = "$radiusd_dir/radiusd.pid"
+});
+
+# help to find libraries that radiusd dlopens
+if ($radiusd_prefix)
+{
+	append_to_file(
+		"$radiusd_dir/$radiusd_conf",
+		qq{prefix="$radiusd_prefix"\n})
+}
+
+append_to_file(
+	"$radiusd_dir/$radiusd_users",
+	qq{test2 Cleartext-Password := "secret2"});
+
+system_or_bail $radiusd, '-xx', '-d', $radiusd_dir;
+
+END
+{
+	kill 'INT', `cat $radiusd_dir/radiusd.pid` if -f "$radiusd_dir/radiusd.pid";
+}
+
+note "setting up PostgreSQL instance";
+
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init;
+$node->append_conf('postgresql.conf', "log_connections = on\n");
+$node->start;
+
+$node->safe_psql('postgres', 'CREATE USER test1;');
+$node->safe_psql('postgres', 'CREATE USER test2;');
+
+note "running tests";
+
+sub test_access
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+
+	my ($node, $role, $expected_res, $test_name, %params) = @_;
+	my $connstr = "user=$role";
+
+	if ($expected_res eq 0)
+	{
+		$node->connect_ok($connstr, $test_name, %params);
+	}
+	else
+	{
+		# No checks of the error message, only the status code.
+		$node->connect_fails($connstr, $test_name, %params);
+	}
+}
+
+note "enable RADIUS auth";
+
+unlink($node->data_dir . '/pg_hba.conf');
+$node->append_conf('pg_hba.conf',
+	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$radius_port"}
+);
+$node->restart;
+
+note "simple negative and positive tests";
+
+$ENV{"PGPASSWORD"} = 'wrong';
+test_access(
+	$node, 'test1', 2,
+	'authentication fails if user not found in RADIUS',
+	log_unlike => [qr/connection authenticated:/]);
+test_access(
+	$node, 'test2', 2,
+	'authentication fails with wrong password',
+	log_unlike => [qr/connection authenticated:/]);
+
+$ENV{"PGPASSWORD"} = 'secret2';
+test_access(
+	$node, 'test2', 0,
+	'authentication succeeds with right password',
+	log_like => [
+		qr/connection authenticated: identity="test2" method=radius/
+	],);
+
+done_testing();
-- 
2.38.1

0002-ci-Enable-RADIUS-test.patchtext/x-patch; charset=US-ASCII; name=0002-ci-Enable-RADIUS-test.patchDownload
From 81a03f06a11fbafe7f52073d1479e070a94bef45 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 15:08:19 +1300
Subject: [PATCH 2/3] ci: Enable RADIUS test.

XXX Need to get freeradius installed in the CI images for fast startup
---
 .cirrus.yml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index 354102613d..ca8fb50d21 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -25,7 +25,7 @@ env:
   MTEST_ARGS: --print-errorlogs --no-rebuild -C build
   PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
   TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
-  PG_TEST_EXTRA: kerberos ldap ssl
+  PG_TEST_EXTRA: kerberos ldap ssl radius
 
 
 # What files to preserve in case tests fail
@@ -173,6 +173,7 @@ task:
     sysctl kern.corefile='/tmp/cores/%N.%P.core'
   setup_additional_packages_script: |
     #pkg install -y ...
+    pkg install -y freeradius3
 
   # NB: Intentionally build without -Dllvm. The freebsd image size is already
   # large enough to make VM startup slow, and even without llvm freebsd
@@ -278,6 +279,7 @@ task:
     LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
     LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
 
+
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*linux.*'
 
@@ -311,6 +313,8 @@ task:
   setup_additional_packages_script: |
     #apt-get update
     #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
+    apt-get update
+    DEBIAN_FRONTEND=noninteractive apt-get -y install freeradius
 
   matrix:
     - name: Linux - Debian Bullseye - Autoconf
@@ -457,6 +461,7 @@ task:
   setup_additional_packages_script: |
     brew install \
       ccache \
+      freeradius-server \
       icu4c \
       krb5 \
       llvm \
-- 
2.38.1

0003-Use-latch-API-to-wait-for-RADIUS-authentication.patchtext/x-patch; charset=US-ASCII; name=0003-Use-latch-API-to-wait-for-RADIUS-authentication.patchDownload
From 760b516554101d5482265847db3d680accd23cf7 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 23:36:47 +1300
Subject: [PATCH 3/3] Use latch API to wait for RADIUS authentication.

Handle interrupts while waiting for a response from a RADIUS server,
completing a TODO in comments.  Remote clients can't really interrupt
authentication (they don't have a cancel key yet), but it's important to
process other kinds of interrupts promptly.

Since CHECK_FOR_INTERRUPTS() might throw, leaving us with a leaked
socket, use PG_FINALLY() to make sure we clean up.

While here, also convert RADIUS_TIMEOUT to a GUC, another TODO in
comments, so that we can turn it up very high for automated testing on
slow/overloaded computers.

Now that the timeout is adjustable, we can also add a test of the
timeout code path with very short timeout.  Since the RADIUS protocol is
connectionless, we can provoke a timeout by sending a UDP packet to the
wrong port; nobody will write back to us and we will time out.
---
 src/backend/libpq/auth.c                      | 342 +++++++++---------
 src/backend/utils/misc/guc_tables.c           |  11 +
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/libpq/auth.h                      |   1 +
 src/test/radius/t/001_auth.pl                 |  15 +
 5 files changed, 197 insertions(+), 173 deletions(-)

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 25b3a781cd..4511b0f3e6 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -2802,8 +2802,9 @@ typedef struct
 /* RADIUS service types */
 #define RADIUS_AUTHENTICATE_ONLY	8
 
-/* Seconds to wait - XXX: should be in a config variable! */
-#define RADIUS_TIMEOUT 3
+/* RADIUS GUCs */
+
+int			radius_timeout = 3000;
 
 static void
 radius_add_attribute(radius_packet *packet, uint8 type, const unsigned char *data, int len)
@@ -2949,8 +2950,7 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 	struct addrinfo *serveraddrs;
 	int			port;
 	socklen_t	addrsize;
-	fd_set		fdset;
-	struct timeval endtime;
+	TimestampTz endtime;
 	int			i,
 				j,
 				r;
@@ -3053,197 +3053,193 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 		return STATUS_ERROR;
 	}
 
-	memset(&localaddr, 0, sizeof(localaddr));
-	localaddr.sin6_family = serveraddrs[0].ai_family;
-	localaddr.sin6_addr = in6addr_any;
-	if (localaddr.sin6_family == AF_INET6)
-		addrsize = sizeof(struct sockaddr_in6);
-	else
-		addrsize = sizeof(struct sockaddr_in);
-
-	if (bind(sock, (struct sockaddr *) &localaddr, addrsize))
-	{
-		ereport(LOG,
-				(errmsg("could not bind local RADIUS socket: %m")));
-		closesocket(sock);
-		pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
-		return STATUS_ERROR;
-	}
-
-	if (sendto(sock, radius_buffer, packetlength, 0,
-			   serveraddrs[0].ai_addr, serveraddrs[0].ai_addrlen) < 0)
-	{
-		ereport(LOG,
-				(errmsg("could not send RADIUS packet: %m")));
-		closesocket(sock);
-		pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
-		return STATUS_ERROR;
-	}
-
-	/* Don't need the server address anymore */
-	pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
-
-	/*
-	 * Figure out at what time we should time out. We can't just use a single
-	 * call to select() with a timeout, since somebody can be sending invalid
-	 * packets to our port thus causing us to retry in a loop and never time
-	 * out.
-	 *
-	 * XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if
-	 * the latch was set would improve the responsiveness to
-	 * timeouts/cancellations.
-	 */
-	gettimeofday(&endtime, NULL);
-	endtime.tv_sec += RADIUS_TIMEOUT;
-
-	while (true)
+	PG_TRY();
 	{
-		struct timeval timeout;
-		struct timeval now;
-		int64		timeoutval;
-		const char *errstr = NULL;
+		memset(&localaddr, 0, sizeof(localaddr));
+		localaddr.sin6_family = serveraddrs[0].ai_family;
+		localaddr.sin6_addr = in6addr_any;
+		if (localaddr.sin6_family == AF_INET6)
+			addrsize = sizeof(struct sockaddr_in6);
+		else
+			addrsize = sizeof(struct sockaddr_in);
 
-		gettimeofday(&now, NULL);
-		timeoutval = (endtime.tv_sec * 1000000 + endtime.tv_usec) - (now.tv_sec * 1000000 + now.tv_usec);
-		if (timeoutval <= 0)
+		if (bind(sock, (struct sockaddr *) &localaddr, addrsize))
 		{
 			ereport(LOG,
-					(errmsg("timeout waiting for RADIUS response from %s",
-							server)));
-			closesocket(sock);
+					(errmsg("could not bind local RADIUS socket: %m")));
+			pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
 			return STATUS_ERROR;
 		}
-		timeout.tv_sec = timeoutval / 1000000;
-		timeout.tv_usec = timeoutval % 1000000;
-
-		FD_ZERO(&fdset);
-		FD_SET(sock, &fdset);
 
-		r = select(sock + 1, &fdset, NULL, NULL, &timeout);
-		if (r < 0)
-		{
-			if (errno == EINTR)
-				continue;
-
-			/* Anything else is an actual error */
-			ereport(LOG,
-					(errmsg("could not check status on RADIUS socket: %m")));
-			closesocket(sock);
-			return STATUS_ERROR;
-		}
-		if (r == 0)
+		if (sendto(sock, radius_buffer, packetlength, 0,
+				   serveraddrs[0].ai_addr, serveraddrs[0].ai_addrlen) < 0)
 		{
 			ereport(LOG,
-					(errmsg("timeout waiting for RADIUS response from %s",
-							server)));
-			closesocket(sock);
+					(errmsg("could not send RADIUS packet: %m")));
+			pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
 			return STATUS_ERROR;
 		}
 
+		/* Don't need the server address anymore */
+		pg_freeaddrinfo_all(hint.ai_family, serveraddrs);
+
 		/*
-		 * Attempt to read the response packet, and verify the contents.
-		 *
-		 * Any packet that's not actually a RADIUS packet, or otherwise does
-		 * not validate as an explicit reject, is just ignored and we retry
-		 * for another packet (until we reach the timeout). This is to avoid
-		 * the possibility to denial-of-service the login by flooding the
-		 * server with invalid packets on the port that we're expecting the
-		 * RADIUS response on.
+		 * Figure out at what time we should time out. We can't just use a
+		 * single wait with a timeout, since somebody can be sending invalid
+		 * packets to our port thus causing us to retry in a loop and never
+		 * time out.
 		 */
+		endtime = GetCurrentTimestamp() + radius_timeout * 1000;
 
-		addrsize = sizeof(remoteaddr);
-		packetlength = recvfrom(sock, receive_buffer, RADIUS_BUFFER_SIZE, 0,
-								(struct sockaddr *) &remoteaddr, &addrsize);
-		if (packetlength < 0)
+		while (true)
 		{
-			ereport(LOG,
-					(errmsg("could not read RADIUS response: %m")));
-			closesocket(sock);
-			return STATUS_ERROR;
-		}
+			int			timeoutval;
+			const char *errstr = NULL;
 
-		if (remoteaddr.sin6_port != pg_hton16(port))
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s was sent from incorrect port: %d",
-							server, pg_ntoh16(remoteaddr.sin6_port))));
-			continue;
-		}
+			/* Remaining time, rounded up to the nearest millisecond. */
+			timeoutval = ((endtime - GetCurrentTimestamp()) + 999) / 1000;
+			if (timeoutval <= 0)
+			{
+				ereport(LOG,
+						(errmsg("timeout waiting for RADIUS response from %s",
+								server)));
+				return STATUS_ERROR;
+			}
 
-		if (packetlength < RADIUS_HEADER_LENGTH)
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s too short: %d", server, packetlength)));
-			continue;
-		}
+			/*
+			 * No point in supplying a wait_event value as we don't have a
+			 * row in pg_stat_activity yet.
+			 */
+			r = WaitLatchOrSocket(MyLatch,
+								  WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH |
+								  WL_LATCH_SET | WL_TIMEOUT,
+								  sock,
+								  timeoutval,
+								  0);
+			if (r & WL_LATCH_SET)
+			{
+				ResetLatch(MyLatch);
+				CHECK_FOR_INTERRUPTS();
+				continue;
+			}
+			else if (r & WL_TIMEOUT)
+			{
+				ereport(LOG,
+						(errmsg("timeout waiting for RADIUS response from %s",
+								server)));
+				return STATUS_ERROR;
+			}
+			else
+			{
+				Assert(r & WL_SOCKET_READABLE);
+			}
 
-		if (packetlength != pg_ntoh16(receivepacket->length))
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s has corrupt length: %d (actual length %d)",
-							server, pg_ntoh16(receivepacket->length), packetlength)));
-			continue;
-		}
+			/*
+			 * Attempt to read the response packet, and verify the contents.
+			 *
+			 * Any packet that's not actually a RADIUS packet, or otherwise
+			 * does not validate as an explicit reject, is just ignored and we
+			 * retry for another packet (until we reach the timeout). This is
+			 * to avoid the possibility to denial-of-service the login by
+			 * flooding the server with invalid packets on the port that we're
+			 * expecting the RADIUS response on.
+			 */
 
-		if (packet->id != receivepacket->id)
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s is to a different request: %d (should be %d)",
-							server, receivepacket->id, packet->id)));
-			continue;
-		}
+			addrsize = sizeof(remoteaddr);
+			packetlength = recvfrom(sock, receive_buffer, RADIUS_BUFFER_SIZE, 0,
+									(struct sockaddr *) &remoteaddr, &addrsize);
+			if (packetlength < 0)
+			{
+				ereport(LOG,
+						(errmsg("could not read RADIUS response: %m")));
+				return STATUS_ERROR;
+			}
 
-		/*
-		 * Verify the response authenticator, which is calculated as
-		 * MD5(Code+ID+Length+RequestAuthenticator+Attributes+Secret)
-		 */
-		cryptvector = palloc(packetlength + strlen(secret));
-
-		memcpy(cryptvector, receivepacket, 4);	/* code+id+length */
-		memcpy(cryptvector + 4, packet->vector, RADIUS_VECTOR_LENGTH);	/* request
-																		 * authenticator, from
-																		 * original packet */
-		if (packetlength > RADIUS_HEADER_LENGTH)	/* there may be no
-													 * attributes at all */
-			memcpy(cryptvector + RADIUS_HEADER_LENGTH, receive_buffer + RADIUS_HEADER_LENGTH, packetlength - RADIUS_HEADER_LENGTH);
-		memcpy(cryptvector + packetlength, secret, strlen(secret));
-
-		if (!pg_md5_binary(cryptvector,
-						   packetlength + strlen(secret),
-						   encryptedpassword, &errstr))
-		{
-			ereport(LOG,
-					(errmsg("could not perform MD5 encryption of received packet: %s",
-							errstr)));
+			if (remoteaddr.sin6_port != pg_hton16(port))
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s was sent from incorrect port: %d",
+								server, pg_ntoh16(remoteaddr.sin6_port))));
+				continue;
+			}
+
+			if (packetlength < RADIUS_HEADER_LENGTH)
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s too short: %d", server, packetlength)));
+				continue;
+			}
+
+			if (packetlength != pg_ntoh16(receivepacket->length))
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s has corrupt length: %d (actual length %d)",
+								server, pg_ntoh16(receivepacket->length), packetlength)));
+				continue;
+			}
+
+			if (packet->id != receivepacket->id)
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s is to a different request: %d (should be %d)",
+								server, receivepacket->id, packet->id)));
+				continue;
+			}
+
+			/*
+			 * Verify the response authenticator, which is calculated as
+			 * MD5(Code+ID+Length+RequestAuthenticator+Attributes+Secret)
+			 */
+			cryptvector = palloc(packetlength + strlen(secret));
+
+			memcpy(cryptvector, receivepacket, 4);	/* code+id+length */
+
+			/* request * authenticator, from  original packet */
+			memcpy(cryptvector + 4, packet->vector, RADIUS_VECTOR_LENGTH);
+
+			if (packetlength > RADIUS_HEADER_LENGTH)	/* there may be no
+														 * attributes at all */
+				memcpy(cryptvector + RADIUS_HEADER_LENGTH, receive_buffer + RADIUS_HEADER_LENGTH, packetlength - RADIUS_HEADER_LENGTH);
+			memcpy(cryptvector + packetlength, secret, strlen(secret));
+
+			if (!pg_md5_binary(cryptvector,
+							   packetlength + strlen(secret),
+							   encryptedpassword, &errstr))
+			{
+				ereport(LOG,
+						(errmsg("could not perform MD5 encryption of received packet: %s",
+								errstr)));
+				pfree(cryptvector);
+				continue;
+			}
 			pfree(cryptvector);
-			continue;
-		}
-		pfree(cryptvector);
 
-		if (memcmp(receivepacket->vector, encryptedpassword, RADIUS_VECTOR_LENGTH) != 0)
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s has incorrect MD5 signature",
-							server)));
-			continue;
-		}
+			if (memcmp(receivepacket->vector, encryptedpassword, RADIUS_VECTOR_LENGTH) != 0)
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s has incorrect MD5 signature",
+								server)));
+				continue;
+			}
 
-		if (receivepacket->code == RADIUS_ACCESS_ACCEPT)
-		{
-			closesocket(sock);
-			return STATUS_OK;
-		}
-		else if (receivepacket->code == RADIUS_ACCESS_REJECT)
-		{
-			closesocket(sock);
-			return STATUS_EOF;
-		}
-		else
-		{
-			ereport(LOG,
-					(errmsg("RADIUS response from %s has invalid code (%d) for user \"%s\"",
-							server, receivepacket->code, user_name)));
-			continue;
-		}
-	}							/* while (true) */
+			if (receivepacket->code == RADIUS_ACCESS_ACCEPT)
+				return STATUS_OK;
+			else if (receivepacket->code == RADIUS_ACCESS_REJECT)
+				return STATUS_EOF;
+			else
+			{
+				ereport(LOG,
+						(errmsg("RADIUS response from %s has invalid code (%d) for user \"%s\"",
+								server, receivepacket->code, user_name)));
+				continue;
+			}
+		}						/* while (true) */
+	}
+	PG_FINALLY();
+	{
+		closesocket(sock);
+	}
+	PG_END_TRY();
+
+	pg_unreachable();
 }
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 68328b1402..17bbe9d940 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -3387,6 +3387,17 @@ struct config_int ConfigureNamesInt[] =
 		NULL, assign_tcp_user_timeout, show_tcp_user_timeout
 	},
 
+	{
+		{"radius_timeout", PGC_SIGHUP, CONN_AUTH_AUTH,
+			gettext_noop("RADIUS authentication timeout."),
+			NULL,
+			GUC_UNIT_MS
+		},
+		&radius_timeout,
+		3000, 1, INT_MAX,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"huge_page_size", PGC_POSTMASTER, RESOURCES_MEM,
 			gettext_noop("The size of huge page that should be requested."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 5afdeb04de..3f95c52f93 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -95,6 +95,7 @@
 #authentication_timeout = 1min		# 1s-600s
 #password_encryption = scram-sha-256	# scram-sha-256 or md5
 #db_user_namespace = off
+#radius_timeout = 3s
 
 # GSSAPI using Kerberos
 #krb_server_keyfile = 'FILE:${sysconfdir}/krb5.keytab'
diff --git a/src/include/libpq/auth.h b/src/include/libpq/auth.h
index 137bee7c45..86d246b1f8 100644
--- a/src/include/libpq/auth.h
+++ b/src/include/libpq/auth.h
@@ -19,6 +19,7 @@
 extern PGDLLIMPORT char *pg_krb_server_keyfile;
 extern PGDLLIMPORT bool pg_krb_caseins_users;
 extern PGDLLIMPORT char *pg_krb_realm;
+extern PGDLLIMPORT int radius_timeout;
 
 extern void ClientAuthentication(Port *port);
 extern void sendAuthRequest(Port *port, AuthRequest areq, const char *extradata,
diff --git a/src/test/radius/t/001_auth.pl b/src/test/radius/t/001_auth.pl
index 44db62a3d7..ebfd0d6301 100644
--- a/src/test/radius/t/001_auth.pl
+++ b/src/test/radius/t/001_auth.pl
@@ -60,6 +60,7 @@ else
 }
 
 my $radius_port     = PostgreSQL::Test::Cluster::get_free_port();
+my $not_radius_port = PostgreSQL::Test::Cluster::get_free_port();
 
 note "setting up radiusd";
 
@@ -131,6 +132,7 @@ note "setting up PostgreSQL instance";
 my $node = PostgreSQL::Test::Cluster->new('node');
 $node->init;
 $node->append_conf('postgresql.conf', "log_connections = on\n");
+$node->append_conf('postgresql.conf', "radius_timeout = '${PostgreSQL::Test::Utils::timeout_default}s'\n");
 $node->start;
 
 $node->safe_psql('postgres', 'CREATE USER test1;');
@@ -184,4 +186,17 @@ test_access(
 		qr/connection authenticated: identity="test2" method=radius/
 	],);
 
+# Set the timeout very short and point to a non-existent radius server
+$node->append_conf('postgresql.conf', "radius_timeout = '2ms'\n");
+unlink($node->data_dir . '/pg_hba.conf');
+$node->append_conf('pg_hba.conf',
+	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$not_radius_port"}
+);
+$node->restart;
+
+test_access(
+	$node, 'test2', 2,
+	'authentication fails with timeout',
+	log_like => [qr/timeout waiting for RADIUS response/]);
+
 done_testing();
-- 
2.38.1

#2Andreas Karlsson
andreas@proxel.se
In reply to: Thomas Munro (#1)
Re: RADIUS tests and improvements

On 1/3/23 04:11, Thomas Munro wrote:

Here's a draft patch to tackle a couple of TODOs in the RADIUS code in auth.c.

Nice to see someone working on this! I know of one company which could
have used the configurable timeout for radius because the 3 second
timeout is too short for 2FA. I think they ended up using PAM or some
other solution in the end, but I am not 100% sure.

[...] While adding
the GUC I couldn't help wondering why RADIUS even needs a timeout
separate from authentication_timeout; another way to go here would be
to remove it completely, but that'd be a policy change (removing the 3
second timeout we always had). Thoughts?

It was some time since I last looked at the code but my impression was
that the reason for having a separate timeout is that you can try the
next server after the first one timed out (multiple radius servers are
allowed). But I wonder if that really is a useful feature or if someone
just was too clever or it just was an accidental feature.

Andreas

#3Andreas Karlsson
andreas@proxel.se
In reply to: Andreas Karlsson (#2)
Re: RADIUS tests and improvements

On 1/3/23 22:03, Andreas Karlsson wrote:

On 1/3/23 04:11, Thomas Munro wrote:

Here's a draft patch to tackle a couple of TODOs in the RADIUS code in
auth.c.

Nice to see someone working on this!.

Another thing: shouldn't we set some wait event to indicate that we are
waiting the RADIUS server or is that pointless during authentication
since there are no queries running anyway?

Andreas

#4Thomas Munro
thomas.munro@gmail.com
In reply to: Andreas Karlsson (#3)
Re: RADIUS tests and improvements

On Wed, Jan 4, 2023 at 10:07 AM Andreas Karlsson <andreas@proxel.se> wrote:

Another thing: shouldn't we set some wait event to indicate that we are
waiting the RADIUS server or is that pointless during authentication
since there are no queries running anyway?

I initially added a wait_event value, but I couldn't see it
anywhere... there is no entry in pg_stat_activity for a backend that
is in that phase of authentication, so I just set it to zero.

#5Thomas Munro
thomas.munro@gmail.com
In reply to: Andreas Karlsson (#2)
Re: RADIUS tests and improvements

On Wed, Jan 4, 2023 at 10:03 AM Andreas Karlsson <andreas@proxel.se> wrote:

On 1/3/23 04:11, Thomas Munro wrote:

[...] While adding
the GUC I couldn't help wondering why RADIUS even needs a timeout
separate from authentication_timeout; another way to go here would be
to remove it completely, but that'd be a policy change (removing the 3
second timeout we always had). Thoughts?

It was some time since I last looked at the code but my impression was
that the reason for having a separate timeout is that you can try the
next server after the first one timed out (multiple radius servers are
allowed). But I wonder if that really is a useful feature or if someone
just was too clever or it just was an accidental feature.

Ah! Thanks, now that makes sense.

#6Andreas Karlsson
andreas@proxel.se
In reply to: Thomas Munro (#4)
Re: RADIUS tests and improvements

On 1/3/23 22:16, Thomas Munro wrote:

On Wed, Jan 4, 2023 at 10:07 AM Andreas Karlsson <andreas@proxel.se> wrote:

Another thing: shouldn't we set some wait event to indicate that we are
waiting the RADIUS server or is that pointless during authentication
since there are no queries running anyway?

I initially added a wait_event value, but I couldn't see it
anywhere... there is no entry in pg_stat_activity for a backend that
is in that phase of authentication, so I just set it to zero.

Thanks for the explanation, that makes a lot of sense!

Andreas

#7Thomas Munro
thomas.munro@gmail.com
In reply to: Andreas Karlsson (#6)
3 attachment(s)
Re: RADIUS tests and improvements

New improved version:

* fixed stupid misuse of PG_FINALLY() (oops, must have been thinking
of another language)
* realised that it was strange to have a GUC for the timeout, and made
a new HBA parameter instead
* added documentation for that
* used TimestampDifferenceMilliseconds() instead of open-coded TimestampTz maths

I don't exactly love the PG_TRY()/PG_CATCH() around the
CHECK_FOR_INTERRUPTS(). In fact this kind of CFI-with-cleanup problem
has been haunting me across several projects. For cases that memory
contexts and resource owners can't help with, I don't currently know
what else to do here. Better ideas welcome. If I just let that
socket leak because I know this backend will soon exit, I'd expect a
knock at the door from the programming police.

I don't actually know why we have
src/test/authentication/t/...{password,sasl,peer}..., but then
src/test/{kerberos,ldap,ssl}/t/001_auth.pl. For this one, I just
copied the second style, creating src/test/radius/t/001_auth.pl. I
can't explain why it should be like that, though. If I propose
another test for PAM, where should it go?

Attachments:

v2-0001-Add-simple-test-for-RADIUS-authentication.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Add-simple-test-for-RADIUS-authentication.patchDownload
From d7f57dfe4a316c8ac1270a5f35f837447e335153 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 14:41:57 +1300
Subject: [PATCH v2 1/3] Add simple test for RADIUS authentication.

This is similar to the existing tests for ldap and kerberos.  It
requires FreeRADIUS to be installed, and opens ports that may be
considered insecure, so users have to opt in with PG_EXTRA_TESTS=radius.

Discussion: https://postgr.es/m/CA%2BhUKGKxNoVjkMCksnj6z3BwiS3y2v6LN6z7_CisLK%2Brv%2B0V4g%40mail.gmail.com
---
 src/test/Makefile             |   2 +-
 src/test/meson.build          |   1 +
 src/test/radius/Makefile      |  23 +++++
 src/test/radius/meson.build   |  12 +++
 src/test/radius/t/001_auth.pl | 187 ++++++++++++++++++++++++++++++++++
 5 files changed, 224 insertions(+), 1 deletion(-)
 create mode 100644 src/test/radius/Makefile
 create mode 100644 src/test/radius/meson.build
 create mode 100644 src/test/radius/t/001_auth.pl

diff --git a/src/test/Makefile b/src/test/Makefile
index dbd3192874..687164412c 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -12,7 +12,7 @@ subdir = src/test
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 
-SUBDIRS = perl regress isolation modules authentication recovery subscription
+SUBDIRS = perl regress isolation modules authentication recovery radius subscription
 
 ifeq ($(with_icu),yes)
 SUBDIRS += icu
diff --git a/src/test/meson.build b/src/test/meson.build
index 5f3c9c2ba2..b5da17b531 100644
--- a/src/test/meson.build
+++ b/src/test/meson.build
@@ -5,6 +5,7 @@ subdir('isolation')
 
 subdir('authentication')
 subdir('recovery')
+subdir('radius')
 subdir('subscription')
 subdir('modules')
 
diff --git a/src/test/radius/Makefile b/src/test/radius/Makefile
new file mode 100644
index 0000000000..56768a3ca9
--- /dev/null
+++ b/src/test/radius/Makefile
@@ -0,0 +1,23 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/radius
+#
+# Portions Copyright (c) 1996-2022, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/radius/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/radius
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+check:
+	$(prove_check)
+
+installcheck:
+	$(prove_installcheck)
+
+clean distclean maintainer-clean:
+	rm -rf tmp_check
diff --git a/src/test/radius/meson.build b/src/test/radius/meson.build
new file mode 100644
index 0000000000..ea7afc4555
--- /dev/null
+++ b/src/test/radius/meson.build
@@ -0,0 +1,12 @@
+# Copyright (c) 2022, PostgreSQL Global Development Group
+
+tests += {
+  'name': 'radius',
+  'sd': meson.current_source_dir(),
+  'bd': meson.current_build_dir(),
+  'tap': {
+    'tests': [
+      't/001_auth.pl',
+    ],
+  },
+}
diff --git a/src/test/radius/t/001_auth.pl b/src/test/radius/t/001_auth.pl
new file mode 100644
index 0000000000..44db62a3d7
--- /dev/null
+++ b/src/test/radius/t/001_auth.pl
@@ -0,0 +1,187 @@
+
+# Copyright (c) 2021-2022, PostgreSQL Global Development Group
+
+# Debian: apt-get install freeradius
+# Homebrew: brew install freeradius-server
+# FreeBSD: pkg install freeradius3
+# MacPorts: port install freeradius
+
+use strict;
+use warnings;
+use File::Copy;
+use PostgreSQL::Test::Utils;
+use PostgreSQL::Test::Cluster;
+use Test::More;
+
+my $radiusd_dir = "${PostgreSQL::Test::Utils::tmp_check}/radiusd_data";
+my $radiusd_conf = "radiusd.conf";
+my $radiusd_users = "users.txt";
+my $radiusd_prefix;
+my $radiusd;
+
+if ($ENV{PG_TEST_EXTRA} !~ /\bradius\b/)
+{
+	plan skip_all => 'Potentially unsafe test RADIUS not enabled in PG_TEST_EXTRA';
+}
+elsif ($^O eq 'freebsd')
+{
+	$radiusd         = '/usr/local/sbin/radiusd';
+}
+elsif ($^O eq 'linux' && -f '/usr/sbin/freeradius')
+{
+	$radiusd         = '/usr/sbin/freeradius';
+}
+elsif ($^O eq 'linux')
+{
+	$radiusd         = '/usr/sbin/radiusd';
+}
+elsif ($^O eq 'darwin' && -d '/opt/local')
+{
+	# typical path for MacPorts
+	$radiusd         = '/opt/local/sbin/radiusd';
+	$radiusd_prefix  = '/opt/local';
+}
+elsif ($^O eq 'darwin' && -d '/opt/homebrew')
+{
+	# typical path for Homebrew on ARM
+	$radiusd         = '/opt/homebrew/bin/radiusd';
+	$radiusd_prefix  = '/opt/homebrew';
+}
+elsif ($^O eq 'darwin' && -d '/usr/local')
+{
+	# typical path for Homebrew on Intel
+	$radiusd         = '/usr/local/bin/radiusd';
+	$radiusd_prefix  = '/usr/local';
+}
+else
+{
+	plan skip_all =>
+	  "radius tests not supported on $^O or dependencies not installed";
+}
+
+my $radius_port     = PostgreSQL::Test::Cluster::get_free_port();
+
+note "setting up radiusd";
+
+mkdir $radiusd_dir or die "cannot create $radiusd_dir";
+
+append_to_file(
+	"$radiusd_dir/$radiusd_conf",
+	qq{client default {
+  ipaddr = "127.0.0.1"
+  secret = "secret"
+}
+
+modules {
+  files {
+    filename = "$radiusd_dir/users.txt"
+  }
+  pap {
+  }
+}
+
+server default {
+  listen {
+    type   = "auth"
+    ipv4addr = "127.0.0.1"
+    port = "$radius_port"
+  }
+  authenticate {
+    Auth-Type PAP {
+      pap
+    }
+  }
+  authorize {
+    files
+    pap
+  }
+}
+
+log {
+  destination = "files"
+  localstatedir = "$radiusd_dir"
+  logdir = "$radiusd_dir"
+  file = "$radiusd_dir/radius.log"
+}
+
+pidfile = "$radiusd_dir/radiusd.pid"
+});
+
+# help to find libraries that radiusd dlopens
+if ($radiusd_prefix)
+{
+	append_to_file(
+		"$radiusd_dir/$radiusd_conf",
+		qq{prefix="$radiusd_prefix"\n})
+}
+
+append_to_file(
+	"$radiusd_dir/$radiusd_users",
+	qq{test2 Cleartext-Password := "secret2"});
+
+system_or_bail $radiusd, '-xx', '-d', $radiusd_dir;
+
+END
+{
+	kill 'INT', `cat $radiusd_dir/radiusd.pid` if -f "$radiusd_dir/radiusd.pid";
+}
+
+note "setting up PostgreSQL instance";
+
+my $node = PostgreSQL::Test::Cluster->new('node');
+$node->init;
+$node->append_conf('postgresql.conf', "log_connections = on\n");
+$node->start;
+
+$node->safe_psql('postgres', 'CREATE USER test1;');
+$node->safe_psql('postgres', 'CREATE USER test2;');
+
+note "running tests";
+
+sub test_access
+{
+	local $Test::Builder::Level = $Test::Builder::Level + 1;
+
+	my ($node, $role, $expected_res, $test_name, %params) = @_;
+	my $connstr = "user=$role";
+
+	if ($expected_res eq 0)
+	{
+		$node->connect_ok($connstr, $test_name, %params);
+	}
+	else
+	{
+		# No checks of the error message, only the status code.
+		$node->connect_fails($connstr, $test_name, %params);
+	}
+}
+
+note "enable RADIUS auth";
+
+unlink($node->data_dir . '/pg_hba.conf');
+$node->append_conf('pg_hba.conf',
+	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$radius_port"}
+);
+$node->restart;
+
+note "simple negative and positive tests";
+
+$ENV{"PGPASSWORD"} = 'wrong';
+test_access(
+	$node, 'test1', 2,
+	'authentication fails if user not found in RADIUS',
+	log_unlike => [qr/connection authenticated:/]);
+test_access(
+	$node, 'test2', 2,
+	'authentication fails with wrong password',
+	log_unlike => [qr/connection authenticated:/]);
+
+$ENV{"PGPASSWORD"} = 'secret2';
+test_access(
+	$node, 'test2', 0,
+	'authentication succeeds with right password',
+	log_like => [
+		qr/connection authenticated: identity="test2" method=radius/
+	],);
+
+done_testing();
-- 
2.39.2

v2-0002-ci-Enable-RADIUS-test.patchtext/x-patch; charset=US-ASCII; name=v2-0002-ci-Enable-RADIUS-test.patchDownload
From 6804f63506dfb3c8473dfd0f4e19183c4405a188 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 15:08:19 +1300
Subject: [PATCH v2 2/3] ci: Enable RADIUS test.

XXX Need to get freeradius installed in the CI images for fast startup
before committing this.

Discussion: https://postgr.es/m/CA%2BhUKGKxNoVjkMCksnj6z3BwiS3y2v6LN6z7_CisLK%2Brv%2B0V4g%40mail.gmail.com
---
 .cirrus.yml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/.cirrus.yml b/.cirrus.yml
index f212978752..f6e19fcc7f 100644
--- a/.cirrus.yml
+++ b/.cirrus.yml
@@ -25,7 +25,7 @@ env:
   MTEST_ARGS: --print-errorlogs --no-rebuild -C build
   PGCTLTIMEOUT: 120 # avoids spurious failures during parallel tests
   TEMP_CONFIG: ${CIRRUS_WORKING_DIR}/src/tools/ci/pg_ci_base.conf
-  PG_TEST_EXTRA: kerberos ldap ssl
+  PG_TEST_EXTRA: kerberos ldap ssl radius
 
 
 # What files to preserve in case tests fail
@@ -173,6 +173,7 @@ task:
     sysctl kern.corefile='/tmp/cores/%N.%P.core'
   setup_additional_packages_script: |
     #pkg install -y ...
+    pkg install -y freeradius3
 
   # NB: Intentionally build without -Dllvm. The freebsd image size is already
   # large enough to make VM startup slow, and even without llvm freebsd
@@ -284,6 +285,7 @@ task:
     LINUX_CONFIGURE_FEATURES: *LINUX_CONFIGURE_FEATURES
     LINUX_MESON_FEATURES: *LINUX_MESON_FEATURES
 
+
   depends_on: SanityCheck
   only_if: $CIRRUS_CHANGE_MESSAGE !=~ '.*\nci-os-only:.*' || $CIRRUS_CHANGE_MESSAGE =~ '.*\nci-os-only:[^\n]*linux.*'
 
@@ -317,6 +319,8 @@ task:
   setup_additional_packages_script: |
     #apt-get update
     #DEBIAN_FRONTEND=noninteractive apt-get -y install ...
+    apt-get update
+    DEBIAN_FRONTEND=noninteractive apt-get -y install freeradius
 
   matrix:
     - name: Linux - Debian Bullseye - Autoconf
@@ -463,6 +467,7 @@ task:
   setup_additional_packages_script: |
     brew install \
       ccache \
+      freeradius-server \
       icu4c \
       krb5 \
       llvm \
-- 
2.39.2

v2-0003-Use-latch-API-to-wait-for-RADIUS-authentication.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Use-latch-API-to-wait-for-RADIUS-authentication.patchDownload
From 60a3427507929dd6d29359d3a89633d23d213181 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Sat, 31 Dec 2022 23:36:47 +1300
Subject: [PATCH v2 3/3] Use latch API to wait for RADIUS authentication.

Handle interrupts while waiting for a response from a RADIUS server,
completing a TODO in comments.  Remote clients can't really interrupt
authentication (they don't have a cancel key yet), but it's important to
process other kinds of interrupts promptly.

While here, also convert RADIUS_TIMEOUT to a configurable parameter,
another TODO that was left in comments, so that we can turn it up to
our standard very high 180s wait for automated testing on
slow/overloaded/valgrind build farm animals.

Now that the timeout is adjustable, we can also add a test of the
timeout code path with very short timeout.  Since the RADIUS protocol is
connectionless, we can provoke a timeout by sending a UDP packet to the
wrong port; nobody will write back to us and we will time out.

Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/CA%2BhUKGKxNoVjkMCksnj6z3BwiS3y2v6LN6z7_CisLK%2Brv%2B0V4g%40mail.gmail.com
---
 doc/src/sgml/client-auth.sgml | 12 +++++
 src/backend/libpq/auth.c      | 94 +++++++++++++++++++++--------------
 src/backend/libpq/hba.c       | 15 ++++++
 src/include/libpq/hba.h       |  1 +
 src/test/radius/t/001_auth.pl | 16 +++++-
 5 files changed, 100 insertions(+), 38 deletions(-)

diff --git a/doc/src/sgml/client-auth.sgml b/doc/src/sgml/client-auth.sgml
index b9d73deced..66c29b0de0 100644
--- a/doc/src/sgml/client-auth.sgml
+++ b/doc/src/sgml/client-auth.sgml
@@ -2147,6 +2147,18 @@ host ... ldap ldapbasedn="dc=example,dc=net"
        </listitem>
       </varlistentry>
 
+      <varlistentry>
+       <term><literal>radiustimeout</literal></term>
+       <listitem>
+        <para>
+         The time, in milliseconds, to wait for a response from a RADIUS
+         server before trying the next one in the list.  The default time,
+         if not specified, is 3000 milliseconds (3 seconds), which may not
+         be enough for some systems.
+        </para>
+       </listitem>
+      </varlistentry>
+
      </variablelist>
    </para>
 
diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 25b3a781cd..10c87414ce 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -195,7 +195,7 @@ static int	pg_SSPI_make_upn(char *accountname,
  *----------------------------------------------------------------
  */
 static int	CheckRADIUSAuth(Port *port);
-static int	PerformRadiusTransaction(const char *server, const char *secret, const char *portstr, const char *identifier, const char *user_name, const char *passwd);
+static int	PerformRadiusTransaction(const char *server, const char *secret, const char *portstr, const char *identifier, const char *user_name, const char *passwd, int timeout);
 
 
 /*
@@ -2802,9 +2802,6 @@ typedef struct
 /* RADIUS service types */
 #define RADIUS_AUTHENTICATE_ONLY	8
 
-/* Seconds to wait - XXX: should be in a config variable! */
-#define RADIUS_TIMEOUT 3
-
 static void
 radius_add_attribute(radius_packet *packet, uint8 type, const unsigned char *data, int len)
 {
@@ -2839,6 +2836,7 @@ CheckRADIUSAuth(Port *port)
 			   *secrets,
 			   *radiusports,
 			   *identifiers;
+	int			timeout;
 
 	/* Make sure struct alignment is correct */
 	Assert(offsetof(radius_packet, vector) == 4);
@@ -2858,6 +2856,10 @@ CheckRADIUSAuth(Port *port)
 		return STATUS_ERROR;
 	}
 
+	timeout = port->hba->radiustimeout;
+	if (timeout == 0)
+		timeout = 3000;			/* default to 3 seconds */
+
 	/* Send regular password request to client, and get the response */
 	sendAuthRequest(port, AUTH_REQ_PASSWORD, NULL, 0);
 
@@ -2886,7 +2888,8 @@ CheckRADIUSAuth(Port *port)
 												   radiusports ? lfirst(radiusports) : NULL,
 												   identifiers ? lfirst(identifiers) : NULL,
 												   port->user_name,
-												   passwd);
+												   passwd,
+												   timeout);
 
 		/*------
 		 * STATUS_OK = Login OK
@@ -2927,7 +2930,13 @@ CheckRADIUSAuth(Port *port)
 }
 
 static int
-PerformRadiusTransaction(const char *server, const char *secret, const char *portstr, const char *identifier, const char *user_name, const char *passwd)
+PerformRadiusTransaction(const char *server,
+						 const char *secret,
+						 const char *portstr,
+						 const char *identifier,
+						 const char *user_name,
+						 const char *passwd,
+						 int timeout)
 {
 	radius_packet radius_send_pack;
 	radius_packet radius_recv_pack;
@@ -2949,8 +2958,7 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 	struct addrinfo *serveraddrs;
 	int			port;
 	socklen_t	addrsize;
-	fd_set		fdset;
-	struct timeval endtime;
+	TimestampTz endtime;
 	int			i,
 				j,
 				r;
@@ -3085,26 +3093,18 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 
 	/*
 	 * Figure out at what time we should time out. We can't just use a single
-	 * call to select() with a timeout, since somebody can be sending invalid
-	 * packets to our port thus causing us to retry in a loop and never time
-	 * out.
-	 *
-	 * XXX: Using WaitLatchOrSocket() and doing a CHECK_FOR_INTERRUPTS() if
-	 * the latch was set would improve the responsiveness to
-	 * timeouts/cancellations.
+	 * wait with a timeout, since somebody can be sending invalid packets to
+	 * our port thus causing us to retry in a loop and never time out.
 	 */
-	gettimeofday(&endtime, NULL);
-	endtime.tv_sec += RADIUS_TIMEOUT;
+	endtime = TimestampTzPlusMilliseconds(GetCurrentTimestamp(), timeout);
 
 	while (true)
 	{
-		struct timeval timeout;
-		struct timeval now;
-		int64		timeoutval;
+		int			timeoutval;
 		const char *errstr = NULL;
 
-		gettimeofday(&now, NULL);
-		timeoutval = (endtime.tv_sec * 1000000 + endtime.tv_usec) - (now.tv_sec * 1000000 + now.tv_usec);
+		/* Remaining time, rounded up to the nearest millisecond. */
+		timeoutval = TimestampDifferenceMilliseconds(GetCurrentTimestamp(), endtime);
 		if (timeoutval <= 0)
 		{
 			ereport(LOG,
@@ -3113,25 +3113,39 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 			closesocket(sock);
 			return STATUS_ERROR;
 		}
-		timeout.tv_sec = timeoutval / 1000000;
-		timeout.tv_usec = timeoutval % 1000000;
 
-		FD_ZERO(&fdset);
-		FD_SET(sock, &fdset);
-
-		r = select(sock + 1, &fdset, NULL, NULL, &timeout);
-		if (r < 0)
+		/*
+		 * No point in supplying a wait_event value as we don't have a row in
+		 * pg_stat_activity yet.
+		 */
+		r = WaitLatchOrSocket(MyLatch,
+							  WL_SOCKET_READABLE | WL_EXIT_ON_PM_DEATH |
+							  WL_LATCH_SET | WL_TIMEOUT,
+							  sock,
+							  timeoutval,
+							  0);
+		if (r & WL_LATCH_SET)
 		{
-			if (errno == EINTR)
-				continue;
+			ResetLatch(MyLatch);
 
-			/* Anything else is an actual error */
-			ereport(LOG,
-					(errmsg("could not check status on RADIUS socket: %m")));
-			closesocket(sock);
-			return STATUS_ERROR;
+			/* Process interrupts, making sure to close the socket if we throw. */
+			if (INTERRUPTS_PENDING_CONDITION())
+			{
+				volatile pgsocket vsock = sock;
+				PG_TRY();
+				{
+					CHECK_FOR_INTERRUPTS();
+				}
+				PG_CATCH();
+				{
+					closesocket(vsock);
+					PG_RE_THROW();
+				}
+				PG_END_TRY();
+			}
+			continue;
 		}
-		if (r == 0)
+		else if (r & WL_TIMEOUT)
 		{
 			ereport(LOG,
 					(errmsg("timeout waiting for RADIUS response from %s",
@@ -3139,6 +3153,10 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 			closesocket(sock);
 			return STATUS_ERROR;
 		}
+		else
+		{
+			Assert(r & WL_SOCKET_READABLE);
+		}
 
 		/*
 		 * Attempt to read the response packet, and verify the contents.
@@ -3246,4 +3264,6 @@ PerformRadiusTransaction(const char *server, const char *secret, const char *por
 			continue;
 		}
 	}							/* while (true) */
+
+	pg_unreachable();
 }
diff --git a/src/backend/libpq/hba.c b/src/backend/libpq/hba.c
index adedbd3128..3a33f52af1 100644
--- a/src/backend/libpq/hba.c
+++ b/src/backend/libpq/hba.c
@@ -2514,6 +2514,21 @@ parse_hba_auth_opt(char *name, char *val, HbaLine *hbaline,
 		hbaline->radiusidentifiers = parsed_identifiers;
 		hbaline->radiusidentifiers_s = pstrdup(val);
 	}
+	else if (strcmp(name, "radiustimeout") == 0)
+	{
+		REQUIRE_AUTH_OPTION(uaRADIUS, "radiustimeout", "radius");
+
+		if ((hbaline->radiustimeout = atoi(val)) <= 0)
+		{
+			ereport(elevel,
+					(errcode(ERRCODE_CONFIG_FILE_ERROR),
+					 errmsg("could not parse RADIUS timeout \"%s\"",
+							val),
+					 errcontext("line %d of configuration file \"%s\"",
+								line_num, file_name)));
+			return false;
+		}
+	}
 	else
 	{
 		ereport(elevel,
diff --git a/src/include/libpq/hba.h b/src/include/libpq/hba.h
index 189f6d0df2..3f051d0ab5 100644
--- a/src/include/libpq/hba.h
+++ b/src/include/libpq/hba.h
@@ -135,6 +135,7 @@ typedef struct HbaLine
 	char	   *radiusidentifiers_s;
 	List	   *radiusports;
 	char	   *radiusports_s;
+	int			radiustimeout;
 } HbaLine;
 
 typedef struct IdentLine
diff --git a/src/test/radius/t/001_auth.pl b/src/test/radius/t/001_auth.pl
index 44db62a3d7..6d2e6e7345 100644
--- a/src/test/radius/t/001_auth.pl
+++ b/src/test/radius/t/001_auth.pl
@@ -60,6 +60,7 @@ else
 }
 
 my $radius_port     = PostgreSQL::Test::Cluster::get_free_port();
+my $not_radius_port = PostgreSQL::Test::Cluster::get_free_port();
 
 note "setting up radiusd";
 
@@ -159,8 +160,9 @@ sub test_access
 note "enable RADIUS auth";
 
 unlink($node->data_dir . '/pg_hba.conf');
+my $timeout = $PostgreSQL::Test::Utils::timeout_default * 1000;
 $node->append_conf('pg_hba.conf',
-	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$radius_port"}
+	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$radius_port" radiustimeout="$timeout"}
 );
 $node->restart;
 
@@ -184,4 +186,16 @@ test_access(
 		qr/connection authenticated: identity="test2" method=radius/
 	],);
 
+# Set the timeout very short and point to a non-existent radius server
+unlink($node->data_dir . '/pg_hba.conf');
+$node->append_conf('pg_hba.conf',
+	qq{local all all radius radiusservers="127.0.0.1" radiussecrets="secret" radiusports="$not_radius_port" radiustimeout="2"}
+);
+$node->restart;
+
+test_access(
+	$node, 'test2', 2,
+	'authentication fails with timeout',
+	log_like => [qr/timeout waiting for RADIUS response/]);
+
 done_testing();
-- 
2.39.2

#8Michael Paquier
michael@paquier.xyz
In reply to: Thomas Munro (#7)
Re: RADIUS tests and improvements

On Sat, Mar 04, 2023 at 02:23:10PM +1300, Thomas Munro wrote:

I don't exactly love the PG_TRY()/PG_CATCH() around the
CHECK_FOR_INTERRUPTS().

In fact this kind of CFI-with-cleanup problem
has been haunting me across several projects. For cases that memory
contexts and resource owners can't help with, I don't currently know
what else to do here. Better ideas welcome.

Like adding a Open/CloseSocket() in fd.c to control the leaks?

If I just let that
socket leak because I know this backend will soon exit, I'd expect a
knock at the door from the programming police.

Hmm. It seems to me that you'd better have two patches instead of one
here? First, one to introduce the new parameter to control the
timeout, and a second to improve the responsiveness with a
WaitLatch()? If the CFI proves to be an issue, it would be sad to
have to revert the configuration part, which is worth on its own.

I don't actually know why we have
src/test/authentication/t/...{password,sasl,peer}..., but then
src/test/{kerberos,ldap,ssl}/t/001_auth.pl. For this one, I just
copied the second style, creating src/test/radius/t/001_auth.pl. I
can't explain why it should be like that, though. If I propose
another test for PAM, where should it go?

My take would be to keep the number of directories in src/test/ to a
minimum in the long run. Still, this is a case-by-case, as it depends
on if a set of tests needs an expanded set of modules, configuration
files and/or multiple scripts. ssl has its own set of configuration
files with its module, so it makes sense to be independent. ldap has
its LdapServer.pm with a configuration file, again I'm OK with a
separate case. Kerberos has its own README, but IMO it could also be
moved to src/test/authentication/ as it has a simple structure, with
its requirements moved into a different README.

What this patch set does for the RADIUS test is simple enough in
structure that I would also add it in src/test/authentication/. That
means less Make-fu and less Meson-fu.

In 0001, PG_TEST_EXTRA requires radius for the test. This needs an
update of regress.sgml where the values available are listed. I think
that you'd better document that freeradius is required for the test in
one of the README (either create a new one in radius/, or add this
information to the one in authentication, as you feel).
--
Michael

#9Daniel Gustafsson
daniel@yesql.se
In reply to: Michael Paquier (#8)
Re: RADIUS tests and improvements

Hi Thomas,

Have you have a chance to look at and address the feedback given in this
thread?

--
Daniel Gustafsson