cryptohash: missing locking functions for OpenSSL <= 1.0.2?

Started by Jacob Championalmost 5 years ago8 messages
#1Jacob Champion
pchampion@vmware.com

While reviewing the NSS patch [1]/messages/by-id/40095f48c3c6d556293cb0ecf80ea10cdf7d26b3.camel@vmware.com, I noticed that the cryptohash
implementation for OpenSSL doesn't set up any locking callbacks in
frontend code. I think there has to be a call to
OPENSSL_set_locking_callback() before libpq starts reaching into the
EVP_* API, if ENABLE_THREAD_SAFETY and HAVE_CRYPTO_LOCK are both true.

This would only affect threaded libpq clients running OpenSSL 1.0.2 and
below, and it looks like the most likely code path to be affected is
the OpenSSL error stack. So if anything went wrong with one of those
hash calls, it's possible that libpq would crash (or worse, silently
misbehave somewhere in the TLS stack) instead of gracefully reporting
an error. [2]https://github.com/openssl/openssl/issues/4690 is an example of this in the wild.

--Jacob

[1]: /messages/by-id/40095f48c3c6d556293cb0ecf80ea10cdf7d26b3.camel@vmware.com
[2]: https://github.com/openssl/openssl/issues/4690

#2Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#1)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Wed, Feb 17, 2021 at 06:34:36PM +0000, Jacob Champion wrote:

This would only affect threaded libpq clients running OpenSSL 1.0.2 and
below, and it looks like the most likely code path to be affected is
the OpenSSL error stack. So if anything went wrong with one of those
hash calls, it's possible that libpq would crash (or worse, silently
misbehave somewhere in the TLS stack) instead of gracefully reporting
an error. [2] is an example of this in the wild.

I have been discussing a bit this issue with Jacob, and that's a
problem we would need to address on HEAD. First, I have been looking
at this stuff in older versions with MD5 and SHA256 used by SCRAM when
it comes to ~13 with libpq:
- MD5 is based on the internal implementation of Postgres even when
building libpq with OpenSSL, so that would not be an issue.
- SHA256 is a different story though, because when building with
OpenSSL we would go through SHA256_{Init,Update,Final} for SCRAM
authentication. In the context of a SSL connection, the crypto part
is initialized. But that would *not* be the case of a connection in a
non-SSL context. Fortunately, and after looking at the OpenSSL code
(fips_md_init_ctx, HASH_UPDATE, etc.), there is no sign of locking
handling or errors, so I think that we are safe there.

Now comes the case of HEAD that uses EVP for MD5 and SHA256. A SSL
connection would initialize the crypto part, but that does not happen
for a non-SSL connection. So, logically, one could run into issues if
using MD5 or SCRAM with OpenSSL <= 1.0.2 (pgbench with a high number
of threads does not complain by the way), and we are not yet in a
stage where we should drop this version either, even if it has been
EOL'd by upstream at the end of 2019.

We have the code in place to properly initialize the crypto locking in
libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the
SSL and crypto initializations are grouped together. What we need to
do here is to split those phases of the initialization so as non-SSL
connections can use the crypto part properly, as pqsecure_initialize
gets only called now when libpq negotiates SSL with the postmaster.
It seems to me that we should make sure of a proper reset of the
crypto part within pqDropConnection(), while the initialization needs
to happen in PQconnectPoll().
--
Michael

#3Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#2)
1 attachment(s)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Thu, Feb 18, 2021 at 11:04:05AM +0900, Michael Paquier wrote:

We have the code in place to properly initialize the crypto locking in
libpq with ENABLE_THREAD_SAFETY, but the root of the issue is that the
SSL and crypto initializations are grouped together. What we need to
do here is to split those phases of the initialization so as non-SSL
connections can use the crypto part properly, as pqsecure_initialize
gets only called now when libpq negotiates SSL with the postmaster.
It seems to me that we should make sure of a proper reset of the
crypto part within pqDropConnection(), while the initialization needs
to happen in PQconnectPoll().

So, I have tried a couple of things with a debug build of OpenSSL
1.0.2 at hand (two locks for the crypto and SSL initializations but
SSL_new() grabs some random bytes that need the same callback to be
set or the state of the threads is messed up, some global states to
control the load), and the simplest solution I have come up with is to
control in each pg_conn if the crypto callbacks have been initialized
or not so as we avoid multiple inits and/or drops of the state for a
single connection. I have arrived at this conclusion after hunting
down cases with pqDropConnection() which would could be called
multiple times, particularly if there are connection attempts to
multiple hosts.

The attached patch implements things this way, and initializes the
crypto callbacks before sending the startup packet, before deciding if
SSL needs to be requested or not. I have played with several
threading scenarios with this patch, with and without OpenSSL, and the
numbers match in terms of callback loading and unloading (the global
counter used in fe-secure-openssl.c gets to zero).
--
Michael

Attachments:

cryptohash-libpq.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index db71fea169..1c05fa44a8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2882,6 +2882,13 @@ keep_going:						/* We will come back to here until there is
 				}
 #endif
 
+				/*
+				 * Enable the crypto callbacks before checking if SSL needs
+				 * to be done and before sending the startup packet.
+				 */
+				if (pqsecure_initialize(conn, false, true) < 0)
+					goto error_return;
+
 #ifdef USE_SSL
 
 				/*
@@ -2999,8 +3006,12 @@ keep_going:						/* We will come back to here until there is
 					{
 						/* mark byte consumed */
 						conn->inStart = conn->inCursor;
-						/* Set up global SSL state if required */
-						if (pqsecure_initialize(conn) != 0)
+						/*
+						 * Set up global SSL state if required.  The crypto
+						 * state has already been set, so there is no need
+						 * to make that happen again.
+						 */
+						if (pqsecure_initialize(conn, true, false) != 0)
 							goto error_return;
 					}
 					else if (SSLok == 'N')
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 0fa10a23b4..8b6f860d22 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -85,7 +85,7 @@ static bool pq_init_crypto_lib = true;
 static bool ssl_lib_initialized = false;
 
 #ifdef ENABLE_THREAD_SAFETY
-static long ssl_open_connections = 0;
+static long crypto_open_connections = 0;
 
 #ifndef WIN32
 static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
@@ -111,7 +111,7 @@ pgtls_init_library(bool do_ssl, int do_crypto)
 	 * Disallow changing the flags while we have open connections, else we'd
 	 * get completely confused.
 	 */
-	if (ssl_open_connections != 0)
+	if (crypto_open_connections != 0)
 		return;
 #endif
 
@@ -635,7 +635,7 @@ pq_lockingcallback(int mode, int n, const char *file, int line)
  * override it.
  */
 int
-pgtls_init(PGconn *conn)
+pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 #ifdef ENABLE_THREAD_SAFETY
 #ifdef WIN32
@@ -684,16 +684,21 @@ pgtls_init(PGconn *conn)
 			}
 		}
 
-		if (ssl_open_connections++ == 0)
+		if (do_crypto && !conn->crypto_loaded)
 		{
-			/*
-			 * These are only required for threaded libcrypto applications,
-			 * but make sure we don't stomp on them if they're already set.
-			 */
-			if (CRYPTO_get_id_callback() == NULL)
-				CRYPTO_set_id_callback(pq_threadidcallback);
-			if (CRYPTO_get_locking_callback() == NULL)
-				CRYPTO_set_locking_callback(pq_lockingcallback);
+			if (crypto_open_connections++ == 0)
+			{
+				/*
+				 * These are only required for threaded libcrypto applications,
+				 * but make sure we don't stomp on them if they're already set.
+				 */
+				if (CRYPTO_get_id_callback() == NULL)
+					CRYPTO_set_id_callback(pq_threadidcallback);
+				if (CRYPTO_get_locking_callback() == NULL)
+					CRYPTO_set_locking_callback(pq_lockingcallback);
+			}
+
+			conn->crypto_loaded = true;
 		}
 	}
 #endif							/* HAVE_CRYPTO_LOCK */
@@ -701,17 +706,20 @@ pgtls_init(PGconn *conn)
 
 	if (!ssl_lib_initialized)
 	{
-		if (pq_init_ssl_lib)
+		if (do_ssl)
 		{
+			if (pq_init_ssl_lib)
+			{
 #ifdef HAVE_OPENSSL_INIT_SSL
-			OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
+				OPENSSL_init_ssl(OPENSSL_INIT_LOAD_CONFIG, NULL);
 #else
-			OPENSSL_config(NULL);
-			SSL_library_init();
-			SSL_load_error_strings();
+				OPENSSL_config(NULL);
+				SSL_library_init();
+				SSL_load_error_strings();
 #endif
+			}
+			ssl_lib_initialized = true;
 		}
-		ssl_lib_initialized = true;
 	}
 
 #ifdef ENABLE_THREAD_SAFETY
@@ -740,10 +748,10 @@ destroy_ssl_system(void)
 	if (pthread_mutex_lock(&ssl_config_mutex))
 		return;
 
-	if (pq_init_crypto_lib && ssl_open_connections > 0)
-		--ssl_open_connections;
+	if (pq_init_crypto_lib && crypto_open_connections > 0)
+		--crypto_open_connections;
 
-	if (pq_init_crypto_lib && ssl_open_connections == 0)
+	if (pq_init_crypto_lib && crypto_open_connections == 0)
 	{
 		/*
 		 * No connections left, unregister libcrypto callbacks, if no one
@@ -1400,48 +1408,68 @@ open_client_SSL(PGconn *conn)
 void
 pgtls_close(PGconn *conn)
 {
-	bool		destroy_needed = false;
-
-	if (conn->ssl)
+	if (conn->ssl_in_use)
 	{
-		/*
-		 * We can't destroy everything SSL-related here due to the possible
-		 * later calls to OpenSSL routines which may need our thread
-		 * callbacks, so set a flag here and check at the end.
-		 */
-		destroy_needed = true;
+		bool		destroy_needed = false;
 
-		SSL_shutdown(conn->ssl);
-		SSL_free(conn->ssl);
-		conn->ssl = NULL;
-		conn->ssl_in_use = false;
-	}
+		if (conn->ssl)
+		{
+			/*
+			 * We can't destroy everything SSL-related here due to the possible
+			 * later calls to OpenSSL routines which may need our thread
+			 * callbacks, so set a flag here and check at the end.
+			 */
 
-	if (conn->peer)
-	{
-		X509_free(conn->peer);
-		conn->peer = NULL;
-	}
+			SSL_shutdown(conn->ssl);
+			SSL_free(conn->ssl);
+			conn->ssl = NULL;
+			conn->ssl_in_use = false;
+
+			destroy_needed = true;
+		}
+
+		if (conn->peer)
+		{
+			X509_free(conn->peer);
+			conn->peer = NULL;
+		}
 
 #ifdef USE_SSL_ENGINE
-	if (conn->engine)
-	{
-		ENGINE_finish(conn->engine);
-		ENGINE_free(conn->engine);
-		conn->engine = NULL;
-	}
+		if (conn->engine)
+		{
+			ENGINE_finish(conn->engine);
+			ENGINE_free(conn->engine);
+			conn->engine = NULL;
+		}
 #endif
 
-	/*
-	 * This will remove our SSL locking hooks, if this is the last SSL
-	 * connection, which means we must wait to call it until after all SSL
-	 * calls have been made, otherwise we can end up with a race condition and
-	 * possible deadlocks.
-	 *
-	 * See comments above destroy_ssl_system().
-	 */
-	if (destroy_needed)
-		destroy_ssl_system();
+		/*
+		 * This will remove our crypto locking hooks, if this is the last SSL
+		 * connection, which means we must wait to call it until after all SSL
+		 * calls have been made, otherwise we can end up with a race condition
+		 * and possible deadlocks.
+		 *
+		 * See comments above destroy_ssl_system().
+		 */
+		if (destroy_needed && conn->crypto_loaded)
+		{
+			destroy_ssl_system();
+			conn->crypto_loaded = false;
+		}
+	}
+	else
+	{
+		/*
+		 * In the non-SSL case, just remove the crypto callbacks if they
+		 * are set.  This will be done once this is the last connection
+		 * to handle.
+		 */
+		if (conn->crypto_loaded)
+		{
+			destroy_ssl_system();
+			conn->crypto_loaded = false;
+		}
+	}
 }
 
 
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index c601071838..b15d8d137c 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -159,12 +159,12 @@ PQinitOpenSSL(int do_ssl, int do_crypto)
  *	Initialize global SSL context
  */
 int
-pqsecure_initialize(PGconn *conn)
+pqsecure_initialize(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 	int			r = 0;
 
 #ifdef USE_SSL
-	r = pgtls_init(conn);
+	r = pgtls_init(conn, do_ssl, do_crypto);
 #endif
 
 	return r;
@@ -191,8 +191,7 @@ void
 pqsecure_close(PGconn *conn)
 {
 #ifdef USE_SSL
-	if (conn->ssl_in_use)
-		pgtls_close(conn);
+	pgtls_close(conn);
 #endif
 }
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index ce36aabd25..fd42885153 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -485,6 +485,12 @@ struct pg_conn
 	void	   *engine;			/* dummy field to keep struct the same if
 								 * OpenSSL version changes */
 #endif
+	bool		crypto_loaded;	/* Track if libcrypto locking callbacks
+								 * have been done for this connection.
+								 * This can be removed once support for
+								 * OpenSSL 1.0.2 is removed as this
+								 * this locking is handled internally in
+								 * OpenSSL >= 1.1.0. */
 #endif							/* USE_OPENSSL */
 #endif							/* USE_SSL */
 
@@ -682,7 +688,7 @@ extern int	pqWriteReady(PGconn *conn);
 
 /* === in fe-secure.c === */
 
-extern int	pqsecure_initialize(PGconn *);
+extern int	pqsecure_initialize(PGconn *, bool, bool);
 extern PostgresPollingStatusType pqsecure_open_client(PGconn *);
 extern void pqsecure_close(PGconn *);
 extern ssize_t pqsecure_read(PGconn *, void *ptr, size_t len);
@@ -715,7 +721,7 @@ extern void pgtls_init_library(bool do_ssl, int do_crypto);
  *
  * Returns 0 if OK, -1 on failure (adding a message to conn->errorMessage).
  */
-extern int	pgtls_init(PGconn *conn);
+extern int	pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto);
 
 /*
  *	Begin or continue negotiating a secure session.
#4Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#3)
2 attachment(s)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Fri, Feb 19, 2021 at 02:37:06PM +0900, Michael Paquier wrote:

The attached patch implements things this way, and initializes the
crypto callbacks before sending the startup packet, before deciding if
SSL needs to be requested or not. I have played with several
threading scenarios with this patch, with and without OpenSSL, and the
numbers match in terms of callback loading and unloading (the global
counter used in fe-secure-openssl.c gets to zero).

I have done more work and much more tests with this patch, polishing
things as of the attached v2. First, I don't see any performance
impact or concurrency issues, using up to 200 threads with pgbench -C
-n -j N -c N -f blah.sql where the SQL file includes a single
meta-command like that for instance:
\set a 1

This ensures that connection requests happen a maximum in concurrency,
and libpq stays close to the maximum for the number of open threads.
Attached is a second, simple program that I have used to stress the
case of threads using both SSL and non-SSL connections in parallel to
check for the consistency of the callbacks and their release, mainly
across MD5 and SCRAM.

Extra eyes are welcome here, though I feel comfortable with the
approach taken here.
--
Michael

Attachments:

pthread_libpq.ctext/x-csrc; charset=us-asciiDownload
cryptohash-libpq-v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 9812a14662..f23942ab2b 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2918,6 +2918,16 @@ keep_going:						/* We will come back to here until there is
 
 #ifdef USE_SSL
 
+				/*
+				 * Enable the libcrypto callbacks before checking if SSL needs
+				 * to be done.  This is done before sending the startup packet
+				 * as depending on the type of authentication done, like MD5
+				 * or SCRAM, the callbacks would be required even without a
+				 * SSL connection
+				 */
+				if (pqsecure_initialize(conn, false, true) < 0)
+					goto error_return;
+
 				/*
 				 * If SSL is enabled and we haven't already got encryption of
 				 * some sort running, request SSL instead of sending the
@@ -3033,8 +3043,14 @@ keep_going:						/* We will come back to here until there is
 					{
 						/* mark byte consumed */
 						conn->inStart = conn->inCursor;
-						/* Set up global SSL state if required */
-						if (pqsecure_initialize(conn) != 0)
+
+						/*
+						 * Set up global SSL state if required.  The crypto
+						 * state has already been set if libpq took care of
+						 * doing that, so there is no need to make that happen
+						 * again.
+						 */
+						if (pqsecure_initialize(conn, true, false) != 0)
 							goto error_return;
 					}
 					else if (SSLok == 'N')
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 0fa10a23b4..dbac824702 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -85,7 +85,7 @@ static bool pq_init_crypto_lib = true;
 static bool ssl_lib_initialized = false;
 
 #ifdef ENABLE_THREAD_SAFETY
-static long ssl_open_connections = 0;
+static long crypto_open_connections = 0;
 
 #ifndef WIN32
 static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
@@ -111,7 +111,7 @@ pgtls_init_library(bool do_ssl, int do_crypto)
 	 * Disallow changing the flags while we have open connections, else we'd
 	 * get completely confused.
 	 */
-	if (ssl_open_connections != 0)
+	if (crypto_open_connections != 0)
 		return;
 #endif
 
@@ -635,7 +635,7 @@ pq_lockingcallback(int mode, int n, const char *file, int line)
  * override it.
  */
 int
-pgtls_init(PGconn *conn)
+pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 #ifdef ENABLE_THREAD_SAFETY
 #ifdef WIN32
@@ -684,22 +684,28 @@ pgtls_init(PGconn *conn)
 			}
 		}
 
-		if (ssl_open_connections++ == 0)
+		if (do_crypto && !conn->crypto_loaded)
 		{
-			/*
-			 * These are only required for threaded libcrypto applications,
-			 * but make sure we don't stomp on them if they're already set.
-			 */
-			if (CRYPTO_get_id_callback() == NULL)
-				CRYPTO_set_id_callback(pq_threadidcallback);
-			if (CRYPTO_get_locking_callback() == NULL)
-				CRYPTO_set_locking_callback(pq_lockingcallback);
+			if (crypto_open_connections++ == 0)
+			{
+				/*
+				 * These are only required for threaded libcrypto
+				 * applications, but make sure we don't stomp on them if
+				 * they're already set.
+				 */
+				if (CRYPTO_get_id_callback() == NULL)
+					CRYPTO_set_id_callback(pq_threadidcallback);
+				if (CRYPTO_get_locking_callback() == NULL)
+					CRYPTO_set_locking_callback(pq_lockingcallback);
+			}
+
+			conn->crypto_loaded = true;
 		}
 	}
 #endif							/* HAVE_CRYPTO_LOCK */
 #endif							/* ENABLE_THREAD_SAFETY */
 
-	if (!ssl_lib_initialized)
+	if (!ssl_lib_initialized && do_ssl)
 	{
 		if (pq_init_ssl_lib)
 		{
@@ -740,10 +746,10 @@ destroy_ssl_system(void)
 	if (pthread_mutex_lock(&ssl_config_mutex))
 		return;
 
-	if (pq_init_crypto_lib && ssl_open_connections > 0)
-		--ssl_open_connections;
+	if (pq_init_crypto_lib && crypto_open_connections > 0)
+		--crypto_open_connections;
 
-	if (pq_init_crypto_lib && ssl_open_connections == 0)
+	if (pq_init_crypto_lib && crypto_open_connections == 0)
 	{
 		/*
 		 * No connections left, unregister libcrypto callbacks, if no one
@@ -1402,46 +1408,61 @@ pgtls_close(PGconn *conn)
 {
 	bool		destroy_needed = false;
 
-	if (conn->ssl)
+	if (conn->ssl_in_use)
 	{
-		/*
-		 * We can't destroy everything SSL-related here due to the possible
-		 * later calls to OpenSSL routines which may need our thread
-		 * callbacks, so set a flag here and check at the end.
-		 */
-		destroy_needed = true;
+		if (conn->ssl)
+		{
+			/*
+			 * We can't destroy everything SSL-related here due to the
+			 * possible later calls to OpenSSL routines which may need our
+			 * thread callbacks, so set a flag here and check at the end.
+			 */
 
-		SSL_shutdown(conn->ssl);
-		SSL_free(conn->ssl);
-		conn->ssl = NULL;
-		conn->ssl_in_use = false;
-	}
+			SSL_shutdown(conn->ssl);
+			SSL_free(conn->ssl);
+			conn->ssl = NULL;
+			conn->ssl_in_use = false;
 
-	if (conn->peer)
-	{
-		X509_free(conn->peer);
-		conn->peer = NULL;
-	}
+			destroy_needed = true;
+		}
+
+		if (conn->peer)
+		{
+			X509_free(conn->peer);
+			conn->peer = NULL;
+		}
 
 #ifdef USE_SSL_ENGINE
-	if (conn->engine)
-	{
-		ENGINE_finish(conn->engine);
-		ENGINE_free(conn->engine);
-		conn->engine = NULL;
-	}
+		if (conn->engine)
+		{
+			ENGINE_finish(conn->engine);
+			ENGINE_free(conn->engine);
+			conn->engine = NULL;
+		}
 #endif
+	}
+	else
+	{
+		/*
+		 * In the non-SSL case, just remove the crypto callbacks.  This code
+		 * path has no dependency on any pending SSL calls.
+		 */
+		destroy_needed = true;
+	}
 
 	/*
-	 * This will remove our SSL locking hooks, if this is the last SSL
-	 * connection, which means we must wait to call it until after all SSL
-	 * calls have been made, otherwise we can end up with a race condition and
-	 * possible deadlocks.
+	 * This will remove our crypto locking hooks if this is the last
+	 * connection using libcrypto which means we must wait to call it until
+	 * after all the potential SSL calls have been made, otherwise we can end
+	 * up with a race condition and possible deadlocks.
 	 *
 	 * See comments above destroy_ssl_system().
 	 */
-	if (destroy_needed)
+	if (destroy_needed && conn->crypto_loaded)
+	{
 		destroy_ssl_system();
+		conn->crypto_loaded = false;
+	}
 }
 
 
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index c601071838..b15d8d137c 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -159,12 +159,12 @@ PQinitOpenSSL(int do_ssl, int do_crypto)
  *	Initialize global SSL context
  */
 int
-pqsecure_initialize(PGconn *conn)
+pqsecure_initialize(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 	int			r = 0;
 
 #ifdef USE_SSL
-	r = pgtls_init(conn);
+	r = pgtls_init(conn, do_ssl, do_crypto);
 #endif
 
 	return r;
@@ -191,8 +191,7 @@ void
 pqsecure_close(PGconn *conn)
 {
 #ifdef USE_SSL
-	if (conn->ssl_in_use)
-		pgtls_close(conn);
+	pgtls_close(conn);
 #endif
 }
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 0c9e95f1a7..b045915e6d 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -506,6 +506,11 @@ struct pg_conn
 	void	   *engine;			/* dummy field to keep struct the same if
 								 * OpenSSL version changes */
 #endif
+	bool		crypto_loaded;	/* Track if libcrypto locking callbacks have
+								 * been done for this connection. This can be
+								 * removed once support for OpenSSL 1.0.2 is
+								 * removed as this this locking is handled
+								 * internally in OpenSSL >= 1.1.0. */
 #endif							/* USE_OPENSSL */
 #endif							/* USE_SSL */
 
@@ -703,7 +708,7 @@ extern int	pqWriteReady(PGconn *conn);
 
 /* === in fe-secure.c === */
 
-extern int	pqsecure_initialize(PGconn *);
+extern int	pqsecure_initialize(PGconn *, bool, bool);
 extern PostgresPollingStatusType pqsecure_open_client(PGconn *);
 extern void pqsecure_close(PGconn *);
 extern ssize_t pqsecure_read(PGconn *, void *ptr, size_t len);
@@ -732,11 +737,13 @@ extern void pgtls_init_library(bool do_ssl, int do_crypto);
  * Initialize SSL library.
  *
  * The conn parameter is only used to be able to pass back an error
- * message - no connection-local setup is made here.
+ * message - no connection-local setup is made here.  do_ssl controls
+ * if SSL is initialized, and do_crypto does the same for the crypto
+ * part.
  *
  * Returns 0 if OK, -1 on failure (adding a message to conn->errorMessage).
  */
-extern int	pgtls_init(PGconn *conn);
+extern int	pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto);
 
 /*
  *	Begin or continue negotiating a secure session.
#5Jacob Champion
pchampion@vmware.com
In reply to: Michael Paquier (#4)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Wed, 2021-03-03 at 15:30 +0900, Michael Paquier wrote:

Extra eyes are welcome here, though I feel comfortable with the
approach taken here.

I have one suggestion for the new logic:

else
{
/*
* In the non-SSL case, just remove the crypto callbacks. This code
* path has no dependency on any pending SSL calls.
*/
destroy_needed = true;
}
[...]
if (destroy_needed && conn->crypto_loaded)
{
destroy_ssl_system();
conn->crypto_loaded = false;
}

I had to convince myself that this logic is correct -- we set
destroy_needed even if crypto is not enabled, but then check later to
make sure that crypto_loaded is true before doing anything. What would
you think about moving the conn->crypto_loaded check to the else
branch, so that destroy_needed is only set if we actually need it?

Either way, the patch looks good to me and behaves nicely in testing.
Thanks!

--Jacob

#6Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#5)
1 attachment(s)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Mon, Mar 08, 2021 at 06:06:32PM +0000, Jacob Champion wrote:

I had to convince myself that this logic is correct -- we set
destroy_needed even if crypto is not enabled, but then check later to
make sure that crypto_loaded is true before doing anything. What would
you think about moving the conn->crypto_loaded check to the else
branch, so that destroy_needed is only set if we actually need it?

Do you mean something like the attached? If I recall my mood from the
moment, I think that I did that to be more careful with the case where
the client has its own set of callbacks set (pq_init_crypto_lib as
false) but that does not matter as this is double-checked in
destroy_ssl_system(). I have adjusted some comments after more
review.
--
Michael

Attachments:

cryptohash-libpq-v3.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 29054bad7b..e4a85c37c4 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2887,6 +2887,16 @@ keep_going:						/* We will come back to here until there is
 
 #ifdef USE_SSL
 
+				/*
+				 * Enable the libcrypto callbacks before checking if SSL needs
+				 * to be done.  This is done before sending the startup packet
+				 * as depending on the type of authentication done, like MD5
+				 * or SCRAM, the callbacks would be required even without a
+				 * SSL connection
+				 */
+				if (pqsecure_initialize(conn, false, true) < 0)
+					goto error_return;
+
 				/*
 				 * If SSL is enabled and we haven't already got encryption of
 				 * some sort running, request SSL instead of sending the
@@ -2998,8 +3008,14 @@ keep_going:						/* We will come back to here until there is
 					{
 						/* mark byte consumed */
 						conn->inStart = conn->inCursor;
-						/* Set up global SSL state if required */
-						if (pqsecure_initialize(conn) != 0)
+
+						/*
+						 * Set up global SSL state if required.  The crypto
+						 * state has already been set if libpq took care of
+						 * doing that, so there is no need to make that happen
+						 * again.
+						 */
+						if (pqsecure_initialize(conn, true, false) != 0)
 							goto error_return;
 					}
 					else if (SSLok == 'N')
diff --git a/src/interfaces/libpq/fe-secure-openssl.c b/src/interfaces/libpq/fe-secure-openssl.c
index 0fa10a23b4..3de8f89264 100644
--- a/src/interfaces/libpq/fe-secure-openssl.c
+++ b/src/interfaces/libpq/fe-secure-openssl.c
@@ -85,7 +85,7 @@ static bool pq_init_crypto_lib = true;
 static bool ssl_lib_initialized = false;
 
 #ifdef ENABLE_THREAD_SAFETY
-static long ssl_open_connections = 0;
+static long crypto_open_connections = 0;
 
 #ifndef WIN32
 static pthread_mutex_t ssl_config_mutex = PTHREAD_MUTEX_INITIALIZER;
@@ -111,7 +111,7 @@ pgtls_init_library(bool do_ssl, int do_crypto)
 	 * Disallow changing the flags while we have open connections, else we'd
 	 * get completely confused.
 	 */
-	if (ssl_open_connections != 0)
+	if (crypto_open_connections != 0)
 		return;
 #endif
 
@@ -635,7 +635,7 @@ pq_lockingcallback(int mode, int n, const char *file, int line)
  * override it.
  */
 int
-pgtls_init(PGconn *conn)
+pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 #ifdef ENABLE_THREAD_SAFETY
 #ifdef WIN32
@@ -684,22 +684,28 @@ pgtls_init(PGconn *conn)
 			}
 		}
 
-		if (ssl_open_connections++ == 0)
+		if (do_crypto && !conn->crypto_loaded)
 		{
-			/*
-			 * These are only required for threaded libcrypto applications,
-			 * but make sure we don't stomp on them if they're already set.
-			 */
-			if (CRYPTO_get_id_callback() == NULL)
-				CRYPTO_set_id_callback(pq_threadidcallback);
-			if (CRYPTO_get_locking_callback() == NULL)
-				CRYPTO_set_locking_callback(pq_lockingcallback);
+			if (crypto_open_connections++ == 0)
+			{
+				/*
+				 * These are only required for threaded libcrypto
+				 * applications, but make sure we don't stomp on them if
+				 * they're already set.
+				 */
+				if (CRYPTO_get_id_callback() == NULL)
+					CRYPTO_set_id_callback(pq_threadidcallback);
+				if (CRYPTO_get_locking_callback() == NULL)
+					CRYPTO_set_locking_callback(pq_lockingcallback);
+			}
+
+			conn->crypto_loaded = true;
 		}
 	}
 #endif							/* HAVE_CRYPTO_LOCK */
 #endif							/* ENABLE_THREAD_SAFETY */
 
-	if (!ssl_lib_initialized)
+	if (!ssl_lib_initialized && do_ssl)
 	{
 		if (pq_init_ssl_lib)
 		{
@@ -740,10 +746,10 @@ destroy_ssl_system(void)
 	if (pthread_mutex_lock(&ssl_config_mutex))
 		return;
 
-	if (pq_init_crypto_lib && ssl_open_connections > 0)
-		--ssl_open_connections;
+	if (pq_init_crypto_lib && crypto_open_connections > 0)
+		--crypto_open_connections;
 
-	if (pq_init_crypto_lib && ssl_open_connections == 0)
+	if (pq_init_crypto_lib && crypto_open_connections == 0)
 	{
 		/*
 		 * No connections left, unregister libcrypto callbacks, if no one
@@ -1402,46 +1408,63 @@ pgtls_close(PGconn *conn)
 {
 	bool		destroy_needed = false;
 
-	if (conn->ssl)
+	if (conn->ssl_in_use)
 	{
-		/*
-		 * We can't destroy everything SSL-related here due to the possible
-		 * later calls to OpenSSL routines which may need our thread
-		 * callbacks, so set a flag here and check at the end.
-		 */
-		destroy_needed = true;
+		if (conn->ssl)
+		{
+			/*
+			 * We can't destroy everything SSL-related here due to the
+			 * possible later calls to OpenSSL routines which may need our
+			 * thread callbacks, so set a flag here and check at the end.
+			 */
 
-		SSL_shutdown(conn->ssl);
-		SSL_free(conn->ssl);
-		conn->ssl = NULL;
-		conn->ssl_in_use = false;
-	}
+			SSL_shutdown(conn->ssl);
+			SSL_free(conn->ssl);
+			conn->ssl = NULL;
+			conn->ssl_in_use = false;
 
-	if (conn->peer)
-	{
-		X509_free(conn->peer);
-		conn->peer = NULL;
-	}
+			destroy_needed = true;
+		}
+
+		if (conn->peer)
+		{
+			X509_free(conn->peer);
+			conn->peer = NULL;
+		}
 
 #ifdef USE_SSL_ENGINE
-	if (conn->engine)
-	{
-		ENGINE_finish(conn->engine);
-		ENGINE_free(conn->engine);
-		conn->engine = NULL;
-	}
+		if (conn->engine)
+		{
+			ENGINE_finish(conn->engine);
+			ENGINE_free(conn->engine);
+			conn->engine = NULL;
+		}
 #endif
+	}
+	else
+	{
+		/*
+		 * In the non-SSL case, just remove the crypto callbacks if they
+		 * are loaded.  This code path has no dependency on any pending
+		 * SSL calls.
+		 */
+		if (conn->crypto_loaded)
+			destroy_needed = true;
+	}
 
 	/*
-	 * This will remove our SSL locking hooks, if this is the last SSL
-	 * connection, which means we must wait to call it until after all SSL
-	 * calls have been made, otherwise we can end up with a race condition and
-	 * possible deadlocks.
+	 * This will remove our crypto locking hooks if this is the last
+	 * connection using libcrypto which means we must wait to call it until
+	 * after all the potential SSL calls have been made, otherwise we can end
+	 * up with a race condition and possible deadlocks.
 	 *
 	 * See comments above destroy_ssl_system().
 	 */
 	if (destroy_needed)
+	{
 		destroy_ssl_system();
+		conn->crypto_loaded = false;
+	}
 }
 
 
diff --git a/src/interfaces/libpq/fe-secure.c b/src/interfaces/libpq/fe-secure.c
index c601071838..b15d8d137c 100644
--- a/src/interfaces/libpq/fe-secure.c
+++ b/src/interfaces/libpq/fe-secure.c
@@ -159,12 +159,12 @@ PQinitOpenSSL(int do_ssl, int do_crypto)
  *	Initialize global SSL context
  */
 int
-pqsecure_initialize(PGconn *conn)
+pqsecure_initialize(PGconn *conn, bool do_ssl, bool do_crypto)
 {
 	int			r = 0;
 
 #ifdef USE_SSL
-	r = pgtls_init(conn);
+	r = pgtls_init(conn, do_ssl, do_crypto);
 #endif
 
 	return r;
@@ -191,8 +191,7 @@ void
 pqsecure_close(PGconn *conn)
 {
 #ifdef USE_SSL
-	if (conn->ssl_in_use)
-		pgtls_close(conn);
+	pgtls_close(conn);
 #endif
 }
 
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index adf149a76f..2f052f61f8 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -486,6 +486,11 @@ struct pg_conn
 	void	   *engine;			/* dummy field to keep struct the same if
 								 * OpenSSL version changes */
 #endif
+	bool		crypto_loaded;	/* Track if libcrypto locking callbacks have
+								 * been done for this connection. This can be
+								 * removed once support for OpenSSL 1.0.2 is
+								 * removed as this locking is handled
+								 * internally in OpenSSL >= 1.1.0. */
 #endif							/* USE_OPENSSL */
 #endif							/* USE_SSL */
 
@@ -667,7 +672,7 @@ extern int	pqWriteReady(PGconn *conn);
 
 /* === in fe-secure.c === */
 
-extern int	pqsecure_initialize(PGconn *);
+extern int	pqsecure_initialize(PGconn *, bool, bool);
 extern PostgresPollingStatusType pqsecure_open_client(PGconn *);
 extern void pqsecure_close(PGconn *);
 extern ssize_t pqsecure_read(PGconn *, void *ptr, size_t len);
@@ -696,11 +701,13 @@ extern void pgtls_init_library(bool do_ssl, int do_crypto);
  * Initialize SSL library.
  *
  * The conn parameter is only used to be able to pass back an error
- * message - no connection-local setup is made here.
+ * message - no connection-local setup is made here.  do_ssl controls
+ * if SSL is initialized, and do_crypto does the same for the crypto
+ * part.
  *
  * Returns 0 if OK, -1 on failure (adding a message to conn->errorMessage).
  */
-extern int	pgtls_init(PGconn *conn);
+extern int	pgtls_init(PGconn *conn, bool do_ssl, bool do_crypto);
 
 /*
  *	Begin or continue negotiating a secure session.
#7Jacob Champion
pchampion@vmware.com
In reply to: Michael Paquier (#6)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Wed, 2021-03-10 at 17:21 +0900, Michael Paquier wrote:

Do you mean something like the attached?

Yes! Patch LGTM.

--Jacob

#8Michael Paquier
michael@paquier.xyz
In reply to: Jacob Champion (#7)
Re: cryptohash: missing locking functions for OpenSSL <= 1.0.2?

On Wed, Mar 10, 2021 at 05:05:38PM +0000, Jacob Champion wrote:

On Wed, 2021-03-10 at 17:21 +0900, Michael Paquier wrote:

Do you mean something like the attached?

Yes! Patch LGTM.

Thanks Jacob for double-checking. I have looked at that again slowly
today, and applied it after some light adjustments in the comments.
--
Michael