libpq OpenSSL and multithreading

Started by Daniel Gustafsson10 months ago8 messageshackers
Jump to latest
#1Daniel Gustafsson
daniel@yesql.se

The OpenSSL code in libpq have two issues for multithreading: the verify_cb
callback use a global variable to pass back error detail state and there is one
use of strerror().

The attached fixes both, with no functional change, in order to pave the way
for multithreading:

* Rather than using a global variable the callback use a new member in the
Port struct for passing the string, and the Port struct is in turn passed as
private data in the SSL object
* The strerror call is replaced with a strerror_r call using the already
existing errorbuffer

These were tested on OpenSSL 1.1.1 through 3.5. Parking this in the next
commitfest.

--
Daniel Gustafsson

Attachments:

0001-libpq-Fix-multithreading-issues-in-the-OpenSSL-suppo.patchapplication/octet-stream; name=0001-libpq-Fix-multithreading-issues-in-the-OpenSSL-suppo.patch; x-unix-mode=0644Download+34-8
#2Peter Eisentraut
peter_e@gmx.net
In reply to: Daniel Gustafsson (#1)
Re: libpq OpenSSL and multithreading

On 27.06.25 19:24, Daniel Gustafsson wrote:

The OpenSSL code in libpq have two issues for multithreading: the verify_cb
callback use a global variable to pass back error detail state and there is one
use of strerror().

Slightly misleading title: This is actually about the *backend* libpq code.

The attached fixes both, with no functional change, in order to pave the way
for multithreading:

* Rather than using a global variable the callback use a new member in the
Port struct for passing the string, and the Port struct is in turn passed as
private data in the SSL object

Couldn't this also be done by making that global variable thread-local?
But getting rid of it is even nicer.

* The strerror call is replaced with a strerror_r call using the already
existing errorbuffer

This one was already discussed some time ago at
</messages/by-id/daa87d79-c044-46c4-8458-8d77241ed7b0@eisentraut.org&gt;:

But the bigger issue is that the use of a static buffer makes
this not thread-safe, so having it use strerror_r to fill that
buffer is just putting lipstick on a pig.

It looks like your patch doesn't address that?

#3Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#2)
Re: libpq OpenSSL and multithreading

On Wed, Jul 02, 2025 at 01:44:55PM +0200, Peter Eisentraut wrote:

Couldn't this also be done by making that global variable thread-local? But
getting rid of it is even nicer.

Getting rid of it like Daniel is proposing is by putting this
information in the backend Port is much more elegant IMO.
--
Michael

#4Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#2)
Re: libpq OpenSSL and multithreading

On 2 Jul 2025, at 13:44, Peter Eisentraut <peter@eisentraut.org> wrote:

On 27.06.25 19:24, Daniel Gustafsson wrote:

The OpenSSL code in libpq have two issues for multithreading: the verify_cb
callback use a global variable to pass back error detail state and there is one
use of strerror().

Slightly misleading title: This is actually about the *backend* libpq code.

Point taken, proposed commitmessage has been updated.

The attached fixes both, with no functional change, in order to pave the way
for multithreading:
* Rather than using a global variable the callback use a new member in the
Port struct for passing the string, and the Port struct is in turn passed as
private data in the SSL object

Couldn't this also be done by making that global variable thread-local? But getting rid of it is even nicer.

It absolutely could, and I briefly considered this approach, but I prefer
getting rid of it altogether.

* The strerror call is replaced with a strerror_r call using the already
existing errorbuffer

This one was already discussed some time ago at </messages/by-id/daa87d79-c044-46c4-8458-8d77241ed7b0@eisentraut.org&gt;:

But the bigger issue is that the use of a static buffer makes
this not thread-safe, so having it use strerror_r to fill that
buffer is just putting lipstick on a pig.

It looks like your patch doesn't address that?

Doh, of course. The attached v2 takes a stab at moving from using a static
buffer to a palloced buffer with the caller being responsible for freeing. In
paths which lead to proc_exit we can omit freeing.

--
Daniel Gustafsson

Attachments:

v2-0001-libpq-Make-SSL-errorhandling-in-backend-threadsaf.patchapplication/octet-stream; name=v2-0001-libpq-Make-SSL-errorhandling-in-backend-threadsaf.patch; x-unix-mode=0644Download+104-45
#5Peter Eisentraut
peter_e@gmx.net
In reply to: Daniel Gustafsson (#4)
Re: libpq OpenSSL and multithreading

On 03.07.25 18:01, Daniel Gustafsson wrote:

On 2 Jul 2025, at 13:44, Peter Eisentraut <peter@eisentraut.org> wrote:

On 27.06.25 19:24, Daniel Gustafsson wrote:

The OpenSSL code in libpq have two issues for multithreading: the verify_cb
callback use a global variable to pass back error detail state and there is one
use of strerror().

Slightly misleading title: This is actually about the *backend* libpq code.

Point taken, proposed commitmessage has been updated.

The attached fixes both, with no functional change, in order to pave the way
for multithreading:
* Rather than using a global variable the callback use a new member in the
Port struct for passing the string, and the Port struct is in turn passed as
private data in the SSL object

Couldn't this also be done by making that global variable thread-local? But getting rid of it is even nicer.

I suggest that instead of adding the context to the Port structure, make
a separate context struct for this purpose, for example:

struct verify_cb_context
{
const char *cert_errdetail;
};

int
be_tls_open_server(Port *port)
{
static struct verify_cb_context verify_cb_context;

SSL_set_ex_data(port->ssl, 0, &verify_cb_context);

This avoids the "circle motion" your patch describes.

It also avoids questions about whether Port.cert_errdetail is properly
cleaned up whenever be_tls_open_server() exits early.

It absolutely could, and I briefly considered this approach, but I prefer
getting rid of it altogether.

* The strerror call is replaced with a strerror_r call using the already
existing errorbuffer

This one was already discussed some time ago at </messages/by-id/daa87d79-c044-46c4-8458-8d77241ed7b0@eisentraut.org&gt;:

But the bigger issue is that the use of a static buffer makes
this not thread-safe, so having it use strerror_r to fill that
buffer is just putting lipstick on a pig.

It looks like your patch doesn't address that?

Doh, of course. The attached v2 takes a stab at moving from using a static
buffer to a palloced buffer with the caller being responsible for freeing. In
paths which lead to proc_exit we can omit freeing.

This seems like an extremely inconvenient solution, as can be seen by
the amount of changes your patch introduces. We could just make errbuf
thread-local and be done, without having to change the API. (This is
how glibc's strerror() works internally.)

#6Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#5)
Re: libpq OpenSSL and multithreading

On 1 Sep 2025, at 07:27, Peter Eisentraut <peter@eisentraut.org> wrote:

I suggest that instead of adding the context to the Port structure, make a separate context struct for this purpose, for example:

Fair enough, done in the attached.

This seems like an extremely inconvenient solution, as can be seen by the amount of changes your patch introduces. We could just make errbuf thread-local and be done, without having to change the API. (This is how glibc's strerror() works internally.)

I assume you mean simply leaving it be for now awaiting more thread primitives
to be added to fully support thread local storage? (sidenote; if our thread
local store code will use TLS then be-secure-openssl.c will be challenging to
read =)).

I've left out this portion in the attached and only left the callback private
data change.

--
Daniel Gustafsson

Attachments:

v3-0001-libpq-Make-SSL-certificate-callback-in-backend-th.patchapplication/octet-stream; name=v3-0001-libpq-Make-SSL-certificate-callback-in-backend-th.patch; x-unix-mode=0644Download+26-6
#7Peter Eisentraut
peter_e@gmx.net
In reply to: Daniel Gustafsson (#6)
Re: libpq OpenSSL and multithreading

On 22.10.25 10:59, Daniel Gustafsson wrote:

On 1 Sep 2025, at 07:27, Peter Eisentraut <peter@eisentraut.org> wrote:

I suggest that instead of adding the context to the Port structure, make a separate context struct for this purpose, for example:

Fair enough, done in the attached.

This looks good to me. (I would not have the CallbackErr typedef, since
that additional abstraction doesn't buy anything. But it's a small
difference.)

This seems like an extremely inconvenient solution, as can be seen by the amount of changes your patch introduces. We could just make errbuf thread-local and be done, without having to change the API. (This is how glibc's strerror() works internally.)

I assume you mean simply leaving it be for now awaiting more thread primitives
to be added to fully support thread local storage?

Yes

(sidenote; if our thread
local store code will use TLS then be-secure-openssl.c will be challenging to
read =)).

Yes, let's rename it to SSL to avoid this. ;-)

#8Daniel Gustafsson
daniel@yesql.se
In reply to: Peter Eisentraut (#7)
Re: libpq OpenSSL and multithreading

On 29 Oct 2025, at 10:41, Peter Eisentraut <peter@eisentraut.org> wrote:
On 22.10.25 10:59, Daniel Gustafsson wrote:

On 1 Sep 2025, at 07:27, Peter Eisentraut <peter@eisentraut.org> wrote:
I suggest that instead of adding the context to the Port structure, make a separate context struct for this purpose, for example:

Fair enough, done in the attached.

This looks good to me. (I would not have the CallbackErr typedef, since that additional abstraction doesn't buy anything. But it's a small difference.)

Thanks, I removed the typedef and went ahead with this patch.

(sidenote; if our thread
local store code will use TLS then be-secure-openssl.c will be challenging to
read =)).

Yes, let's rename it to SSL to avoid this. ;-)

Touché! =)

--
Daniel Gustafsson