Direct SSL connection and ALPN loose ends

Started by Heikki Linnakangasover 1 year ago23 messages
#1Heikki Linnakangas
hlinnaka@iki.fi

There's been a bunch of bugs, and discussion on the intended behavior of
sslnegotiation and ALPN. This email summarizes the current status:

## Status and loose ends for beta1

All reported bugs have now been fixed. We now enforce ALPN in all the
right places. Please let me know if I missed something.

There are two open items remaining that I intend to address in the next
few days, before beta1:

- I am going to rename sslnegotiation=requiredirect to
sslnegotiation=directonly. I acknowledge that there is still some debate
on this: Jacob (and Robert?) would prefer to change the behavior
instead, so that sslnegotiation=requiredirect would also imply or
require sslmode=require, while IMHO the settings should be orthogonal so
that sslmode controls whether SSL is used or not, and sslnegotiation
controls how the SSL layer is negotiated when SSL is used. Given that
they are orthogonal, "directonly" is a better name. I will also take
another look at the documentation, if it needs clarification on that
point. If you have more comments on whether this is a good idea or not
or how sslnegotiation should work, please reply on the other thread,
let's keep this one focused on the overall status. [1]/messages/by-id/CA+TgmobV9JEk4AFy61Xw+2+cCTBqdTsDopkeB+gb81kq3f-o6A@mail.gmail.com

- The registration of the ALPN name with IANA hasn't been finished yet
[2]: https://mailarchive.ietf.org/arch/msg/tls-reg-review/9LWPzQfOpbc8dTT7vc9ahNeNaiw/
I changed the request to "postgresql". The string we have in 'master' is
currently "TBD-pgsql". I'm very confident that the registration will go
through with "postgresql", so my plan is to commit that change before
beta1, even if the IANA process hasn't completed by then.

## V18 material

- Add an option to disable traditional SSL negotiation in the server.
There was discussion on doing this via HBA rules or as a global option,
and the consensus seems to be for a global option. This would be just to
reduce the attach surface, there is no known vulnerabilities or other
issues with the traditional negotiation. And maybe to help with testing. [3]/messages/by-id/CA+TgmoaLpDVY2ywqQUfxvKEQZ+nwkabcw_f=i4Zyivt9CLjcmA@mail.gmail.com

These are not directly related to sslnegotiation, but came up in the
discussion:

- Clarify the situation with sslmode=require and gssencmode=require
combination, by replacing sslmode and gssencmode options with a single
"encryption=[ssl|gss|none], [...]" option. [4]/messages/by-id/3a6f126c-e1aa-4dcc-9252-9868308f6cf0@iki.fi

- Make sslmode=require the default. This is orthogonal to the SSL
negotiation, but I think the root cause for the disagreements on
sslnegotiation is actually that we'd like SSL to be the default. [5]/messages/by-id/CA+TgmoaNkRerEmB9JPgW0FhcJAe337AA=5kp6je9KekQhhRbmA@mail.gmail.com

The details of these need to be hashed out, in particular the
backwards-compatibility and migration aspects, but the consensus seems
to be that it's the right direction.

## V19 and beyond

In the future, once v17 is ubiquitous and the ecosystem (pgbouncer etc)
have added direct SSL support, we can change the default sslnegotiation
from 'postgres' to 'direct'. I'm thinking 3-5 years from now. In the
more distant future, we could remove the traditional SSLRequest
negotiation altogether and always use direct SSL negotiation.

There's no rush on these.

## Retrospective

There were a lot more cleanups required for this work than I expected,
given that there were little changes to the patches between January and
March commitfests. I was mostly worried about the refactoring of the
retry logic in libpq (and about the pre-existing logic too to be honest,
it was complicated before these changes already). That's why I added a
lot more tests for that. However, I did not foresee all the ALPN related
issues. In hindsight, it would have been good to commit most of the ALPN
changes first, and with more tests. Jacob wrote a python test suite; I
should've played more with that, that could have demonstrated the ALPN
issues earlier.

[1]: /messages/by-id/CA+TgmobV9JEk4AFy61Xw+2+cCTBqdTsDopkeB+gb81kq3f-o6A@mail.gmail.com
/messages/by-id/CA+TgmobV9JEk4AFy61Xw+2+cCTBqdTsDopkeB+gb81kq3f-o6A@mail.gmail.com

[2]: https://mailarchive.ietf.org/arch/msg/tls-reg-review/9LWPzQfOpbc8dTT7vc9ahNeNaiw/
https://mailarchive.ietf.org/arch/msg/tls-reg-review/9LWPzQfOpbc8dTT7vc9ahNeNaiw/

[3]: /messages/by-id/CA+TgmoaLpDVY2ywqQUfxvKEQZ+nwkabcw_f=i4Zyivt9CLjcmA@mail.gmail.com
/messages/by-id/CA+TgmoaLpDVY2ywqQUfxvKEQZ+nwkabcw_f=i4Zyivt9CLjcmA@mail.gmail.com

[4]: /messages/by-id/3a6f126c-e1aa-4dcc-9252-9868308f6cf0@iki.fi
/messages/by-id/3a6f126c-e1aa-4dcc-9252-9868308f6cf0@iki.fi

[5]: /messages/by-id/CA+TgmoaNkRerEmB9JPgW0FhcJAe337AA=5kp6je9KekQhhRbmA@mail.gmail.com
/messages/by-id/CA+TgmoaNkRerEmB9JPgW0FhcJAe337AA=5kp6je9KekQhhRbmA@mail.gmail.com

--
Heikki Linnakangas
Neon (https://neon.tech)

#2Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#1)
1 attachment(s)
re: Direct SSL connection and ALPN loose ends

Hi,

With TLS 1.3 and others there is possibly a security flaw using ALPN [1]terminate-tlsv1-3-handshake-if-alpn-is-missing <https://stackoverflow.com/questions/77271498/terminate-tlsv1-3-handshake-if-alpn-is-missing&gt;.

It seems to me that the ALPN protocol can be bypassed if the client does
not correctly inform the ClientHello header.

So, the suggestion is to check the ClientHello header in the server and
terminate the TLS handshake early.

Patch attached.

best regards,
Ranier Vilela

[1]: terminate-tlsv1-3-handshake-if-alpn-is-missing <https://stackoverflow.com/questions/77271498/terminate-tlsv1-3-handshake-if-alpn-is-missing&gt;
<https://stackoverflow.com/questions/77271498/terminate-tlsv1-3-handshake-if-alpn-is-missing&gt;

Attachments:

terminate-tls-handshake-if-no-alpn.patchapplication/octet-stream; name=terminate-tls-handshake-if-no-alpn.patchDownload
diff --git a/src/backend/libpq/be-secure-openssl.c b/src/backend/libpq/be-secure-openssl.c
index 60cf68aac4..be41d75de0 100644
--- a/src/backend/libpq/be-secure-openssl.c
+++ b/src/backend/libpq/be-secure-openssl.c
@@ -67,6 +67,7 @@ static int	ssl_external_passwd_cb(char *buf, int size, int rwflag, void *userdat
 static int	dummy_ssl_passwd_cb(char *buf, int size, int rwflag, void *userdata);
 static int	verify_cb(int ok, X509_STORE_CTX *ctx);
 static void info_cb(const SSL *ssl, int type, int args);
+static int client_hello_cb(SSL *ssl, int *al,	void *arg);
 static int	alpn_cb(SSL *ssl,
 					const unsigned char **out,
 					unsigned char *outlen,
@@ -443,6 +444,9 @@ be_tls_open_server(Port *port)
 	/* set up debugging/info callback */
 	SSL_CTX_set_info_callback(SSL_context, info_cb);
 
+	/* enable ClientHello */
+	SSL_CTX_set_client_hello_cb(SSL_context, client_hello_cb, NULL);
+
 	/* enable ALPN */
 	SSL_CTX_set_alpn_select_cb(SSL_context, alpn_cb, port);
 
@@ -1304,6 +1308,30 @@ info_cb(const SSL *ssl, int type, int args)
 /* See pqcomm.h comments on OpenSSL implementation of ALPN (RFC 7301) */
 static const unsigned char alpn_protos[] = PG_ALPN_PROTOCOL_VECTOR;
 
+/*
+ * Server callback for ClientHello validation.
+ */
+static int
+client_hello_cb(SSL *ssl,
+		int *al,
+		void *arg)
+{
+	const unsigned char * data;
+    size_t len;
+
+    if (SSL_client_hello_get0_ext(ssl, TLSEXT_TYPE_application_layer_protocol_negotiation, &data, &len) == 1)
+        return SSL_CLIENT_HELLO_SUCCESS;
+    else
+    {
+ 		/*
+		 * The client doesn't support ALPN negotiation protocol.  Reject the connection
+		 * with TLS "no_application_protocol" alert, per RFC 7301.
+		 */
+       *al = TLS1_AD_NO_APPLICATION_PROTOCOL;
+        return SSL_CLIENT_HELLO_ERROR;
+    }
+}
+
 /*
  * Server callback for ALPN negotiation. We use the standard "helper" function
  * even though currently we only accept one value.
#3Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Ranier Vilela (#2)
Re: Direct SSL connection and ALPN loose ends

On 29/04/2024 20:10, Ranier Vilela wrote:

Hi,

With TLS 1.3 and others there is possibly a security flaw using ALPN [1].

It seems to me that the ALPN protocol can be bypassed if the client does
not correctly inform the ClientHello header.

So, the suggestion is to check the ClientHello header in the server and
terminate the TLS handshake early.

Sounds to me like it's working as designed. ALPN in general is optional;
if the client doesn't request it, then you proceed without it. We do
require ALPN for direct SSL connections though. We can, because direct
SSL connections is a new feature in Postgres. But we cannot require it
for the connections negotiated with SSLRequest, or we break
compatibility with old clients that don't use ALPN.

There is a check in direct SSL mode that ALPN was used
(ProcessSSLStartup in backend_startup.c):

if (!port->alpn_used)
{
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("received direct SSL connection request without ALPN protocol negotiation extension")));
goto reject;
}

That happens immediately after the SSL connection has been established.

Hmm. I guess it would be better to abort the connection earlier, without
completing the TLS handshake. Otherwise the client might send the first
message in wrong protocol to the PostgreSQL server. That's not a
security issue for the PostgreSQL server: the server disconnects without
reading the message. And I don't see any way for an ALPACA attack when
the server ignores the client's message. Nevertheless, from the point of
view of keeping the attack surface as small as possible, aborting
earlier seems better.

--
Heikki Linnakangas
Neon (https://neon.tech)

#4Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#3)
Re: Direct SSL connection and ALPN loose ends

Em seg., 29 de abr. de 2024 às 14:56, Heikki Linnakangas <hlinnaka@iki.fi>
escreveu:

On 29/04/2024 20:10, Ranier Vilela wrote:

Hi,

With TLS 1.3 and others there is possibly a security flaw using ALPN [1].

It seems to me that the ALPN protocol can be bypassed if the client does
not correctly inform the ClientHello header.

So, the suggestion is to check the ClientHello header in the server and
terminate the TLS handshake early.

Sounds to me like it's working as designed. ALPN in general is optional;
if the client doesn't request it, then you proceed without it. We do
require ALPN for direct SSL connections though. We can, because direct
SSL connections is a new feature in Postgres. But we cannot require it
for the connections negotiated with SSLRequest, or we break
compatibility with old clients that don't use ALPN.

Ok.
But what if I have a server configured for TLS 1.3 and that requires ALPN
to allow access?
What about a client configured without ALPN requiring connection?

There is a check in direct SSL mode that ALPN was used
(ProcessSSLStartup in backend_startup.c):

if (!port->alpn_used)
{
ereport(COMMERROR,
(errcode(ERRCODE_PROTOCOL_VIOLATION),
errmsg("received direct SSL connection

request without ALPN protocol negotiation extension")));

goto reject;
}

That happens immediately after the SSL connection has been established.

Hmm. I guess it would be better to abort the connection earlier, without
completing the TLS handshake. Otherwise the client might send the first
message in wrong protocol to the PostgreSQL server. That's not a
security issue for the PostgreSQL server: the server disconnects without
reading the message. And I don't see any way for an ALPACA attack when
the server ignores the client's message. Nevertheless, from the point of
view of keeping the attack surface as small as possible, aborting
earlier seems better.

So the ClientHello callback is the correct way to determine the end.

best regards,
Ranier Vilela

#5Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Ranier Vilela (#4)
Re: Direct SSL connection and ALPN loose ends

On 29/04/2024 21:06, Ranier Vilela wrote:

Em seg., 29 de abr. de 2024 às 14:56, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 29/04/2024 20:10, Ranier Vilela wrote:

Hi,

With TLS 1.3 and others there is possibly a security flaw using

ALPN [1].

It seems to me that the ALPN protocol can be bypassed if the

client does

not correctly inform the ClientHello header.

So, the suggestion is to check the ClientHello header in the

server and

terminate the TLS handshake early.

Sounds to me like it's working as designed. ALPN in general is
optional;
if the client doesn't request it, then you proceed without it. We do
require ALPN for direct SSL connections though. We can, because direct
SSL connections is a new feature in Postgres. But we cannot require it
for the connections negotiated with SSLRequest, or we break
compatibility with old clients that don't use ALPN.

Ok.
But what if I have a server configured for TLS 1.3 and that requires
ALPN to allow access?
What about a client configured without ALPN requiring connection?

Sorry, I don't understand the questions. What about them?

--
Heikki Linnakangas
Neon (https://neon.tech)

#6Ranier Vilela
ranier.vf@gmail.com
In reply to: Heikki Linnakangas (#5)
Re: Direct SSL connection and ALPN loose ends

Em seg., 29 de abr. de 2024 às 15:36, Heikki Linnakangas <hlinnaka@iki.fi>
escreveu:

On 29/04/2024 21:06, Ranier Vilela wrote:

Em seg., 29 de abr. de 2024 às 14:56, Heikki Linnakangas
<hlinnaka@iki.fi <mailto:hlinnaka@iki.fi>> escreveu:

On 29/04/2024 20:10, Ranier Vilela wrote:

Hi,

With TLS 1.3 and others there is possibly a security flaw using

ALPN [1].

It seems to me that the ALPN protocol can be bypassed if the

client does

not correctly inform the ClientHello header.

So, the suggestion is to check the ClientHello header in the

server and

terminate the TLS handshake early.

Sounds to me like it's working as designed. ALPN in general is
optional;
if the client doesn't request it, then you proceed without it. We do
require ALPN for direct SSL connections though. We can, because

direct

SSL connections is a new feature in Postgres. But we cannot require

it

for the connections negotiated with SSLRequest, or we break
compatibility with old clients that don't use ALPN.

Ok.
But what if I have a server configured for TLS 1.3 and that requires
ALPN to allow access?
What about a client configured without ALPN requiring connection?

Sorry, I don't understand the questions. What about them?

Sorry, I'll try to be clearer.
The way it is designed, can we impose TLS 1.3 and ALPN to allow access to a
public server?

And if on the other side we have a client, configured without ALPN,
when requesting access, the server will refuse?

best regards,
Ranier Vilela

#7Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Heikki Linnakangas (#1)
Re: Direct SSL connection and ALPN loose ends

On Mon, Apr 29, 2024 at 8:24 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I was mostly worried about the refactoring of the
retry logic in libpq (and about the pre-existing logic too to be honest,
it was complicated before these changes already).

Some changes in the v17 negotiation fallback order caught my eye:

1. For sslmode=prefer, a modern v3 error during negotiation now
results in a fallback to plaintext. For v16 this resulted in an
immediate failure. (v2 errors retain the v16 behavior.)
2. For gssencmode=prefer, a legacy v2 error during negotiation now
results in an immediate failure. In v16 it allowed fallback to SSL or
plaintext depending on sslmode.

Are both these changes intentional/desirable? Change #1 seems to
partially undo the decision made in a49fbaaf:

Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server.

These days, such a response is far more likely to signify a server-side
problem, such as fork failure. [...]

Hence, it seems best to just eliminate the assumption that backing off
to non-SSL/2.0 protocol is the way to recover from an "E" response, and
instead treat the server error the same as we would in non-SSL cases.

Thanks,
--Jacob

#8Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Jacob Champion (#7)
Re: Direct SSL connection and ALPN loose ends

On 17/06/2024 17:11, Jacob Champion wrote:

On Mon, Apr 29, 2024 at 8:24 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I was mostly worried about the refactoring of the
retry logic in libpq (and about the pre-existing logic too to be honest,
it was complicated before these changes already).

Some changes in the v17 negotiation fallback order caught my eye:

1. For sslmode=prefer, a modern v3 error during negotiation now
results in a fallback to plaintext. For v16 this resulted in an
immediate failure. (v2 errors retain the v16 behavior.)
2. For gssencmode=prefer, a legacy v2 error during negotiation now
results in an immediate failure. In v16 it allowed fallback to SSL or
plaintext depending on sslmode.

Are both these changes intentional/desirable? Change #1 seems to
partially undo the decision made in a49fbaaf:

Don't assume that "E" response to NEGOTIATE_SSL_CODE means pre-7.0 server.

These days, such a response is far more likely to signify a server-side
problem, such as fork failure. [...]

Hence, it seems best to just eliminate the assumption that backing off
to non-SSL/2.0 protocol is the way to recover from an "E" response, and
instead treat the server error the same as we would in non-SSL cases.

They were not intentional. Let me think about the desirable part :-).

By "negotiation", which part of the protocol are we talking about
exactly? In the middle of the TLS handshake? After sending the startup
packet?

I think the behavior with v2 and v3 errors should be the same. And I
think an immediate failure is appropriate on any v2/v3 error during
negotiation, assuming we don't use those errors for things like "TLS not
supported", which would warrant a fallback.

--
Heikki Linnakangas
Neon (https://neon.tech)

#9Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Heikki Linnakangas (#8)
Re: Direct SSL connection and ALPN loose ends

On Mon, Jun 17, 2024 at 8:24 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

By "negotiation", which part of the protocol are we talking about
exactly? In the middle of the TLS handshake? After sending the startup
packet?

By "negotiation" I mean the server's response to the startup packet.
I.e. "supported"/"not supported"/"error".

I think the behavior with v2 and v3 errors should be the same. And I
think an immediate failure is appropriate on any v2/v3 error during
negotiation, assuming we don't use those errors for things like "TLS not
supported", which would warrant a fallback.

For GSS encryption, it was my vague understanding that older servers
respond with an error rather than the "not supported" indication. For
TLS, though, the decision in a49fbaaf (immediate failure) seemed
reasonable.

Thanks,
--Jacob

#10Andres Freund
andres@anarazel.de
In reply to: Heikki Linnakangas (#1)
Re: Direct SSL connection and ALPN loose ends

Hi,

On 2024-04-29 18:24:04 +0300, Heikki Linnakangas wrote:

All reported bugs have now been fixed. We now enforce ALPN in all the right
places. Please let me know if I missed something.

Very minor and not really your responsibility:

If provided with the necessary key information, wireshark can decode TLS
exchanges when using sslnegotiation=postgres but not with direct. Presumably
it needs to be taught postgres' ALPN id or something.

Example with direct:

476 6513.310308457 192.168.0.113 → 192.168.0.200 48978 5432 142 TLSv1.3 Finished
477 6513.310341492 192.168.0.113 → 192.168.0.200 48978 5432 151 TLSv1.3 Application Data
478 6513.320730295 192.168.0.200 → 192.168.0.113 5432 48978 147 TLSv1.3 New Session Ticket
479 6513.320745684 192.168.0.200 → 192.168.0.113 5432 48978 147 TLSv1.3 New Session Ticket
480 6513.321175713 192.168.0.113 → 192.168.0.200 48978 5432 68 TCP 48978 → 5432 [ACK] Seq=915 Ack=1665 Win=62848 Len=0 TSval=3779915421 TSecr=3469016093
481 6513.323161553 192.168.0.200 → 192.168.0.113 5432 48978 518 TLSv1.3 Application Data
482 6513.323626180 192.168.0.113 → 192.168.0.200 48978 5432 125 TLSv1.3 Application Data
483 6513.333977769 192.168.0.200 → 192.168.0.113 5432 48978 273 TLSv1.3 Application Data
484 6513.334581920 192.168.0.113 → 192.168.0.200 48978 5432 95 TLSv1.3 Application Data
485 6513.334666116 192.168.0.113 → 192.168.0.200 48978 5432 92 TLSv1.3 Alert (Level: Warning, Description: Close Notify)

Example with postgres:

502 6544.752799560 192.168.0.113 → 192.168.0.200 46300 5432 142 TLSv1.3 Finished
503 6544.752842863 192.168.0.113 → 192.168.0.200 46300 5432 151 PGSQL >?
504 6544.763152222 192.168.0.200 → 192.168.0.113 5432 46300 147 TLSv1.3 New Session Ticket
505 6544.763163155 192.168.0.200 → 192.168.0.113 5432 46300 147 TLSv1.3 New Session Ticket
506 6544.763587595 192.168.0.113 → 192.168.0.200 46300 5432 68 TCP 46300 → 5432 [ACK] Seq=923 Ack=1666 Win=62848 Len=0 TSval=3779946864 TSecr=3469047536
507 6544.765024827 192.168.0.200 → 192.168.0.113 5432 46300 518 PGSQL <R/S/S/S/S/S/S/S/S/S/S/S/S/S/S/K/Z
508 6544.766288155 192.168.0.113 → 192.168.0.200 46300 5432 125 PGSQL >Q
509 6544.776974164 192.168.0.200 → 192.168.0.113 5432 46300 273 PGSQL <T/D/D/D/D/D/D/D/D/D/D/C/Z
510 6544.777597927 192.168.0.113 → 192.168.0.200 46300 5432 95 PGSQL >X
511 6544.777631520 192.168.0.113 → 192.168.0.200 46300 5432 92 TLSv1.3 Alert (Level: Warning, Description: Close Notify)

Note that in the second one it knows what's inside the "Application Data"
messages and decodes them (S: startup, authentication ok, parameters, cancel key,
ready for query, C: simple query, S: description, 10 rows, command complete,
ready for query).

In the GUI you can obviously go into the "postgres messages" in more detail
than I know how to do on the console.

A second aspect is that I'm not super happy about the hack of stashing data
into Port. I think medium term we'd be better off separating out the
buffering for unencrypted and encrypted data properly. It turns out that not
having any buffering *below* openssl (i.e. the encrypted data) hurts both for
the send and receive side, due to a) increased number of syscalls b) too many
small packets being sent, as we use TCP_NODELAY c) kernel memory copies being
slower due to the small increments.

Greetings,

Andres Freund

#11Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#9)
Re: Direct SSL connection and ALPN loose ends

On Mon, Jun 17, 2024 at 9:23 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

I think the behavior with v2 and v3 errors should be the same. And I
think an immediate failure is appropriate on any v2/v3 error during
negotiation, assuming we don't use those errors for things like "TLS not
supported", which would warrant a fallback.

For GSS encryption, it was my vague understanding that older servers
respond with an error rather than the "not supported" indication. For
TLS, though, the decision in a49fbaaf (immediate failure) seemed
reasonable.

Would an open item for this be appropriate?

Thanks,
--Jacob

#12Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Jacob Champion (#11)
Re: Direct SSL connection and ALPN loose ends

On 20/06/2024 20:02, Jacob Champion wrote:

On Mon, Jun 17, 2024 at 9:23 AM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

I think the behavior with v2 and v3 errors should be the same. And I
think an immediate failure is appropriate on any v2/v3 error during
negotiation, assuming we don't use those errors for things like "TLS not
supported", which would warrant a fallback.

For GSS encryption, it was my vague understanding that older servers
respond with an error rather than the "not supported" indication. For
TLS, though, the decision in a49fbaaf (immediate failure) seemed
reasonable.

Would an open item for this be appropriate?

Added.

By "negotiation" I mean the server's response to the startup packet.
I.e. "supported"/"not supported"/"error".

Ok, I'm still a little confused, probably a terminology issue. The
server doesn't respond with "supported" or "not supported" to the
startup packet, that happens earlier. I think you mean the SSLRequst /
GSSRequest packet, which is sent *before* the startup packet?

I think the behavior with v2 and v3 errors should be the same. And I
think an immediate failure is appropriate on any v2/v3 error during
negotiation, assuming we don't use those errors for things like "TLS not
supported", which would warrant a fallback.

For GSS encryption, it was my vague understanding that older servers
respond with an error rather than the "not supported" indication. For
TLS, though, the decision in a49fbaaf (immediate failure) seemed
reasonable.

Hmm, right, GSS encryption was introduced in v12, and older versions
respond with an error to a GSSRequest.

We probably could make the same assumption for GSS as we did for TLS in
a49fbaaf, i.e. that an error means that something's wrong with the
server, rather than that it's just very old and doesn't support GSS. But
the case for that is a lot weaker case than with TLS. There are still
pre-v12 servers out there in the wild.

--
Heikki Linnakangas
Neon (https://neon.tech)

#13Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Heikki Linnakangas (#12)
Re: Direct SSL connection and ALPN loose ends

On Thu, Jun 20, 2024 at 4:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

By "negotiation" I mean the server's response to the startup packet.
I.e. "supported"/"not supported"/"error".

Ok, I'm still a little confused, probably a terminology issue. The
server doesn't respond with "supported" or "not supported" to the
startup packet, that happens earlier. I think you mean the SSLRequst /
GSSRequest packet, which is sent *before* the startup packet?

Yes, sorry. (I'm used to referring to those as startup packets too, ha.)

Hmm, right, GSS encryption was introduced in v12, and older versions
respond with an error to a GSSRequest.

We probably could make the same assumption for GSS as we did for TLS in
a49fbaaf, i.e. that an error means that something's wrong with the
server, rather than that it's just very old and doesn't support GSS. But
the case for that is a lot weaker case than with TLS. There are still
pre-v12 servers out there in the wild.

Right. Since we default to gssencmode=prefer, if you have Kerberos
creds in your environment, I think this could potentially break
existing software that connects to v11 servers once you upgrade libpq.

Thanks,
--Jacob

#14Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Jacob Champion (#13)
5 attachment(s)
Re: Direct SSL connection and ALPN loose ends

On 21/06/2024 02:32, Jacob Champion wrote:

On Thu, Jun 20, 2024 at 4:13 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

By "negotiation" I mean the server's response to the startup packet.
I.e. "supported"/"not supported"/"error".

Ok, I'm still a little confused, probably a terminology issue. The
server doesn't respond with "supported" or "not supported" to the
startup packet, that happens earlier. I think you mean the SSLRequst /
GSSRequest packet, which is sent *before* the startup packet?

Yes, sorry. (I'm used to referring to those as startup packets too, ha.)

Yeah I'm not sure what the right term would be.

Hmm, right, GSS encryption was introduced in v12, and older versions
respond with an error to a GSSRequest.

We probably could make the same assumption for GSS as we did for TLS in
a49fbaaf, i.e. that an error means that something's wrong with the
server, rather than that it's just very old and doesn't support GSS. But
the case for that is a lot weaker case than with TLS. There are still
pre-v12 servers out there in the wild.

Right. Since we default to gssencmode=prefer, if you have Kerberos
creds in your environment, I think this could potentially break
existing software that connects to v11 servers once you upgrade libpq.

When you connect to a V11 server and attempt to perform GSSAPI
authentication, it will respond with a V3 error that says: "unsupported
frontend protocol 1234.5680: server supports 2.0 to 3.0". That was a
surprise to me until I tested it just now. I thought that it would
respond with a protocol V2 error, but it is not so. The backend sets
FrontendProtocol to 1234.5680 before sending the error, and because it
is >= 3, the error is sent with protocol version 3.

Given that, I think it is a good thing to fail the connection completely
on receiving a V2 error.

Attached is a patch to fix the other issue, with falling back from SSL
to plaintext. And some tests and comment fixes I spotted while at it.

0001: A small comment fix
0002: This is the main patch that fixes the SSL fallback issue

0003: This adds fault injection tests to exercise these early error
codepaths. It is not ready to be merged, as it contains a hack to skip
locking. See thread at
/messages/by-id/e1ffb822-054e-4006-ac06-50532767f75b@iki.fi.

0004: More tests, for what happens if the server sends an error after
responding "yes" to the SSLRequest or GSSRequest, but before performing
the SSL/GSS handshake.

Attached is also a little stand-alone perl program that listens on a
socket, and when you connect to it, it immediately sends a V2 or V3
error, depending on the argument. That's useful for testing. It could be
used as an alternative strategy to the injection points I used in the
0003-0004 patches, but for now I just used it for manual testing.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

0001-Fix-outdated-comment-after-removal-of-direct-SSL-fal.patchtext/x-patch; charset=UTF-8; name=0001-Fix-outdated-comment-after-removal-of-direct-SSL-fal.patchDownload
From 4b988b1c74066fe8b5f98acb4ecd20150a47bc33 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 24 Jun 2024 20:41:41 +0300
Subject: [PATCH 1/4] Fix outdated comment after removal of direct SSL fallback

The option to fall back from direct SSL to negotiated SSL or a
plaintext connection was removed in commit fb5718f35f.
---
 src/interfaces/libpq/fe-connect.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 071b1b34aa1..e003279fb6c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3564,8 +3564,8 @@ keep_going:						/* We will come back to here until there is
 				if (pollres == PGRES_POLLING_FAILED)
 				{
 					/*
-					 * Failed direct ssl connection, possibly try a new
-					 * connection with postgres negotiation
+					 * SSL handshake failed.  We will retry with a plaintext
+					 * connection, if permitted by sslmode.
 					 */
 					CONNECTION_FAILED();
 				}
-- 
2.39.2

0002-Fix-fallback-behavior-when-server-sends-an-ERROR-ear.patchtext/x-patch; charset=UTF-8; name=0002-Fix-fallback-behavior-when-server-sends-an-ERROR-ear.patchDownload
From 7df581e902b76665de4c751ddbc1309143e6c0b8 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 24 Jun 2024 18:14:52 +0300
Subject: [PATCH 2/4] Fix fallback behavior when server sends an ERROR early at
 startup

With sslmode=prefer, the desired behavior is to completely fail the
connection attempt, *not* fall back to a plaintext connection, if the
server responds to the SSLRequest with an error ('E') response instead
of rejecting SSL with an 'N' response. This was broken in commit
05fd30c0e7.

Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
---
 src/interfaces/libpq/fe-connect.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e003279fb6c..360d9a45476 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3521,6 +3521,12 @@ keep_going:						/* We will come back to here until there is
 						 * byte here.
 						 */
 						conn->status = CONNECTION_AWAITING_RESPONSE;
+
+						/*
+						 * Don't fall back to a plaintext connection after
+						 * reading the error.
+						 */
+						conn->failed_enc_methods |= conn->allowed_enc_methods & (~conn->current_enc_method);
 						goto keep_going;
 					}
 					else
@@ -3612,6 +3618,13 @@ keep_going:						/* We will come back to here until there is
 						 * into AWAITING_RESPONSE state and let the code there
 						 * deal with it.  Note we have *not* consumed the "E"
 						 * byte here.
+						 *
+						 * Note that unlike on an error response to
+						 * SSLRequest, we allow falling back to SSL or
+						 * plaintext connection here.  GSS support was
+						 * introduced in PostgreSQL version 12, so an error
+						 * response might mean that we are connecting to a
+						 * pre-v12 server.
 						 */
 						conn->status = CONNECTION_AWAITING_RESPONSE;
 						goto keep_going;
@@ -3659,6 +3672,10 @@ keep_going:						/* We will come back to here until there is
 				}
 				else if (pollres == PGRES_POLLING_FAILED)
 				{
+					/*
+					 * GSS handshake failed.  We will retry with an SSL or
+					 * plaintext connection, if permitted by the options.
+					 */
 					CONNECTION_FAILED();
 				}
 				/* Else, return POLLING_READING or POLLING_WRITING status */
-- 
2.39.2

0003-WIP-Add-injection-points-and-test-for-early-backend-.patchtext/x-patch; charset=UTF-8; name=0003-WIP-Add-injection-points-and-test-for-early-backend-.patchDownload
From e893d2e53a9bde6ddf4462d9f8b206c51ebd40cb Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 24 Jun 2024 18:15:53 +0300
Subject: [PATCH 3/4] WIP: Add injection points and test for early backend
 startup errors

---
 src/backend/tcop/backend_startup.c            |  9 ++++
 src/backend/utils/misc/injection_point.c      | 33 +++++++++++-
 src/include/utils/injection_point.h           |  1 +
 src/interfaces/libpq/meson.build              |  1 +
 .../libpq/t/005_negotiate_encryption.pl       | 50 +++++++++++++++++++
 5 files changed, 92 insertions(+), 2 deletions(-)

diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index cfa27551964..7b4a3e23458 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -33,6 +33,7 @@
 #include "tcop/backend_startup.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
+#include "utils/injection_point.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/timeout.h"
@@ -160,6 +161,14 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 
 	whereToSendOutput = DestRemote; /* now safe to ereport to client */
 
+	/* For testing client error handling */
+	INJECTION_POINT("backend-initialize");
+	if (IsInjectionPointAttached("backend-initialize-v2-error"))
+	{
+		FrontendProtocol = PG_PROTOCOL(2,0);
+		elog(FATAL, "protocol version 2 error triggered");
+	}
+
 	/* set these to empty in case they are needed before we set them up */
 	port->remote_host = "";
 	port->remote_port = "";
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 5c2a0d2297e..d3995dfd6fc 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -24,6 +24,7 @@
 #include "port/pg_bitutils.h"
 #include "storage/fd.h"
 #include "storage/lwlock.h"
+#include "storage/proc.h"
 #include "storage/shmem.h"
 #include "utils/hsearch.h"
 #include "utils/injection_point.h"
@@ -281,6 +282,31 @@ InjectionPointDetach(const char *name)
 #endif
 }
 
+/*
+ * Test if an injection point is defined.
+ */
+bool
+IsInjectionPointAttached(const char *name)
+{
+#ifdef USE_INJECTION_POINTS
+	bool		found;
+
+	/* FIXME: just skip locking if we don't have a PGPROC entry yet. Obviously unsafe.. */
+	if (MyProc != NULL && IsUnderPostmaster)
+		LWLockAcquire(InjectionPointLock, LW_SHARED);
+
+	(void) hash_search(InjectionPointHash, name, HASH_FIND, &found);
+
+	if (MyProc != NULL && IsUnderPostmaster)
+		LWLockRelease(InjectionPointLock);
+
+	return found;
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+	return false;				/* silence compiler */
+#endif
+}
+
 /*
  * Execute an injection point, if defined.
  *
@@ -296,11 +322,14 @@ InjectionPointRun(const char *name)
 	InjectionPointCallback injection_callback;
 	const void *private_data;
 
-	LWLockAcquire(InjectionPointLock, LW_SHARED);
+	/* FIXME: just skip locking if we don't have a PGPROC entry yet. Obviously unsafe.. */
+	if (MyProc != NULL && IsUnderPostmaster)
+		LWLockAcquire(InjectionPointLock, LW_SHARED);
 	entry_by_name = (InjectionPointEntry *)
 		hash_search(InjectionPointHash, name,
 					HASH_FIND, &found);
-	LWLockRelease(InjectionPointLock);
+	if (MyProc != NULL && IsUnderPostmaster)
+		LWLockRelease(InjectionPointLock);
 
 	/*
 	 * If not found, do nothing and remove it from the local cache if it
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index a61d5d44391..5715eefb31e 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -35,6 +35,7 @@ extern void InjectionPointAttach(const char *name,
 								 const void *private_data,
 								 int private_data_size);
 extern void InjectionPointRun(const char *name);
+extern bool IsInjectionPointAttached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
 #endif							/* INJECTION_POINT_H */
diff --git a/src/interfaces/libpq/meson.build b/src/interfaces/libpq/meson.build
index ed2a4048d18..7623aeadab7 100644
--- a/src/interfaces/libpq/meson.build
+++ b/src/interfaces/libpq/meson.build
@@ -121,6 +121,7 @@ tests += {
       't/005_negotiate_encryption.pl',
     ],
     'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
       'with_ssl': ssl_library,
       'with_gssapi': gssapi.found() ? 'yes' : 'no',
       'with_krb_srvnam': 'postgres',
diff --git a/src/interfaces/libpq/t/005_negotiate_encryption.pl b/src/interfaces/libpq/t/005_negotiate_encryption.pl
index c3f70d31bc8..e6bea0f04f8 100644
--- a/src/interfaces/libpq/t/005_negotiate_encryption.pl
+++ b/src/interfaces/libpq/t/005_negotiate_encryption.pl
@@ -90,6 +90,8 @@ my $kerberos_enabled =
   $ENV{PG_TEST_EXTRA} && $ENV{PG_TEST_EXTRA} =~ /\bkerberos\b/;
 my $ssl_supported = $ENV{with_ssl} eq 'openssl';
 
+my $injection_points_supported = $ENV{enable_injection_points} eq 'yes';
+
 ###
 ### Prepare test server for GSSAPI and SSL authentication, with a few
 ### different test users and helper functions. We don't actually
@@ -155,6 +157,10 @@ $node->safe_psql('postgres', 'CREATE USER ssluser;');
 $node->safe_psql('postgres', 'CREATE USER nossluser;');
 $node->safe_psql('postgres', 'CREATE USER gssuser;');
 $node->safe_psql('postgres', 'CREATE USER nogssuser;');
+if ($injection_points_supported != 0)
+{
+	$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;')
+}
 
 my $unixdir = $node->safe_psql('postgres', 'SHOW unix_socket_directories;');
 chomp($unixdir);
@@ -320,6 +326,27 @@ nossluser   .            disable      postgres       connect, authok
 		\@all_sslmodes, \@all_sslnegotiations,
 		parse_table($test_table));
 
+	if ($injection_points_supported != 0)
+	{
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'backenderror -> fail');
+		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize-v2-error', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'v2error -> fail');
+		$node->restart;
+	}
+
 	# Disable SSL again
 	$node->adjust_conf('postgresql.conf', 'ssl', 'off');
 	$node->reload;
@@ -403,6 +430,27 @@ nogssuser   disable      disable      postgres       connect, authok
 	test_matrix($node, $server_config, [ 'testuser', 'gssuser', 'nogssuser' ],
 		\@all_gssencmodes, $sslmodes, $sslnegotiations,
 		parse_table($test_table));
+
+	if ($injection_points_supported != 0)
+	{
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'backenderror, backenderror -> fail');
+		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize-v2-error', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'v2error -> fail');
+		$node->restart;
+	}
 }
 
 ###
@@ -750,6 +798,8 @@ sub parse_log_events
 		push @events, "gssreject" if $line =~ /GSSENCRequest rejected/;
 		push @events, "authfail" if $line =~ /no pg_hba.conf entry/;
 		push @events, "authok" if $line =~ /connection authenticated/;
+		push @events, "backenderror" if $line =~ /error triggered for injection point backend-/;
+		push @events, "v2error" if $line =~ /protocol version 2 error triggered/;
 	}
 
 	# No events at all is represented by "-"
-- 
2.39.2

0004-WIP-Add-tests-for-errors-during-ssl-or-gssapi-handsh.patchtext/x-patch; charset=UTF-8; name=0004-WIP-Add-tests-for-errors-during-ssl-or-gssapi-handsh.patchDownload
From 572b3e8ba2ab33fa695cb557ab6d0b6ae0e0ae63 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Mon, 24 Jun 2024 23:13:28 +0300
Subject: [PATCH 4/4] WIP: Add tests for errors during ssl or gssapi handshake

---
 src/backend/libpq/be-secure-gssapi.c           |  3 +++
 src/backend/libpq/be-secure.c                  |  3 +++
 .../libpq/t/005_negotiate_encryption.pl        | 18 ++++++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index bc04e78abba..483636503c1 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -21,6 +21,7 @@
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "utils/injection_point.h"
 #include "utils/memutils.h"
 
 
@@ -499,6 +500,8 @@ secure_open_gssapi(Port *port)
 				minor;
 	gss_cred_id_t delegated_creds;
 
+	INJECTION_POINT("backend-gssapi-startup");
+
 	/*
 	 * Allocate subsidiary Port data for GSSAPI operations.
 	 */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 1663f36b6b8..ef20ea755b7 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -30,6 +30,7 @@
 #include "libpq/libpq.h"
 #include "miscadmin.h"
 #include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
 #include "utils/wait_event.h"
 
 char	   *ssl_library;
@@ -129,6 +130,8 @@ secure_open_server(Port *port)
 	}
 	Assert(pq_buffer_remaining_data() == 0);
 
+	INJECTION_POINT("backend-ssl-startup");
+
 	r = be_tls_open_server(port);
 
 	if (port->raw_buf_remaining > 0)
diff --git a/src/interfaces/libpq/t/005_negotiate_encryption.pl b/src/interfaces/libpq/t/005_negotiate_encryption.pl
index e6bea0f04f8..991badf28ec 100644
--- a/src/interfaces/libpq/t/005_negotiate_encryption.pl
+++ b/src/interfaces/libpq/t/005_negotiate_encryption.pl
@@ -345,6 +345,15 @@ nossluser   .            disable      postgres       connect, authok
 			"user=testuser sslmode=prefer",
 			'v2error -> fail');
 		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-ssl-startup', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'connect, sslaccept, backenderror, reconnect, authok -> plain');
+		$node->restart;
 	}
 
 	# Disable SSL again
@@ -450,6 +459,15 @@ nogssuser   disable      disable      postgres       connect, authok
 			"user=testuser gssencmode=prefer sslmode=disable",
 			'v2error -> fail');
 		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-gssapi-startup', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'connect, gssaccept, backenderror, reconnect, authok -> plain');
+		$node->restart;
 	}
 }
 
-- 
2.39.2

serve-pgsql-error.plapplication/x-perl; name=serve-pgsql-error.plDownload
#15Vladimir Sitnikov
sitnikov.vladimir@gmail.com
In reply to: Heikki Linnakangas (#14)
Re: Direct SSL connection and ALPN loose ends

I reviewed the documentation for "direct ALPN connections' ', and it looks
like it could be improved.
Here's the link:
https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-SSL

The currently suggested values for "sslnegotiations" are "direct" and
"postgres".
The project name is PostgreSQL and the ALPN name is postgresql. Is there a
reason why property value uses "postgres"?
Can the value be renamed to postgresql for consistency?

"SSL". Technically, the proper term is TLS, and even the document refers to
"IANA TLS ALPN Protocol IDs" (TLS, not SSL).
I would not die on that hill, however, going for tlsnegotiation would look
better than sslnegotiation.

Vladimir

#16Dave Cramer
davecramer@postgres.rocks
In reply to: Vladimir Sitnikov (#15)
Re: Direct SSL connection and ALPN loose ends

On Tue, 25 Jun 2024 at 09:37, Vladimir Sitnikov <sitnikov.vladimir@gmail.com>
wrote:

I reviewed the documentation for "direct ALPN connections' ', and it looks
like it could be improved.
Here's the link:
https://www.postgresql.org/docs/17/protocol-flow.html#PROTOCOL-FLOW-SSL

The currently suggested values for "sslnegotiations" are "direct" and
"postgres".
The project name is PostgreSQL and the ALPN name is postgresql. Is there a
reason why property value uses "postgres"?
Can the value be renamed to postgresql for consistency?

+1 I found it strange that we are not using postgresql

"SSL". Technically, the proper term is TLS, and even the document refers
to "IANA TLS ALPN Protocol IDs" (TLS, not SSL).
I would not die on that hill, however, going for tlsnegotiation would look
better than sslnegotiation.

+1 again, unusual to use SSL when this really is TLS.

Dave

#17Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Jacob Champion (#13)
Re: Direct SSL connection and ALPN loose ends

On Thu, Jun 20, 2024 at 4:32 PM Jacob Champion
<jacob.champion@enterprisedb.com> wrote:

Thanks,
--Jacob

Hey Heikki,

[sending this to the list in case it's not just me]

I cannot for the life of me get GMail to deliver your latest message,
even though I see it on postgresql.org. It's not in spam; it's just
gone. I wonder if it's possibly the Perl server script causing
virus-scanner issues?

--Jacob

#18Jacob Champion
jacob.champion@enterprisedb.com
In reply to: Dave Cramer (#16)
Re: Direct SSL connection and ALPN loose ends

On Tue, Jun 25, 2024 at 7:20 AM Dave Cramer <davecramer@postgres.rocks> wrote:

On Tue, 25 Jun 2024 at 09:37, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:

"SSL". Technically, the proper term is TLS, and even the document refers to "IANA TLS ALPN Protocol IDs" (TLS, not SSL).
I would not die on that hill, however, going for tlsnegotiation would look better than sslnegotiation.

+1 again, unusual to use SSL when this really is TLS.

This was sort of litigated last ye-(checks notes) oh no, three years ago:

/messages/by-id/CE12DD5C-4BB3-4166-BC9A-39779568734C@yesql.se

I'm your side when it comes to the use of the TLS acronym, personally,
but I think introducing a brand new option that interfaces with
sslmode and sslrootcert and etc. while not being named like them would
be outright unhelpful. And the idea of switching everything to use TLS
in docs seemed to be met with a solid "meh" on the other thread.

--Jacob

#19Michael Paquier
michael@paquier.xyz
In reply to: Heikki Linnakangas (#14)
Re: Direct SSL connection and ALPN loose ends

On Mon, Jun 24, 2024 at 11:30:53PM +0300, Heikki Linnakangas wrote:

Given that, I think it is a good thing to fail the connection completely on
receiving a V2 error.

Attached is a patch to fix the other issue, with falling back from SSL to
plaintext. And some tests and comment fixes I spotted while at it.

0001: A small comment fix

Already committed as of cc68ca6d420e.

0002: This is the main patch that fixes the SSL fallback issue

+ conn->failed_enc_methods |= conn->allowed_enc_methods &
(~conn->current_enc_method);

Sounds reasonable to me.

It's a bit annoying to have to guess that current_enc_method is
tracking only one method at a time (aka these three fields are not
documented in libpq-int.h), while allowed_enc_methods and
failed_enc_methods is a bitwise combination of the methods that are
still allowed or that have already failed.

0003: This adds fault injection tests to exercise these early error
codepaths. It is not ready to be merged, as it contains a hack to skip
locking. See thread at
/messages/by-id/e1ffb822-054e-4006-ac06-50532767f75b@iki.fi.

Locking when running an injection point has been replaced by some
atomics in 86db52a5062a.

+    if (IsInjectionPointAttached("backend-initialize-v2-error"))
+    {
+        FrontendProtocol = PG_PROTOCOL(2,0);
+        elog(FATAL, "protocol version 2 error triggered");
+    }

This is an attempt to do stack manipulation with an injection point
set. FrontendProtocol is a global variable, so you could have a new
callback setting up this global variable directly, then FATAL (I
really don't mind is modules/injection_points finishes with a library
of callbacks).

Not sure to like much this new IsInjectionPointAttached() that does a
search in the existing injection point pool, though. This leads to
more code footprint in the core backend, and I'm trying to minimize
that. Not everybody agrees with this view, I'd guess, which is also
fine.

0004: More tests, for what happens if the server sends an error after
responding "yes" to the SSLRequest or GSSRequest, but before performing the
SSL/GSS handshake.

No objections to these two additions.

Attached is also a little stand-alone perl program that listens on a socket,
and when you connect to it, it immediately sends a V2 or V3 error, depending
on the argument. That's useful for testing. It could be used as an
alternative strategy to the injection points I used in the 0003-0004
patches, but for now I just used it for manual testing.

Nice toy.
--
Michael

#20Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Michael Paquier (#19)
3 attachment(s)
Re: Direct SSL connection and ALPN loose ends

On 16/07/2024 09:54, Michael Paquier wrote:

On Mon, Jun 24, 2024 at 11:30:53PM +0300, Heikki Linnakangas wrote:

0002: This is the main patch that fixes the SSL fallback issue

+ conn->failed_enc_methods |= conn->allowed_enc_methods &
(~conn->current_enc_method);

Sounds reasonable to me.

It's a bit annoying to have to guess that current_enc_method is
tracking only one method at a time (aka these three fields are not
documented in libpq-int.h), while allowed_enc_methods and
failed_enc_methods is a bitwise combination of the methods that are
still allowed or that have already failed.

Yeah. In hindsight I'm still not very happy with the code structure with
"allowed_enc_methods" and "current_enc_methods" and all that. The
fallback logic is still complicated. It's better than in v16, IMHO, but
still not great. This patch seems like the best fix for v17, but I
wouldn't mind another round of refactoring for v18, if anyone's got some
good ideas on how to structure it better. All these new tests are a
great asset when refactoring this again.

+    if (IsInjectionPointAttached("backend-initialize-v2-error"))
+    {
+        FrontendProtocol = PG_PROTOCOL(2,0);
+        elog(FATAL, "protocol version 2 error triggered");
+    }

This is an attempt to do stack manipulation with an injection point
set. FrontendProtocol is a global variable, so you could have a new
callback setting up this global variable directly, then FATAL (I
really don't mind is modules/injection_points finishes with a library
of callbacks).

Not sure to like much this new IsInjectionPointAttached() that does a
search in the existing injection point pool, though. This leads to
more code footprint in the core backend, and I'm trying to minimize
that. Not everybody agrees with this view, I'd guess, which is also
fine.

Yeah, I'm also not too excited about the additional code in the backend,
but I'm also not excited about writing another test C module just for
this. I'm inclined to commit this as it is, but we can certainly revisit
this later, since it's just test code.

Here's a new rebased version with some minor cleanup. Notably, I added
docs for the new IS_INJECTION_POINT_ATTACHED() macro.

--
Heikki Linnakangas
Neon (https://neon.tech)

Attachments:

v2-0001-Fix-fallback-behavior-when-server-sends-an-ERROR-.patchtext/x-patch; charset=UTF-8; name=v2-0001-Fix-fallback-behavior-when-server-sends-an-ERROR-.patchDownload
From 651db616fcb19b89c94cacb30c27267ac7c17070 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 23 Jul 2024 20:03:27 +0300
Subject: [PATCH v2 1/3] Fix fallback behavior when server sends an ERROR early
 at startup

With sslmode=prefer, the desired behavior is to completely fail the
connection attempt, *not* fall back to a plaintext connection, if the
server responds to the SSLRequest with an error ('E') response instead
of rejecting SSL with an 'N' response. This was broken in commit
05fd30c0e7.

Reported-by: Jacob Champion
Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
---
 src/interfaces/libpq/fe-connect.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index e003279fb6..360d9a4547 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3521,6 +3521,12 @@ keep_going:						/* We will come back to here until there is
 						 * byte here.
 						 */
 						conn->status = CONNECTION_AWAITING_RESPONSE;
+
+						/*
+						 * Don't fall back to a plaintext connection after
+						 * reading the error.
+						 */
+						conn->failed_enc_methods |= conn->allowed_enc_methods & (~conn->current_enc_method);
 						goto keep_going;
 					}
 					else
@@ -3612,6 +3618,13 @@ keep_going:						/* We will come back to here until there is
 						 * into AWAITING_RESPONSE state and let the code there
 						 * deal with it.  Note we have *not* consumed the "E"
 						 * byte here.
+						 *
+						 * Note that unlike on an error response to
+						 * SSLRequest, we allow falling back to SSL or
+						 * plaintext connection here.  GSS support was
+						 * introduced in PostgreSQL version 12, so an error
+						 * response might mean that we are connecting to a
+						 * pre-v12 server.
 						 */
 						conn->status = CONNECTION_AWAITING_RESPONSE;
 						goto keep_going;
@@ -3659,6 +3672,10 @@ keep_going:						/* We will come back to here until there is
 				}
 				else if (pollres == PGRES_POLLING_FAILED)
 				{
+					/*
+					 * GSS handshake failed.  We will retry with an SSL or
+					 * plaintext connection, if permitted by the options.
+					 */
 					CONNECTION_FAILED();
 				}
 				/* Else, return POLLING_READING or POLLING_WRITING status */
-- 
2.39.2

v2-0002-Add-test-for-early-backend-startup-errors.patchtext/x-patch; charset=UTF-8; name=v2-0002-Add-test-for-early-backend-startup-errors.patchDownload
From 649e2c89f62484940c55b02635e393bdd85654af Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 23 Jul 2024 20:20:37 +0300
Subject: [PATCH v2 2/3] Add test for early backend startup errors

The new test tests the libpq fallback behavior on an early error.

This adds an IS_INJECTION_POINT_ATTACHED() macro, to allow writing the
test code alongside the normal source code. In principle, the new test
could've been written by an extra test module with a callback that
sets the FrontendProtocol global variable, but I think it's more
straightforward to have the test code

Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com
---
 doc/src/sgml/xfunc.sgml                       | 25 ++++++++++
 src/backend/tcop/backend_startup.c            | 16 ++++++
 src/backend/utils/misc/injection_point.c      | 15 ++++++
 src/include/utils/injection_point.h           |  3 ++
 src/interfaces/libpq/Makefile                 |  4 +-
 src/interfaces/libpq/meson.build              |  1 +
 .../libpq/t/005_negotiate_encryption.pl       | 50 +++++++++++++++++++
 7 files changed, 113 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/xfunc.sgml b/doc/src/sgml/xfunc.sgml
index 7e92e89846..5b584a4f14 100644
--- a/doc/src/sgml/xfunc.sgml
+++ b/doc/src/sgml/xfunc.sgml
@@ -3672,6 +3672,31 @@ custom_injection_callback(const char *name, const void *private_data)
      logic.
     </para>
 
+    <para>
+     An alternative way to define the action to take when an injection point
+     is reached is to add the testing code alongside the normal source
+     code. This can be useful if the action e.g. depends on local variables
+     that are not accessible to loaded modules. The
+     <function>IS_INJECTION_POINT_ATTACHED</function> macro can then be used
+     to check if an injection point is attached, for example:
+<programlisting>
+#ifdef USE_INJECTION_POINTS
+if (IS_INJECTION_POINT_ATTACHED("before-foobar"))
+{
+    /* change a local variable if injection point is attached */
+    local_var = 123;
+
+    /* also execute the callback */
+    INJECTION_POINT_CACHED("before-foobar");
+}
+#endif
+</programlisting>
+     Note that the callback attached to the injection point will not be
+     executed by the <function>IS_INJECTION_POINT_ATTACHED</function>
+     macro. If you want to execute the callback, you must also call
+     <function>INJECTION_POINT_CACHED</function> like in the above example.
+    </para>
+
     <para>
      Optionally, it is possible to detach an injection point by calling:
 <programlisting>
diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index a2f94b1050..b840d95e4d 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -33,6 +33,7 @@
 #include "tcop/backend_startup.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
+#include "utils/injection_point.h"
 #include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/timeout.h"
@@ -213,6 +214,21 @@ BackendInitialize(ClientSocket *client_sock, CAC_state cac)
 							remote_host)));
 	}
 
+	/* For testing client error handling */
+#ifdef USE_INJECTION_POINTS
+	INJECTION_POINT("backend-initialize");
+	if (IS_INJECTION_POINT_ATTACHED("backend-initialize-v2-error"))
+	{
+		/*
+		 * This simulates an early error from a pre-v14 server, which used the
+		 * version 2 protocol for any errors that occurred before processing
+		 * the startup packet.
+		 */
+		FrontendProtocol = PG_PROTOCOL(2, 0);
+		elog(FATAL, "protocol version 2 error triggered");
+	}
+#endif
+
 	/*
 	 * If we did a reverse lookup to name, we might as well save the results
 	 * rather than possibly repeating the lookup during authentication.
diff --git a/src/backend/utils/misc/injection_point.c b/src/backend/utils/misc/injection_point.c
index 8ad0c27bc8..ec23f1f4d5 100644
--- a/src/backend/utils/misc/injection_point.c
+++ b/src/backend/utils/misc/injection_point.c
@@ -23,6 +23,7 @@
 #include "miscadmin.h"
 #include "storage/fd.h"
 #include "storage/lwlock.h"
+#include "storage/proc.h"
 #include "storage/shmem.h"
 #include "utils/hsearch.h"
 #include "utils/injection_point.h"
@@ -570,3 +571,17 @@ InjectionPointCached(const char *name)
 	elog(ERROR, "Injection points are not supported by this build");
 #endif
 }
+
+/*
+ * Test if an injection point is defined.
+ */
+bool
+IsInjectionPointAttached(const char *name)
+{
+#ifdef USE_INJECTION_POINTS
+	return InjectionPointCacheRefresh(name) != NULL;
+#else
+	elog(ERROR, "Injection points are not supported by this build");
+	return false;				/* silence compiler */
+#endif
+}
diff --git a/src/include/utils/injection_point.h b/src/include/utils/injection_point.h
index a385e3df64..6be20b4857 100644
--- a/src/include/utils/injection_point.h
+++ b/src/include/utils/injection_point.h
@@ -18,10 +18,12 @@
 #define INJECTION_POINT_LOAD(name) InjectionPointLoad(name)
 #define INJECTION_POINT(name) InjectionPointRun(name)
 #define INJECTION_POINT_CACHED(name) InjectionPointCached(name)
+#define IS_INJECTION_POINT_ATTACHED(name) IsInjectionPointAttached(name)
 #else
 #define INJECTION_POINT_LOAD(name) ((void) name)
 #define INJECTION_POINT(name) ((void) name)
 #define INJECTION_POINT_CACHED(name) ((void) name)
+#define IS_INJECTION_POINT_ATTACHED(name) (false)
 #endif
 
 /*
@@ -41,6 +43,7 @@ extern void InjectionPointAttach(const char *name,
 extern void InjectionPointLoad(const char *name);
 extern void InjectionPointRun(const char *name);
 extern void InjectionPointCached(const char *name);
+extern bool IsInjectionPointAttached(const char *name);
 extern bool InjectionPointDetach(const char *name);
 
 #endif							/* INJECTION_POINT_H */
diff --git a/src/interfaces/libpq/Makefile b/src/interfaces/libpq/Makefile
index b36a765764..27f8499d8a 100644
--- a/src/interfaces/libpq/Makefile
+++ b/src/interfaces/libpq/Makefile
@@ -9,11 +9,13 @@
 #
 #-------------------------------------------------------------------------
 
+EXTRA_INSTALL=src/test/modules/injection_points
+
 subdir = src/interfaces/libpq
 top_builddir = ../../..
 include $(top_builddir)/src/Makefile.global
 
-export with_ssl with_gssapi with_krb_srvnam
+export with_ssl with_gssapi with_krb_srvnam enable_injection_points
 
 PGFILEDESC = "PostgreSQL Access Library"
 
diff --git a/src/interfaces/libpq/meson.build b/src/interfaces/libpq/meson.build
index ed2a4048d1..7623aeadab 100644
--- a/src/interfaces/libpq/meson.build
+++ b/src/interfaces/libpq/meson.build
@@ -121,6 +121,7 @@ tests += {
       't/005_negotiate_encryption.pl',
     ],
     'env': {
+      'enable_injection_points': get_option('injection_points') ? 'yes' : 'no',
       'with_ssl': ssl_library,
       'with_gssapi': gssapi.found() ? 'yes' : 'no',
       'with_krb_srvnam': 'postgres',
diff --git a/src/interfaces/libpq/t/005_negotiate_encryption.pl b/src/interfaces/libpq/t/005_negotiate_encryption.pl
index 251c405926..e21c883ab4 100644
--- a/src/interfaces/libpq/t/005_negotiate_encryption.pl
+++ b/src/interfaces/libpq/t/005_negotiate_encryption.pl
@@ -90,6 +90,8 @@ my $kerberos_enabled =
   $ENV{PG_TEST_EXTRA} && $ENV{PG_TEST_EXTRA} =~ /\bkerberos\b/;
 my $ssl_supported = $ENV{with_ssl} eq 'openssl';
 
+my $injection_points_supported = $ENV{enable_injection_points} eq 'yes';
+
 ###
 ### Prepare test server for GSSAPI and SSL authentication, with a few
 ### different test users and helper functions. We don't actually
@@ -155,6 +157,10 @@ $node->safe_psql('postgres', 'CREATE USER ssluser;');
 $node->safe_psql('postgres', 'CREATE USER nossluser;');
 $node->safe_psql('postgres', 'CREATE USER gssuser;');
 $node->safe_psql('postgres', 'CREATE USER nogssuser;');
+if ($injection_points_supported != 0)
+{
+	$node->safe_psql('postgres', 'CREATE EXTENSION injection_points;')
+}
 
 my $unixdir = $node->safe_psql('postgres', 'SHOW unix_socket_directories;');
 chomp($unixdir);
@@ -312,6 +318,27 @@ nossluser   .            disable      postgres       connect, authok
 		['disable'], \@all_sslmodes, \@all_sslnegotiations,
 		parse_table($test_table));
 
+	if ($injection_points_supported != 0)
+	{
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'connect, backenderror -> fail');
+		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize-v2-error', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'connect, v2error -> fail');
+		$node->restart;
+	}
+
 	# Disable SSL again
 	$node->adjust_conf('postgresql.conf', 'ssl', 'off');
 	$node->reload;
@@ -393,6 +420,27 @@ nogssuser   disable      disable      postgres       connect, authok
 	test_matrix($node, [ 'testuser', 'gssuser', 'nogssuser' ],
 		\@all_gssencmodes, $sslmodes, $sslnegotiations,
 		parse_table($test_table));
+
+	if ($injection_points_supported != 0)
+	{
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'backenderror, backenderror -> fail');
+		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-initialize-v2-error', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'v2error -> fail');
+		$node->restart;
+	}
 }
 
 ###
@@ -738,6 +786,8 @@ sub parse_log_events
 		push @events, "gssreject" if $line =~ /GSSENCRequest rejected/;
 		push @events, "authfail" if $line =~ /no pg_hba.conf entry/;
 		push @events, "authok" if $line =~ /connection authenticated/;
+		push @events, "backenderror" if $line =~ /error triggered for injection point backend-/;
+		push @events, "v2error" if $line =~ /protocol version 2 error triggered/;
 	}
 
 	# No events at all is represented by "-"
-- 
2.39.2

v2-0003-Add-tests-for-errors-during-SSL-or-GSSAPI-handsha.patchtext/x-patch; charset=UTF-8; name=v2-0003-Add-tests-for-errors-during-SSL-or-GSSAPI-handsha.patchDownload
From 87d8fbd5e65a7298eeb3335280e15c6da2d2a4bc Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 23 Jul 2024 20:21:54 +0300
Subject: [PATCH v2 3/3] Add tests for errors during SSL or GSSAPI handshake

These test that libpq correctly falls back to a plaintext connection
on handshake error, in the "prefer" modes.
---
 src/backend/libpq/be-secure-gssapi.c           |  3 +++
 src/backend/libpq/be-secure.c                  |  3 +++
 .../libpq/t/005_negotiate_encryption.pl        | 18 ++++++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
index bc04e78abb..483636503c 100644
--- a/src/backend/libpq/be-secure-gssapi.c
+++ b/src/backend/libpq/be-secure-gssapi.c
@@ -21,6 +21,7 @@
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "utils/injection_point.h"
 #include "utils/memutils.h"
 
 
@@ -499,6 +500,8 @@ secure_open_gssapi(Port *port)
 				minor;
 	gss_cred_id_t delegated_creds;
 
+	INJECTION_POINT("backend-gssapi-startup");
+
 	/*
 	 * Allocate subsidiary Port data for GSSAPI operations.
 	 */
diff --git a/src/backend/libpq/be-secure.c b/src/backend/libpq/be-secure.c
index 1663f36b6b..ef20ea755b 100644
--- a/src/backend/libpq/be-secure.c
+++ b/src/backend/libpq/be-secure.c
@@ -30,6 +30,7 @@
 #include "libpq/libpq.h"
 #include "miscadmin.h"
 #include "tcop/tcopprot.h"
+#include "utils/injection_point.h"
 #include "utils/wait_event.h"
 
 char	   *ssl_library;
@@ -129,6 +130,8 @@ secure_open_server(Port *port)
 	}
 	Assert(pq_buffer_remaining_data() == 0);
 
+	INJECTION_POINT("backend-ssl-startup");
+
 	r = be_tls_open_server(port);
 
 	if (port->raw_buf_remaining > 0)
diff --git a/src/interfaces/libpq/t/005_negotiate_encryption.pl b/src/interfaces/libpq/t/005_negotiate_encryption.pl
index e21c883ab4..eadec9145f 100644
--- a/src/interfaces/libpq/t/005_negotiate_encryption.pl
+++ b/src/interfaces/libpq/t/005_negotiate_encryption.pl
@@ -337,6 +337,15 @@ nossluser   .            disable      postgres       connect, authok
 			"user=testuser sslmode=prefer",
 			'connect, v2error -> fail');
 		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-ssl-startup', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser sslmode=prefer",
+			'connect, sslaccept, backenderror, reconnect, authok -> plain');
+		$node->restart;
 	}
 
 	# Disable SSL again
@@ -440,6 +449,15 @@ nogssuser   disable      disable      postgres       connect, authok
 			"user=testuser gssencmode=prefer sslmode=disable",
 			'v2error -> fail');
 		$node->restart;
+
+		$node->safe_psql('postgres',
+						 "SELECT injection_points_attach('backend-gssapi-startup', 'error');",
+						 connstr => "user=localuser host=$unixdir");
+		connect_test(
+			$node,
+			"user=testuser gssencmode=prefer sslmode=disable",
+			'connect, gssaccept, backenderror, reconnect, authok -> plain');
+		$node->restart;
 	}
 }
 
-- 
2.39.2

#21Michael Paquier
michael@paquier.xyz
In reply to: Heikki Linnakangas (#20)
Re: Direct SSL connection and ALPN loose ends

On Tue, Jul 23, 2024 at 08:32:29PM +0300, Heikki Linnakangas wrote:

All these new tests are a great asset when refactoring this again.

Thanks for doing that. The coverage, especially with v2, is going to
be really useful.

Yeah, I'm also not too excited about the additional code in the backend, but
I'm also not excited about writing another test C module just for this. I'm
inclined to commit this as it is, but we can certainly revisit this later,
since it's just test code.

The point would be to rely on the existing injection_points module,
with a new callback in it. The callbacks could be on a file of their
own in the module, for clarity. What you have is OK for me anyway, it
is good to add more options to developers in this area and this gets
used in core. That's also enough to manipulate the stack in or even
out of core.

Here's a new rebased version with some minor cleanup. Notably, I added docs
for the new IS_INJECTION_POINT_ATTACHED() macro.

0001 looks OK.

+       push @events, "backenderror" if $line =~ /error triggered for
injection point backend-/;
+       push @events, "v2error" if $line =~ /protocol version 2 error
triggered/;

Perhaps append an "injection_" for these two keywords?

+#include "storage/proc.h"

This inclusion in injection_point.c should not be needed.

sets the FrontendProtocol global variable, but I think it's more
straightforward to have the test code

The last sentence in the commit message of 0002 seems to be
unfinished.

Could you run a perltidy on 005_negotiate_encryption.pl? There are a
bunch of new diffs in it.
--
Michael

#22Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Michael Paquier (#21)
Re: Direct SSL connection and ALPN loose ends

On 24/07/2024 02:37, Michael Paquier wrote:

On Tue, Jul 23, 2024 at 08:32:29PM +0300, Heikki Linnakangas wrote:

All these new tests are a great asset when refactoring this again.

Thanks for doing that. The coverage, especially with v2, is going to
be really useful.

Yeah, I'm also not too excited about the additional code in the backend, but
I'm also not excited about writing another test C module just for this. I'm
inclined to commit this as it is, but we can certainly revisit this later,
since it's just test code.

The point would be to rely on the existing injection_points module,
with a new callback in it. The callbacks could be on a file of their
own in the module, for clarity.

Hmm, do we want injection_points module to be a dumping ground for
callbacks that are only useful for very specific injection points, in
specific tests? I view it as a more general purpose module, containing
callbacks that are useful for many different tests. Don't get me wrong,
I'm not necessarily against it, and it would be expedient, that's just
not how I see the purpose of injection_points.

What you have is OK for me anyway, it
is good to add more options to developers in this area and this gets
used in core. That's also enough to manipulate the stack in or even
out of core.

Ok, I kept it that way.

Here's a new rebased version with some minor cleanup. Notably, I added docs
for the new IS_INJECTION_POINT_ATTACHED() macro.

0001 looks OK.

+       push @events, "backenderror" if $line =~ /error triggered for
injection point backend-/;
+       push @events, "v2error" if $line =~ /protocol version 2 error
triggered/;

Perhaps append an "injection_" for these two keywords?

+#include "storage/proc.h"

This inclusion in injection_point.c should not be needed.

sets the FrontendProtocol global variable, but I think it's more
straightforward to have the test code

The last sentence in the commit message of 0002 seems to be
unfinished.

Fixed.

Could you run a perltidy on 005_negotiate_encryption.pl? There are a
bunch of new diffs in it.

Fixed.

Committed, thanks for the review, and thanks Jacob for the testing!

--
Heikki Linnakangas
Neon (https://neon.tech)

#23Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Andres Freund (#10)
Re: Direct SSL connection and ALPN loose ends

On 17/06/2024 21:33, Andres Freund wrote:

If provided with the necessary key information, wireshark can decode TLS
exchanges when using sslnegotiation=postgres but not with direct. Presumably
it needs to be taught postgres' ALPN id or something.

I opened https://gitlab.com/wireshark/wireshark/-/merge_requests/16612
to fix that in the wireshark pgsql protocol dissector.

--
Heikki Linnakangas
Neon (https://neon.tech)