libpq: Fix wrong connection status on invalid "connect_timeout"

Started by Lars Kanisabout 6 years ago7 messages
#1Lars Kanis
lars@greiz-reinsdorf.de
1 attachment(s)

Greetings,

libpq since PostgreSQL-12 has stricter checks for integer values in
connection parameters. They were introduced by commit
https://github.com/postgres/postgres/commit/e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb
.

However in case of "connect_timeout" such an invalid integer value leads
to a connection status other than CONNECTION_OK or CONNECTION_BAD. The
wrong parameter is therefore not properly reported to user space. This
patch fixes this by explicit setting CONNECTION_BAD.

The issue was raised on ruby-pg: https://github.com/ged/ruby-pg/issues/302

It originally came up at Heroku:
https://github.com/heroku/stack-images/issues/147

--

Kind Regards,
Lars Kanis

Attachments:

0001-Fix-wrong-connection-status-on-invalid-connect_timeo.patchtext/x-patch; name=0001-Fix-wrong-connection-status-on-invalid-connect_timeo.patchDownload
From f0c0dc3511c66a1e0fe0e4cc7ad2995f5f40bd7c Mon Sep 17 00:00:00 2001
From: Lars Kanis <lars@greiz-reinsdorf.de>
Date: Thu, 17 Oct 2019 19:37:51 +0200
Subject: [PATCH] Fix wrong connection status on invalid "connect_timeout"

Commit e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb introduced
stricter checks for integer values in connection parameters.
However in case of "connect_timeout" such invalid integer
values leaded to a connection status other than CONNECTION_OK
or CONNECTION_BAD. This patch explicit sets CONNECTION_BAD.

Signed-off-by: Lars Kanis <lars@greiz-reinsdorf.de>
---
 src/interfaces/libpq/fe-connect.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..44d927e 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2008,7 +2008,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, &timeout, conn,
 							 "connect_timeout"))
+		{
+			/* invalid integer */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
-- 
2.17.1

#2Lars Kanis
lars@greiz-reinsdorf.de
In reply to: Lars Kanis (#1)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

I verified that all other integer parameters properly set CONNECTION_BAD
in case of invalid values. These are:

* port
* keepalives_idle
* keepalives_interval
* keepalives_count
* tcp_user_timeout

That's why I changed connectDBComplete() only, instead of setting the
status directly in parse_int_param().

--

Kind Regards,
Lars Kanis

Am 17.10.19 um 20:04 schrieb Lars Kanis:

Greetings,

libpq since PostgreSQL-12 has stricter checks for integer values in
connection parameters. They were introduced by commit
https://github.com/postgres/postgres/commit/e7a2217978d9cbb2149bfcb4ef1e45716cfcbefb
.

However in case of "connect_timeout" such an invalid integer value leads
to a connection status other than CONNECTION_OK or CONNECTION_BAD. The
wrong parameter is therefore not properly reported to user space. This
patch fixes this by explicit setting CONNECTION_BAD.

The issue was raised on ruby-pg: https://github.com/ged/ruby-pg/issues/302

It originally came up at Heroku:
https://github.com/heroku/stack-images/issues/147

--
--
Kind Regards,
Lars Kanis

#3Michael Paquier
michael@paquier.xyz
In reply to: Lars Kanis (#2)
1 attachment(s)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

On Thu, Oct 17, 2019 at 10:10:17PM +0200, Lars Kanis wrote:

That's why I changed connectDBComplete() only, instead of setting the
status directly in parse_int_param().

Yes, you shouldn't do that as the keepalive parameters and
tcp_user_timeout have some specific handling when it comes to defaults
depending on the platform and we have some retry logic when specifying
multiple hosts.

Now, there is actually more to it than it looks at first glance. Your
patch is pointing out at a failure within the regression tests of the
ECPG driver, as any parameters part of a connection string may have
trailing spaces which are considered as incorrect by the patch,
causing the connection to fail.

In short, on HEAD this succeeds but would fail with your patch:
$ psql 'postgresql:///postgres?host=/tmp&connect_timeout=14 &port=5432'
psql: error: could not connect to server: invalid integer value "14 "
for connection option "connect_timeout"

Parameter names are more restrictive, as URLs don't allow leading or
trailing spaces for them. On HEAD, we allow leading spaces for
integer parameters as the parsing uses strtol(3), but not for the
trailing spaces, which is a bit crazy and I think that we had better
not break that if the parameter value correctly defines a proper
integer. So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG. I have refactored the parsing
logic a bit while on it. The comment at the top of parse_int_param()
needs to be reworked a bit more.

We could add some TAP tests for that, but I don't see a good area to
check after connection parameters. We have tests for multi-host
strings in 001_stream_rep.pl but that already feels misplaced as those
tests are for recovery. Perhaps we could add directly regression
tests for libpq. I'll start a new thread about that once we are done
here, the topic is larger.

(Note to self: Ed Morley needs to be credited for the report as well.)
--
Michael

Attachments:

libpq-parse-fix.patchtext/x-diff; charset=us-asciiDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2efe..9af830321c 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, &end, 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end && *end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end && *end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(&conn->errorMessage,
 					  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 					  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, &timeout, conn,
 							 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
#4Lars Kanis
lars@greiz-reinsdorf.de
In reply to: Michael Paquier (#3)
1 attachment(s)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

Am 18.10.19 um 05:06 schrieb Michael Paquier:

So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG. I have refactored the parsing
logic a bit while on it. The comment at the top of parse_int_param()
needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?

Perhaps we could add directly regression
tests for libpq. I'll start a new thread about that once we are done
here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.

--
Kind Regards,
Lars Kanis

Attachments:

libpq-parse-fix-v2.patchtext/x-patch; name=libpq-parse-fix-v2.patchDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, &end, 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(&conn->errorMessage,
 					  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 					  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, &timeout, conn,
 							 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
#5Lars Kanis
lars@greiz-reinsdorf.de
In reply to: Michael Paquier (#3)
1 attachment(s)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

Am 18.10.19 um 05:06 schrieb Michael Paquier:

So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG. I have refactored the parsing
logic a bit while on it. The comment at the top of parse_int_param()
needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?

Perhaps we could add directly regression
tests for libpq. I'll start a new thread about that once we are done
here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.

--
Kind Regards,
Lars Kanis

Attachments:

libpq-parse-fix-v2.patchtext/x-patch; name=libpq-parse-fix-v2.patchDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, &end, 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(&conn->errorMessage,
 					  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 					  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, &timeout, conn,
 							 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
#6Lars Kanis
lars@greiz-reinsdorf.de
In reply to: Michael Paquier (#3)
1 attachment(s)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

Am 18.10.19 um 05:06 schrieb Michael Paquier:

So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG. I have refactored the parsing
logic a bit while on it. The comment at the top of parse_int_param()
needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?

Perhaps we could add directly regression
tests for libpq. I'll start a new thread about that once we are done
here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.

--
Kind Regards,
Lars Kanis

Attachments:

libpq-parse-fix-v2.patchtext/x-patch; name=libpq-parse-fix-v2.patchDownload
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f91f0f2..0eeabb8 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1687,7 +1687,7 @@ useKeepalives(PGconn *conn)
 /*
  * Parse and try to interpret "value" as an integer value, and if successful,
  * store it in *result, complaining if there is any trailing garbage or an
- * overflow.
+ * overflow.  This allows any number of leading and trailing whitespaces.
  */
 static bool
 parse_int_param(const char *value, int *result, PGconn *conn,
@@ -1698,14 +1698,31 @@ parse_int_param(const char *value, int *result, PGconn *conn,
 
 	*result = 0;
 
+	/* strtol(3) skips leading whitespaces */
 	errno = 0;
 	numval = strtol(value, &end, 10);
-	if (errno == 0 && *end == '\0' && numval == (int) numval)
-	{
-		*result = numval;
-		return true;
-	}
 
+	/*
+	 * If no progress was done during the parsing or an error happened, fail.
+	 * This tests properly for overflows of the result.
+	 */
+	if (value == end || errno != 0 || numval != (int) numval)
+		goto error;
+
+	/*
+	 * Skip any trailing whitespace; if anything but whitespace remains before
+	 * the terminating character, fail
+	 */
+	while (*end != '\0' && isspace((unsigned char) *end))
+		end++;
+
+	if (*end != '\0')
+		goto error;
+
+	*result = numval;
+	return true;
+
+error:
 	appendPQExpBuffer(&conn->errorMessage,
 					  libpq_gettext("invalid integer value \"%s\" for connection option \"%s\"\n"),
 					  value, context);
@@ -2008,7 +2025,11 @@ connectDBComplete(PGconn *conn)
 	{
 		if (!parse_int_param(conn->connect_timeout, &timeout, conn,
 							 "connect_timeout"))
+		{
+			/* mark the connection as bad to report the parsing failure */
+			conn->status = CONNECTION_BAD;
 			return 0;
+		}
 
 		if (timeout > 0)
 		{
#7Michael Paquier
michael@paquier.xyz
In reply to: Lars Kanis (#5)
Re: libpq: Fix wrong connection status on invalid "connect_timeout"

On Fri, Oct 18, 2019 at 02:01:23PM +0200, Lars Kanis wrote:

Am 18.10.19 um 05:06 schrieb Michael Paquier:

So attached is a patch to skip trailing whitespaces as well,
which also fixes the issue with ECPG. I have refactored the parsing
logic a bit while on it. The comment at the top of parse_int_param()
needs to be reworked a bit more.

I tested this and it looks good to me. Maybe you could omit some
redundant 'end' checks, as in the attached patch. Or was your intention
to verify non-NULL 'end'?

Yes. Here are the connection patterns I have tested. These now pass:
'postgresql:///postgres?host=/tmp&port=5432 &user=postgres'
'postgresql:///postgres?host=/tmp&port= 5432&user=postgres'
And these fail (overflow on third one):
'postgresql:///postgres?host=/tmp&port=5432 s &user=postgres'
'postgresql:///postgres?host=/tmp&port= s 5432&user=postgres'
'postgresql:///postgres?host=/tmp&port= 5000000000&user=postgres'

Before the patch any trailing characters caused a failures even if
there were just whitespaces as trailing characters (first case
listed).

Perhaps we could add directly regression
tests for libpq. I'll start a new thread about that once we are done
here, the topic is larger.

We have around 650 tests on ruby-pg to ensure everything runs as
expected and I always wondered how the API of libpq is being verified.

For advanced test scenarios like connection handling, we use perl's
TAP tests. The situation regarding libpq-related testing is a bit
messy though. We have some tests in src/test/recovery/ for a couple
of things, and we should have more things to stress anything related
to the protocol (say message injection, etc.).

I'll try to start a new thread about that with a patch adding some
basics for discussion.

I have applied the parsing fix and your fix as two separate commits as
these are at the end two separate bugs, then back-patched down to v12.
Ed has been credited for the report, and I have marked the author as
you, Lars. Thanks!
--
Michael