replace strtok()

Started by Peter Eisentrautover 1 year ago22 messages
#1Peter Eisentraut
peter@eisentraut.org
4 attachment(s)

Under the topic of getting rid of thread-unsafe functions in the backend
[0]: /messages/by-id/856e5ec3-879f-42ee-8258-8bcc6ec9bdea@eisentraut.org

Of course, strtok() is famously not thread-safe and can be replaced by
strtok_r(). But it also has the wrong semantics in some cases, because
it considers adjacent delimiters to be one delimiter. So if you parse

SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>

with strtok(), then

SCRAM-SHA-256$$<iterations>::<salt>$$<storedkey>::<serverkey>

parses just the same. In many cases, this is arguably wrong and could
hide mistakes.

So I'm suggesting to use strsep() in those places. strsep() is
nonstandard but widely available.

There are a few places where strtok() has the right semantics, such as
parsing tokens separated by whitespace. For those, I'm using strtok_r().

A reviewer job here would be to check whether I made that distinction
correctly in each case.

On the portability side, I'm including a port/ replacement for strsep()
and some workaround to get strtok_r() for Windows. I have included
these here as separate patches for clarity.

[0]: /messages/by-id/856e5ec3-879f-42ee-8258-8bcc6ec9bdea@eisentraut.org
/messages/by-id/856e5ec3-879f-42ee-8258-8bcc6ec9bdea@eisentraut.org

Attachments:

v1-0001-Replace-some-strtok-with-strsep.patchtext/plain; charset=UTF-8; name=v1-0001-Replace-some-strtok-with-strsep.patchDownload
From 8ab537885f007d8bed58f839ca91108ef20422a6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 10 Jun 2024 16:08:08 +0200
Subject: [PATCH v1 1/4] Replace some strtok() with strsep()

strtok() considers adjacent delimiters to be one delimiter, which is
arguably the wrong behavior in some cases.  Replace with strsep(),
which has the right behavior: Adjacent delimiters create an empty
token.
---
 src/backend/libpq/auth-scram.c    | 11 +++++------
 src/backend/utils/adt/pg_locale.c |  3 ++-
 src/common/logging.c              |  4 +++-
 src/test/regress/pg_regress.c     |  5 ++---
 4 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 41619599148..03ddddc3c27 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -608,16 +608,15 @@ parse_scram_secret(const char *secret, int *iterations,
 	 * SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>
 	 */
 	v = pstrdup(secret);
-	if ((scheme_str = strtok(v, "$")) == NULL)
+	if ((scheme_str = strsep(&v, "$")) == NULL)
 		goto invalid_secret;
-	if ((iterations_str = strtok(NULL, ":")) == NULL)
+	if ((iterations_str = strsep(&v, ":")) == NULL)
 		goto invalid_secret;
-	if ((salt_str = strtok(NULL, "$")) == NULL)
+	if ((salt_str = strsep(&v, "$")) == NULL)
 		goto invalid_secret;
-	if ((storedkey_str = strtok(NULL, ":")) == NULL)
-		goto invalid_secret;
-	if ((serverkey_str = strtok(NULL, "")) == NULL)
+	if ((storedkey_str = strsep(&v, ":")) == NULL)
 		goto invalid_secret;
+	serverkey_str = v;
 
 	/* Parse the fields */
 	if (strcmp(scheme_str, "SCRAM-SHA-256") != 0)
diff --git a/src/backend/utils/adt/pg_locale.c b/src/backend/utils/adt/pg_locale.c
index 7e5bb2b703a..21924d89877 100644
--- a/src/backend/utils/adt/pg_locale.c
+++ b/src/backend/utils/adt/pg_locale.c
@@ -2809,6 +2809,7 @@ icu_set_collation_attributes(UCollator *collator, const char *loc,
 	char	   *icu_locale_id;
 	char	   *lower_str;
 	char	   *str;
+	char	   *token;
 
 	/*
 	 * The input locale may be a BCP 47 language tag, e.g.
@@ -2834,7 +2835,7 @@ icu_set_collation_attributes(UCollator *collator, const char *loc,
 		return;
 	str++;
 
-	for (char *token = strtok(str, ";"); token; token = strtok(NULL, ";"))
+	while ((token = strsep(&str, ";")))
 	{
 		char	   *e = strchr(token, '=');
 
diff --git a/src/common/logging.c b/src/common/logging.c
index e9a02e3e46a..aedd1ae2d8c 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -119,7 +119,9 @@ pg_logging_init(const char *argv0)
 
 			if (colors)
 			{
-				for (char *token = strtok(colors, ":"); token; token = strtok(NULL, ":"))
+				char	   *token;
+
+				while ((token = strsep(&colors, ":")))
 				{
 					char	   *e = strchr(token, '=');
 
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 06f6775fc65..1abfe6c4a5c 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -234,12 +234,11 @@ static void
 split_to_stringlist(const char *s, const char *delim, _stringlist **listhead)
 {
 	char	   *sc = pg_strdup(s);
-	char	   *token = strtok(sc, delim);
+	char	   *token;
 
-	while (token)
+	while ((token = strsep(&sc, delim)))
 	{
 		add_stringlist_item(listhead, token);
-		token = strtok(NULL, delim);
 	}
 	free(sc);
 }

base-commit: ae482a7ec521e09bb0dbfc261da6e6170a5ac29f
-- 
2.45.2

v1-0002-Replace-remaining-strtok-with-strtok_r.patchtext/plain; charset=UTF-8; name=v1-0002-Replace-remaining-strtok-with-strtok_r.patchDownload
From 4e4769864ed3f9ccca232ff7fa56d3ce9c9f5f2c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 10 Jun 2024 16:08:56 +0200
Subject: [PATCH v1 2/4] Replace remaining strtok() with strtok_r()

for thread-safety in the future
---
 src/backend/utils/misc/tzparser.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/misc/tzparser.c b/src/backend/utils/misc/tzparser.c
index 21fd866d6d6..96cc3912e56 100644
--- a/src/backend/utils/misc/tzparser.c
+++ b/src/backend/utils/misc/tzparser.c
@@ -97,6 +97,7 @@ validateTzEntry(tzEntry *tzentry)
 static bool
 splitTzLine(const char *filename, int lineno, char *line, tzEntry *tzentry)
 {
+	char	   *brkl;
 	char	   *abbrev;
 	char	   *offset;
 	char	   *offset_endptr;
@@ -106,7 +107,7 @@ splitTzLine(const char *filename, int lineno, char *line, tzEntry *tzentry)
 	tzentry->lineno = lineno;
 	tzentry->filename = filename;
 
-	abbrev = strtok(line, WHITESPACE);
+	abbrev = strtok_r(line, WHITESPACE, &brkl);
 	if (!abbrev)
 	{
 		GUC_check_errmsg("missing time zone abbreviation in time zone file \"%s\", line %d",
@@ -115,7 +116,7 @@ splitTzLine(const char *filename, int lineno, char *line, tzEntry *tzentry)
 	}
 	tzentry->abbrev = pstrdup(abbrev);
 
-	offset = strtok(NULL, WHITESPACE);
+	offset = strtok_r(NULL, WHITESPACE, &brkl);
 	if (!offset)
 	{
 		GUC_check_errmsg("missing time zone offset in time zone file \"%s\", line %d",
@@ -135,11 +136,11 @@ splitTzLine(const char *filename, int lineno, char *line, tzEntry *tzentry)
 			return false;
 		}
 
-		is_dst = strtok(NULL, WHITESPACE);
+		is_dst = strtok_r(NULL, WHITESPACE, &brkl);
 		if (is_dst && pg_strcasecmp(is_dst, "D") == 0)
 		{
 			tzentry->is_dst = true;
-			remain = strtok(NULL, WHITESPACE);
+			remain = strtok_r(NULL, WHITESPACE, &brkl);
 		}
 		else
 		{
@@ -158,7 +159,7 @@ splitTzLine(const char *filename, int lineno, char *line, tzEntry *tzentry)
 		tzentry->zone = pstrdup(offset);
 		tzentry->offset = 0 * SECS_PER_HOUR;
 		tzentry->is_dst = false;
-		remain = strtok(NULL, WHITESPACE);
+		remain = strtok_r(NULL, WHITESPACE, &brkl);
 	}
 
 	if (!remain)				/* no more non-whitespace chars */
@@ -394,8 +395,9 @@ ParseTzFile(const char *filename, int depth,
 		{
 			/* pstrdup so we can use filename in result data structure */
 			char	   *includeFile = pstrdup(line + strlen("@INCLUDE"));
+			char	   *brki;
 
-			includeFile = strtok(includeFile, WHITESPACE);
+			includeFile = strtok_r(includeFile, WHITESPACE, &brki);
 			if (!includeFile || !*includeFile)
 			{
 				GUC_check_errmsg("@INCLUDE without file name in time zone file \"%s\", line %d",
-- 
2.45.2

v1-0003-Add-port-replacement-for-strsep.patchtext/plain; charset=UTF-8; name=v1-0003-Add-port-replacement-for-strsep.patchDownload
From da59e9395e77b8849b1a9700b610a601548ffdd7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 13 Jun 2024 11:14:36 +0200
Subject: [PATCH v1 3/4] Add port/ replacement for strsep()

from OpenBSD, similar to strlcat, strlcpy
---
 configure                  | 23 ++++++++++++
 configure.ac               |  3 +-
 meson.build                |  2 +
 src/include/pg_config.h.in |  7 ++++
 src/include/port.h         |  4 ++
 src/port/meson.build       |  1 +
 src/port/strsep.c          | 77 ++++++++++++++++++++++++++++++++++++++
 7 files changed, 116 insertions(+), 1 deletion(-)
 create mode 100644 src/port/strsep.c

diff --git a/configure b/configure
index 7b03db56a67..90664858ba7 100755
--- a/configure
+++ b/configure
@@ -15778,6 +15778,16 @@ fi
 cat >>confdefs.h <<_ACEOF
 #define HAVE_DECL_STRNLEN $ac_have_decl
 _ACEOF
+ac_fn_c_check_decl "$LINENO" "strsep" "ac_cv_have_decl_strsep" "$ac_includes_default"
+if test "x$ac_cv_have_decl_strsep" = xyes; then :
+  ac_have_decl=1
+else
+  ac_have_decl=0
+fi
+
+cat >>confdefs.h <<_ACEOF
+#define HAVE_DECL_STRSEP $ac_have_decl
+_ACEOF
 
 
 # We can't use AC_CHECK_FUNCS to detect these functions, because it
@@ -15925,6 +15935,19 @@ esac
 
 fi
 
+ac_fn_c_check_func "$LINENO" "strsep" "ac_cv_func_strsep"
+if test "x$ac_cv_func_strsep" = xyes; then :
+  $as_echo "#define HAVE_STRSEP 1" >>confdefs.h
+
+else
+  case " $LIBOBJS " in
+  *" strsep.$ac_objext "* ) ;;
+  *) LIBOBJS="$LIBOBJS strsep.$ac_objext"
+ ;;
+esac
+
+fi
+
 
 
 ac_fn_c_check_func "$LINENO" "pthread_barrier_wait" "ac_cv_func_pthread_barrier_wait"
diff --git a/configure.ac b/configure.ac
index 63e7be38472..2fcb256123d 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1795,7 +1795,7 @@ AC_CHECK_DECLS(posix_fadvise, [], [], [#include <fcntl.h>])
 ]) # fi
 
 AC_CHECK_DECLS(fdatasync, [], [], [#include <unistd.h>])
-AC_CHECK_DECLS([strlcat, strlcpy, strnlen])
+AC_CHECK_DECLS([strlcat, strlcpy, strnlen, strsep])
 
 # We can't use AC_CHECK_FUNCS to detect these functions, because it
 # won't handle deployment target restrictions on macOS
@@ -1814,6 +1814,7 @@ AC_REPLACE_FUNCS(m4_normalize([
 	strlcat
 	strlcpy
 	strnlen
+	strsep
 ]))
 
 AC_REPLACE_FUNCS(pthread_barrier_wait)
diff --git a/meson.build b/meson.build
index 2767abd19e2..7d021138e72 100644
--- a/meson.build
+++ b/meson.build
@@ -2326,6 +2326,7 @@ decl_checks = [
   ['strlcat', 'string.h'],
   ['strlcpy', 'string.h'],
   ['strnlen', 'string.h'],
+  ['strsep',  'string.h'],
 ]
 
 # Need to check for function declarations for these functions, because
@@ -2594,6 +2595,7 @@ func_checks = [
   ['strlcat'],
   ['strlcpy'],
   ['strnlen'],
+  ['strsep'],
   ['strsignal'],
   ['sync_file_range'],
   ['syncfs'],
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index f8d3e3b6b84..9862739b8e8 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -127,6 +127,10 @@
    don't. */
 #undef HAVE_DECL_STRNLEN
 
+/* Define to 1 if you have the declaration of `strsep', and to 0 if you don't.
+   */
+#undef HAVE_DECL_STRSEP
+
 /* Define to 1 if you have the <editline/history.h> header file. */
 #undef HAVE_EDITLINE_HISTORY_H
 
@@ -414,6 +418,9 @@
 /* Define to 1 if you have the `strnlen' function. */
 #undef HAVE_STRNLEN
 
+/* Define to 1 if you have the `strsep' function. */
+#undef HAVE_STRSEP
+
 /* Define to 1 if you have the `strsignal' function. */
 #undef HAVE_STRSIGNAL
 
diff --git a/src/include/port.h b/src/include/port.h
index ae115d2d970..c7400052675 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -432,6 +432,10 @@ extern size_t strlcpy(char *dst, const char *src, size_t siz);
 extern size_t strnlen(const char *str, size_t maxlen);
 #endif
 
+#if !HAVE_DECL_STRSEP
+extern char *strsep(char **stringp, const char *delim);
+#endif
+
 /* port/user.c */
 #ifndef WIN32
 extern bool pg_get_user_name(uid_t user_id, char *buffer, size_t buflen);
diff --git a/src/port/meson.build b/src/port/meson.build
index fd9ee199d1b..ff54b7b53e9 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -70,6 +70,7 @@ replace_funcs_neg = [
   ['strlcat'],
   ['strlcpy'],
   ['strnlen'],
+  ['strsep'],
 ]
 
 if host_system != 'windows'
diff --git a/src/port/strsep.c b/src/port/strsep.c
new file mode 100644
index 00000000000..564125c5101
--- /dev/null
+++ b/src/port/strsep.c
@@ -0,0 +1,77 @@
+/*
+ * src/port/strsep.c
+ *
+ *	$OpenBSD: strsep.c,v 1.8 2015/08/31 02:53:57 guenther Exp $	*/
+
+/*-
+ * Copyright (c) 1990, 1993
+ *	The Regents of the University of California.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the name of the University nor the names of its contributors
+ *    may be used to endorse or promote products derived from this software
+ *    without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
+ * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
+ * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
+ * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
+ * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
+ * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
+ * SUCH DAMAGE.
+ */
+
+#include "c.h"
+
+/*
+ * Get next token from string *stringp, where tokens are possibly-empty
+ * strings separated by characters from delim.
+ *
+ * Writes NULs into the string at *stringp to end tokens.
+ * delim need not remain constant from call to call.
+ * On return, *stringp points past the last NUL written (if there might
+ * be further tokens), or is NULL (if there are definitely no more tokens).
+ *
+ * If *stringp is NULL, strsep returns NULL.
+ */
+char *
+strsep(char **stringp, const char *delim)
+{
+	char	   *s;
+	const char *spanp;
+	int			c,
+				sc;
+	char	   *tok;
+
+	if ((s = *stringp) == NULL)
+		return (NULL);
+	for (tok = s;;)
+	{
+		c = *s++;
+		spanp = delim;
+		do
+		{
+			if ((sc = *spanp++) == c)
+			{
+				if (c == 0)
+					s = NULL;
+				else
+					s[-1] = 0;
+				*stringp = s;
+				return (tok);
+			}
+		} while (sc != 0);
+	}
+	/* NOTREACHED */
+}
-- 
2.45.2

v1-0004-Windows-replacement-for-strtok_r.patchtext/plain; charset=UTF-8; name=v1-0004-Windows-replacement-for-strtok_r.patchDownload
From 90631bdab4b6ed9a05cea0ce37b0c7e117883ac1 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 13 Jun 2024 11:18:50 +0200
Subject: [PATCH v1 4/4] Windows replacement for strtok_r()

They spell it "strtok_s" there.
---
 src/include/port/win32_port.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/src/include/port/win32_port.h b/src/include/port/win32_port.h
index 3d1de166cb0..7ffe5891c69 100644
--- a/src/include/port/win32_port.h
+++ b/src/include/port/win32_port.h
@@ -415,6 +415,11 @@ extern int	_pglstat64(const char *name, struct stat *buf);
 #undef ETIMEDOUT
 #define ETIMEDOUT WSAETIMEDOUT
 
+/*
+ * Supplement to <string.h>.
+ */
+#define strtok_r strtok_s
+
 /*
  * Locale stuff.
  *
-- 
2.45.2

#2Ranier Vilela
ranier.vf@gmail.com
In reply to: Peter Eisentraut (#1)
1 attachment(s)
Re: replace strtok()

Em ter., 18 de jun. de 2024 às 04:18, Peter Eisentraut <peter@eisentraut.org>
escreveu:

Under the topic of getting rid of thread-unsafe functions in the backend
[0], here is a patch series to deal with strtok().

Of course, strtok() is famously not thread-safe and can be replaced by
strtok_r(). But it also has the wrong semantics in some cases, because
it considers adjacent delimiters to be one delimiter. So if you parse

SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>

with strtok(), then

SCRAM-SHA-256$$<iterations>::<salt>$$<storedkey>::<serverkey>

parses just the same. In many cases, this is arguably wrong and could
hide mistakes.

So I'm suggesting to use strsep() in those places. strsep() is
nonstandard but widely available.

There are a few places where strtok() has the right semantics, such as
parsing tokens separated by whitespace. For those, I'm using strtok_r().

A reviewer job here would be to check whether I made that distinction
correctly in each case.

On the portability side, I'm including a port/ replacement for strsep()
and some workaround to get strtok_r() for Windows. I have included
these here as separate patches for clarity.

+1 For making the code thread-safe.
But I would like to see more const char * where this is possible.

For example, in pg_locale.c
IMO, the token variable can be const char *.

At least strchr expects a const char * as the first parameter.

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any testing.

best regards,
Ranier Vilela

Attachments:

strsep.ctext/plain; charset=US-ASCII; name=strsep.cDownload
#3Kyotaro Horiguchi
horikyota.ntt@gmail.com
In reply to: Peter Eisentraut (#1)
Re: replace strtok()

At Tue, 18 Jun 2024 09:18:28 +0200, Peter Eisentraut <peter@eisentraut.org> wrote in

Under the topic of getting rid of thread-unsafe functions in the
backend [0], here is a patch series to deal with strtok().

Of course, strtok() is famously not thread-safe and can be replaced by
strtok_r(). But it also has the wrong semantics in some cases,
because it considers adjacent delimiters to be one delimiter. So if
you parse

SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>

with strtok(), then

SCRAM-SHA-256$$<iterations>::<salt>$$<storedkey>::<serverkey>

parses just the same. In many cases, this is arguably wrong and could
hide mistakes.

So I'm suggesting to use strsep() in those places. strsep() is
nonstandard but widely available.

There are a few places where strtok() has the right semantics, such as
parsing tokens separated by whitespace. For those, I'm using
strtok_r().

I agree with the distinction.

A reviewer job here would be to check whether I made that distinction
correctly in each case.

0001 and 0002 look correct to me regarding that distinction. They
applied correctly to the master HEAD and all tests passed on Linux.

On the portability side, I'm including a port/ replacement for
strsep() and some workaround to get strtok_r() for Windows. I have
included these here as separate patches for clarity.

0003 looks fine and successfully built and seems working on an MSVC
build.

About 0004, Cygwin seems to have its own strtok_r, but I haven't
checked how that fact affects the build.

[0]:
/messages/by-id/856e5ec3-879f-42ee-8258-8bcc6ec9bdea@eisentraut.org

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#4Peter Eisentraut
peter@eisentraut.org
In reply to: Ranier Vilela (#2)
Re: replace strtok()

On 18.06.24 13:43, Ranier Vilela wrote:

But I would like to see more const char * where this is possible.

For example, in pg_locale.c
IMO, the token variable can be const char *.

At least strchr expects a const char * as the first parameter.

This would not be future-proof. In C23, if you pass a const char * into
strchr(), you also get a const char * as a result. And in this case, we
do write into the area pointed to by the result. So with a const char
*token, this whole thing would not compile cleanly under C23.

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any testing.

Yeah, surely there are many possible implementations. I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Eisentraut (#4)
Re: replace strtok()

Peter Eisentraut <peter@eisentraut.org> writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any testing.

Yeah, surely there are many possible implementations. I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.

Why not use strpbrk? That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.

regards, tom lane

#6Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#5)
Re: replace strtok()

On Sat, Jun 22, 2024 at 11:48:21AM -0400, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any testing.

Yeah, surely there are many possible implementations. I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.

Why not use strpbrk? That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.

Yeah, strpbrk() has been used in the tree as far as 2003 without any
port/ implementation.
--
Michael

#7Peter Eisentraut
peter@eisentraut.org
In reply to: Michael Paquier (#6)
Re: replace strtok()

On 24.06.24 02:34, Michael Paquier wrote:

On Sat, Jun 22, 2024 at 11:48:21AM -0400, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any testing.

Yeah, surely there are many possible implementations. I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.

Why not use strpbrk? That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.

Yeah, strpbrk() has been used in the tree as far as 2003 without any
port/ implementation.

The existing uses of strpbrk() are really just checking whether some
characters exist in a string, more like an enhanced strchr(). I don't
see any uses for tokenizing a string like strtok() or strsep() would do.
I think that would look quite cumbersome. So I think a simpler and
more convenient abstraction like strsep() would still be worthwhile.

#8David Steele
david@pgmasters.net
In reply to: Peter Eisentraut (#7)
Re: replace strtok()

On 6/24/24 19:57, Peter Eisentraut wrote:

On 24.06.24 02:34, Michael Paquier wrote:

On Sat, Jun 22, 2024 at 11:48:21AM -0400, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any
testing.

Yeah, surely there are many possible implementations.  I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal with.

Why not use strpbrk?  That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.

Yeah, strpbrk() has been used in the tree as far as 2003 without any
port/ implementation.

The existing uses of strpbrk() are really just checking whether some
characters exist in a string, more like an enhanced strchr().  I don't
see any uses for tokenizing a string like strtok() or strsep() would do.
 I think that would look quite cumbersome.  So I think a simpler and
more convenient abstraction like strsep() would still be worthwhile.

I agree that using strsep() in these cases seems more natural. Since
this patch provides a default implementation compatibility does not seem
like a big issue.

I've also reviewed the rest of the patch and it looks good to me.

Regards,
-David

#9Peter Eisentraut
peter@eisentraut.org
In reply to: David Steele (#8)
Re: replace strtok()

On 08.07.24 07:45, David Steele wrote:

On 6/24/24 19:57, Peter Eisentraut wrote:

On 24.06.24 02:34, Michael Paquier wrote:

On Sat, Jun 22, 2024 at 11:48:21AM -0400, Tom Lane wrote:

Peter Eisentraut <peter@eisentraut.org> writes:

On 18.06.24 13:43, Ranier Vilela wrote:

I found another implementation of strsep, it seems lighter to me.
I will attach it for consideration, however, I have not done any
testing.

Yeah, surely there are many possible implementations.  I'm thinking,
since we already took other str*() functions from OpenBSD, it makes
sense to do this here as well, so we have only one source to deal
with.

Why not use strpbrk?  That's equally thread-safe, it's been there
since C89, and it doesn't have the problem that you can't find out
which of the delimiter characters was found.

Yeah, strpbrk() has been used in the tree as far as 2003 without any
port/ implementation.

The existing uses of strpbrk() are really just checking whether some
characters exist in a string, more like an enhanced strchr().  I don't
see any uses for tokenizing a string like strtok() or strsep() would
do.   I think that would look quite cumbersome.  So I think a simpler
and more convenient abstraction like strsep() would still be worthwhile.

I agree that using strsep() in these cases seems more natural. Since
this patch provides a default implementation compatibility does not seem
like a big issue.

I've also reviewed the rest of the patch and it looks good to me.

This has been committed. Thanks.

#10Alexander Lakhin
exclusion@gmail.com
In reply to: Peter Eisentraut (#9)
Re: replace strtok()

Hello Peter,

23.07.2024 15:38, Peter Eisentraut wrote:

This has been committed.  Thanks.

Please look at the SCRAM secret, which breaks parse_scram_secret(),
perhaps because strsep() doesn't return NULL where strtok() did:

CREATE ROLE r PASSWORD
'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';

Core was generated by `postgres: law regression [local] CREATE ROLE                                  '.
Program terminated with signal SIGSEGV, Segmentation fault.

#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x0000563625e9e5b0 in parse_scram_secret (...) at auth-scram.c:655

Best regards,
Alexander

#11Ranier Vilela
ranier.vf@gmail.com
In reply to: Alexander Lakhin (#10)
1 attachment(s)
Re: replace strtok()

Hi Alexander,

Em qui., 10 de out. de 2024 às 02:00, Alexander Lakhin <exclusion@gmail.com>
escreveu:

Hello Peter,

23.07.2024 15:38, Peter Eisentraut wrote:

This has been committed. Thanks.

Please look at the SCRAM secret, which breaks parse_scram_secret(),
perhaps because strsep() doesn't return NULL where strtok() did:

CREATE ROLE r PASSWORD

'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';

Core was generated by `postgres: law regression [local] CREATE
ROLE '.
Program terminated with signal SIGSEGV, Segmentation fault.

#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1 0x0000563625e9e5b0 in parse_scram_secret (...) at auth-scram.c:655

Thanks for the report.

It seems to me that it could be due to incorrect use of the strsep function.
See:
https://man7.org/linux/man-pages/man3/strsep.3.html
"

In case no delimiter was found, the token
is taken to be the entire string **stringp*, and **stringp* is made
NULL.

"
So, it is necessary to check the *stringp* against NULL too.

I tried the patch attached and your test case works.

CREATE ROLE r PASSWORD
postgres-#
'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';
CREATE ROLE

best regards,
Ranier Vilela

Attachments:

fix-core-dump-strsep-auth-scram.patchapplication/octet-stream; name=fix-core-dump-strsep-auth-scram.patchDownload
diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 03ddddc3c2..028ec16d0d 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -608,13 +608,13 @@ parse_scram_secret(const char *secret, int *iterations,
 	 * SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>
 	 */
 	v = pstrdup(secret);
-	if ((scheme_str = strsep(&v, "$")) == NULL)
+	if ((scheme_str = strsep(&v, "$")) == NULL || v == NULL )
 		goto invalid_secret;
-	if ((iterations_str = strsep(&v, ":")) == NULL)
+	if ((iterations_str = strsep(&v, ":")) == NULL || v == NULL)
 		goto invalid_secret;
-	if ((salt_str = strsep(&v, "$")) == NULL)
+	if ((salt_str = strsep(&v, "$")) == NULL || v == NULL)
 		goto invalid_secret;
-	if ((storedkey_str = strsep(&v, ":")) == NULL)
+	if ((storedkey_str = strsep(&v, ":")) == NULL || v == NULL)
 		goto invalid_secret;
 	serverkey_str = v;
 
#12Peter Eisentraut
peter@eisentraut.org
In reply to: Ranier Vilela (#11)
1 attachment(s)
Re: replace strtok()

On 10.10.24 14:59, Ranier Vilela wrote:

Please look at the SCRAM secret, which breaks parse_scram_secret(),
perhaps because strsep() doesn't return NULL where strtok() did:

CREATE ROLE r PASSWORD
'SCRAM-
SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';

Core was generated by `postgres: law regression [local] CREATE
ROLE                                  '.
Program terminated with signal SIGSEGV, Segmentation fault.

#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
(gdb) bt
#0  __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1  0x0000563625e9e5b0 in parse_scram_secret (...) at auth-scram.c:655

Thanks for the report.

It seems to me that it could be due to incorrect use of the strsep function.
See:
https://man7.org/linux/man-pages/man3/strsep.3.html <https://man7.org/
linux/man-pages/man3/strsep.3.html>
"

In case no delimiter was found, the token
is taken to be the entire string/*stringp/, and/*stringp/ is made
NULL.

"
So, it is necessary to check the *stringp* against NULL too.

Thanks for the analysis. I think moreover we *only* need to check the
"stringp" for NULL, not the return value of strsep(), which would never
be NULL in our case. So I propose the attached patch as a variant of yours.

Attachments:

v2-0001-Fix-strsep-use-for-SCRAM-secrets-parsing.patchtext/plain; charset=UTF-8; name=v2-0001-Fix-strsep-use-for-SCRAM-secrets-parsing.patchDownload
From df4d43acb2956509e8166f2df4a3a2be1493ff6f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Tue, 15 Oct 2024 08:39:30 +0200
Subject: [PATCH v2] Fix strsep() use for SCRAM secrets parsing

The previous code (from commit 5d2e1cc117b) did not detect end of
string correctly, so it would fail to error out if fewer than the
expected number of fields were present, which could then later lead to
a crash when NULL string pointers are accessed.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
---
 src/backend/libpq/auth-scram.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 03ddddc3c27..56df870e9ef 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -608,13 +608,17 @@ parse_scram_secret(const char *secret, int *iterations,
 	 * SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>
 	 */
 	v = pstrdup(secret);
-	if ((scheme_str = strsep(&v, "$")) == NULL)
+	scheme_str = strsep(&v, "$");
+	if (v == NULL)
 		goto invalid_secret;
-	if ((iterations_str = strsep(&v, ":")) == NULL)
+	iterations_str = strsep(&v, ":");
+	if (v == NULL)
 		goto invalid_secret;
-	if ((salt_str = strsep(&v, "$")) == NULL)
+	salt_str = strsep(&v, "$");
+	if (v == NULL)
 		goto invalid_secret;
-	if ((storedkey_str = strsep(&v, ":")) == NULL)
+	storedkey_str = strsep(&v, ":");
+	if (v == NULL)
 		goto invalid_secret;
 	serverkey_str = v;
 

base-commit: 7cdfeee320e72162b62dddddee638e713c2b8680
-- 
2.47.0

#13Ranier Vilela
ranier.vf@gmail.com
In reply to: Peter Eisentraut (#12)
Re: replace strtok()

Em ter., 15 de out. de 2024 às 03:45, Peter Eisentraut <peter@eisentraut.org>
escreveu:

On 10.10.24 14:59, Ranier Vilela wrote:

Please look at the SCRAM secret, which breaks parse_scram_secret(),
perhaps because strsep() doesn't return NULL where strtok() did:

CREATE ROLE r PASSWORD
'SCRAM-

SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+=vtnYM995pDh9ca6WSi120qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';

Core was generated by `postgres: law regression [local] CREATE
ROLE '.
Program terminated with signal SIGSEGV, Segmentation fault.

#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:74
#1 0x0000563625e9e5b0 in parse_scram_secret (...) at

auth-scram.c:655

Thanks for the report.

It seems to me that it could be due to incorrect use of the strsep

function.

See:
https://man7.org/linux/man-pages/man3/strsep.3.html <https://man7.org/
linux/man-pages/man3/strsep.3.html>
"

In case no delimiter was found, the token
is taken to be the entire string/*stringp/, and/*stringp/ is made
NULL.

"
So, it is necessary to check the *stringp* against NULL too.

Thanks for the analysis. I think moreover we *only* need to check the
"stringp" for NULL, not the return value of strsep(), which would never
be NULL in our case. So I propose the attached patch as a variant of
yours.

I'm not 100% sure, but the contrib passwordcheck uses and It's not very
safe.
The checks of NULL return are cheap, and will protect unwary users.

So I'm neutral here.

Notice that the report is only by Alexander Lakhin.
I'm at most a reviewer here.

best regards,
Ranier Vilela

#14Alexander Lakhin
exclusion@gmail.com
In reply to: Ranier Vilela (#13)
Re: replace strtok()

Hello Ranier and Peter,

15.10.2024 14:44, Ranier Vilela wrote:

Em ter., 15 de out. de 2024 às 03:45, Peter Eisentraut <peter@eisentraut.org> escreveu:

Thanks for the analysis.  I think moreover we *only* need to check the
"stringp" for NULL, not the return value of strsep(), which would never
be NULL in our case.  So I propose the attached patch as a variant of yours.

I'm not 100% sure, but the contrib passwordcheck uses and It's not very safe.
The checks of NULL return are cheap, and will protect unwary users.

So I'm neutral here.

I also wonder, if other places touched by 5d2e1cc11 need corrections too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
            char       *colors = strdup(pg_colors_env);

            if (colors)
            {
...
                while ((token = strsep(&colors, ":")))
                {
...
                }

                free(colors);
            }
gets null in colors.

Best regards,
Alexander

#15Ranier Vilela
ranier.vf@gmail.com
In reply to: Alexander Lakhin (#14)
Re: replace strtok()

Em ter., 15 de out. de 2024 às 09:00, Alexander Lakhin <exclusion@gmail.com>
escreveu:

Hello Ranier and Peter,

15.10.2024 14:44, Ranier Vilela wrote:

Em ter., 15 de out. de 2024 às 03:45, Peter Eisentraut <
peter@eisentraut.org> escreveu:

Thanks for the analysis. I think moreover we *only* need to check the
"stringp" for NULL, not the return value of strsep(), which would never
be NULL in our case. So I propose the attached patch as a variant of
yours.

I'm not 100% sure, but the contrib passwordcheck uses and It's not very
safe.
The checks of NULL return are cheap, and will protect unwary users.

So I'm neutral here.

I also wonder, if other places touched by 5d2e1cc11 need corrections too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
char *colors = strdup(pg_colors_env);

if (colors)
{
...
while ((token = strsep(&colors, ":")))
{
...
}

free(colors);
}
gets null in colors.

Yeah, I also saw this usage, but I was waiting for a definition for the
first report.
The solution IMO, would be the same.

diff --git a/src/common/logging.c b/src/common/logging.c
index aedd1ae2d8..45b5316d48 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -121,7 +121,7 @@ pg_logging_init(const char *argv0)
  {
  char   *token;
- while ((token = strsep(&colors, ":")))
+ while ((token = strsep(&colors, ":")) != NULL && colors != NULL)
  {
  char   *e = strchr(token, '=');

The advantage of this change is that it would avoid processing unnecessary
tokens.

best regards,
Ranier Vilela

Show quoted text
#16Peter Eisentraut
peter@eisentraut.org
In reply to: Alexander Lakhin (#14)
1 attachment(s)
Re: replace strtok()

On 15.10.24 14:00, Alexander Lakhin wrote:

I also wonder, if other places touched by 5d2e1cc11 need corrections too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
            char       *colors = strdup(pg_colors_env);

            if (colors)
            {
...
                while ((token = strsep(&colors, ":")))
                {
...
                }

                free(colors);
            }
gets null in colors.

Yes, this is indeed incorrect. We need to keep a separate pointer to
the start of the string to free later. This matches the example on the
strsep man page (https://man.freebsd.org/cgi/man.cgi?strsep(3)). Patch
attached.

Attachments:

0001-Fix-memory-leaks-from-incorrect-strsep-uses.patchtext/plain; charset=UTF-8; name=0001-Fix-memory-leaks-from-incorrect-strsep-uses.patchDownload
From 1b842bfdc2b303264134687f3ec21fd3e53a0c82 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Wed, 16 Oct 2024 09:37:54 +0200
Subject: [PATCH] Fix memory leaks from incorrect strsep() uses

Commit 5d2e1cc117b introduced some strsep() uses, but it did the
memory management wrong in some cases.  We need to keep a separate
pointer to the allocate memory so that we can free it later, because
strsep() advances the pointer we pass to it, and it at the end it
will be NULL, so any free() calls won't do anything.

(This fixes two of the four places changed in commit 5d2e1cc117b.  The
other two don't have this problem.)

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org
---
 src/common/logging.c          | 3 ++-
 src/test/regress/pg_regress.c | 7 +++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/src/common/logging.c b/src/common/logging.c
index aedd1ae2d8c..3cf119090a5 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -120,8 +120,9 @@ pg_logging_init(const char *argv0)
 			if (colors)
 			{
 				char	   *token;
+				char	   *cp = colors;
 
-				while ((token = strsep(&colors, ":")))
+				while ((token = strsep(&cp, ":")))
 				{
 					char	   *e = strchr(token, '=');
 
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 5157629b1cc..6c188954b14 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -233,14 +233,17 @@ free_stringlist(_stringlist **listhead)
 static void
 split_to_stringlist(const char *s, const char *delim, _stringlist **listhead)
 {
-	char	   *sc = pg_strdup(s);
 	char	   *token;
+	char	   *sc;
+	char	   *tofree;
+
+	tofree = sc = pg_strdup(s);
 
 	while ((token = strsep(&sc, delim)))
 	{
 		add_stringlist_item(listhead, token);
 	}
-	free(sc);
+	free(tofree);
 }
 
 /*
-- 
2.47.0

#17Peter Eisentraut
peter@eisentraut.org
In reply to: Ranier Vilela (#15)
Re: replace strtok()

On 15.10.24 14:07, Ranier Vilela wrote:

I also wonder, if other places touched by 5d2e1cc11 need corrections
too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
            char       *colors = strdup(pg_colors_env);

            if (colors)
            {
...
                while ((token = strsep(&colors, ":")))
                {
...
                }

                free(colors);
            }
gets null in colors.

Yeah, I also saw this usage, but I was waiting for a definition for the
first report.
The solution IMO, would be the same.

diff --git a/src/common/logging.c b/src/common/logging.c
index aedd1ae2d8..45b5316d48 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -121,7 +121,7 @@ pg_logging_init(const char *argv0)
  {
  char   *token;
- while ((token = strsep(&colors, ":")))
+ while ((token = strsep(&colors, ":")) != NULL && colors != NULL)
  {
  char   *e = strchr(token, '=');
The advantage of this change is that it would avoid processing 
unnecessary tokens.

This wouldn't fix anything, I think. If colors is NULL, then strsep()
already returns NULL, so the added code does nothing.

#18Ranier Vilela
ranier.vf@gmail.com
In reply to: Peter Eisentraut (#17)
Re: replace strtok()

Em qua., 16 de out. de 2024 às 04:45, Peter Eisentraut <peter@eisentraut.org>
escreveu:

On 15.10.24 14:07, Ranier Vilela wrote:

I also wonder, if other places touched by 5d2e1cc11 need corrections
too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
char *colors = strdup(pg_colors_env);

if (colors)
{
...
while ((token = strsep(&colors, ":")))
{
...
}

free(colors);
}
gets null in colors.

Yeah, I also saw this usage, but I was waiting for a definition for the
first report.
The solution IMO, would be the same.

diff --git a/src/common/logging.c b/src/common/logging.c
index aedd1ae2d8..45b5316d48 100644
--- a/src/common/logging.c
+++ b/src/common/logging.c
@@ -121,7 +121,7 @@ pg_logging_init(const char *argv0)
{
char   *token;
- while ((token = strsep(&colors, ":")))
+ while ((token = strsep(&colors, ":")) != NULL && colors != NULL)
{
char   *e = strchr(token, '=');
The advantage of this change is that it would avoid processing
unnecessary tokens.

This wouldn't fix anything, I think. If colors is NULL, then strsep()
already returns NULL, so the added code does nothing.

If *colors* is NULL, then the delimiter is not found and strsep will return
the entire
string **stringp, so the token becomes invalid*.

IMO, I think it must be necessary to check if *colors* are NULL too.

best regards,
Ranier Vilela

#19Alexander Lakhin
exclusion@gmail.com
In reply to: Ranier Vilela (#18)
2 attachment(s)
Re: replace strtok()

Hello Ranier,

16.10.2024 14:14, Ranier Vilela wrote:

Em qua., 16 de out. de 2024 às 04:45, Peter Eisentraut <peter@eisentraut.org> escreveu:

This wouldn't fix anything, I think.  If colors is NULL, then strsep()
already returns NULL, so the added code does nothing.

If *colors* is NULL, then the delimiter is not found and strsep will return the entire
string /*stringp, so the token becomes invalid/.

IMO, I think it must be necessary to check if *colors* are NULL too.

I've tested your proposed change and what I'm seeing is that:
PG_COLOR=always PG_COLORS="error=01;31" initdb
doesn't color the "error" word:

while with only Peter's patch it works as expected:

Does your change work differently for you?

Best regards,
Alexander

Attachments:

bHg6qu1K7Gf0TPeN.pngimage/png; name=bHg6qu1K7Gf0TPeN.pngDownload
Qnnl0syi30SW7Xgx.pngimage/png; name=Qnnl0syi30SW7Xgx.pngDownload
#20Ranier Vilela
ranier.vf@gmail.com
In reply to: Alexander Lakhin (#19)
3 attachment(s)
Re: replace strtok()

Em qui., 17 de out. de 2024 às 07:30, Alexander Lakhin <exclusion@gmail.com>
escreveu:

Hello Ranier,

16.10.2024 14:14, Ranier Vilela wrote:

Em qua., 16 de out. de 2024 às 04:45, Peter Eisentraut <
peter@eisentraut.org> escreveu:

This wouldn't fix anything, I think. If colors is NULL, then strsep()
already returns NULL, so the added code does nothing.

If *colors* is NULL, then the delimiter is not found and strsep will
return the entire
string **stringp, so the token becomes invalid*.

IMO, I think it must be necessary to check if *colors* are NULL too.

I've tested your proposed change and what I'm seeing is that:
PG_COLOR=always PG_COLORS="error=01;31" initdb
doesn't color the "error" word:

while with only Peter's patch it works as expected:

Does your change work differently for you?

No.
Thanks for the test.

It seems to me that with strsep, the only alternative is to run the risk of
processing an invalid token?
I ran the test with Peter's patch in Windows and the terminal doesn't
color the "error" word, perhaps this feature does not work with Windows.
[image: error1.png]
I withdraw the proposed patch.

best regards,
Ranier Vilela

Attachments:

bHg6qu1K7Gf0TPeN.pngimage/png; name=bHg6qu1K7Gf0TPeN.pngDownload
Qnnl0syi30SW7Xgx.pngimage/png; name=Qnnl0syi30SW7Xgx.pngDownload
error1.pngimage/png; name=error1.pngDownload
#21Peter Eisentraut
peter@eisentraut.org
In reply to: Peter Eisentraut (#16)
Re: replace strtok()

On 16.10.24 09:42, Peter Eisentraut wrote:

On 15.10.24 14:00, Alexander Lakhin wrote:

I also wonder, if other places touched by 5d2e1cc11 need corrections too.
I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
             char       *colors = strdup(pg_colors_env);

             if (colors)
             {
...
                 while ((token = strsep(&colors, ":")))
                 {
...
                 }

                 free(colors);
             }
gets null in colors.

Yes, this is indeed incorrect.  We need to keep a separate pointer to
the start of the string to free later.  This matches the example on the
strsep man page (https://man.freebsd.org/cgi/man.cgi?strsep(3)).  Patch
attached.

I have committed both fixes mentioned in this thread in the last couple
of days.

#22Ranier Vilela
ranier.vf@gmail.com
In reply to: Peter Eisentraut (#21)
Re: replace strtok()

Em sex., 18 de out. de 2024 às 06:41, Peter Eisentraut <peter@eisentraut.org>
escreveu:

On 16.10.24 09:42, Peter Eisentraut wrote:

On 15.10.24 14:00, Alexander Lakhin wrote:

I also wonder, if other places touched by 5d2e1cc11 need corrections

too.

I played with
PG_COLOR=always PG_COLORS="error=01;31" .../initdb

and it looks like this free() call in pg_logging_init():
char *colors = strdup(pg_colors_env);

if (colors)
{
...
while ((token = strsep(&colors, ":")))
{
...
}

free(colors);
}
gets null in colors.

Yes, this is indeed incorrect. We need to keep a separate pointer to
the start of the string to free later. This matches the example on the
strsep man page (https://man.freebsd.org/cgi/man.cgi?strsep(3)). Patch
attached.

I have committed both fixes mentioned in this thread in the last couple
of days.

Thanks.

Thanks Alexander, for the hard work.

best regards,
Ranier Vilela