[PATCH] Use strchr() to search for a single character
Code in src/port/pgmkdirp.c uses strstr() to find a single character in
a string, but strstr() seems to be too generic for this job. Another
function, strchr(), might be better suited for this purpose, because it
is optimized to search for exactly one character in a string. In
addition, if strchr() is used, the compiler doesn't have to generate a
terminating \0 byte for the substring, producing slightly smaller code.
I'm attaching the patch.
Regards,
Dmitry
Attachments:
0001-Use-strchr-to-search-for-a-single-character.patchtext/x-diff; name=0001-Use-strchr-to-search-for-a-single-character.patchDownload
From: Dmitry Mityugov <d.mityugov@postgrespro.ru>
Date: Mon, 21 Jul 2025 00:50:35 +0300
Subject: [PATCH] Use strchr() to search for a single character
Code in src/port/pgmkdirp.c uses strstr() to find a single character in a
string, but strstr() seems to be too generic for this job. Another function,
strchr(), might be better suited for this purpose, because it is optimized to
search for exactly one character in a string. In addition, if strchr() is used,
the compiler doesn't have to generate a terminating \0 byte for the substring,
producing slightly smaller code.
---
src/port/pgmkdirp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/port/pgmkdirp.c b/src/port/pgmkdirp.c
index d943559760d..7d7cea4dd0e 100644
--- a/src/port/pgmkdirp.c
+++ b/src/port/pgmkdirp.c
@@ -73,7 +73,7 @@ pg_mkdir_p(char *path, int omode)
if (p[0] == '/' && p[1] == '/')
{
/* network drive */
- p = strstr(p + 2, "/");
+ p = strchr(p + 2, '/');
if (p == NULL)
{
errno = EINVAL;
--
2.50.1
On Sun, Jul 20, 2025 at 6:21 PM Dmitry Mityugov <d.mityugov@postgrespro.ru>
wrote:
Code in src/port/pgmkdirp.c uses strstr() to find a single character in
a string, but strstr() seems to be too generic for this job. Another
function, strchr(), might be better suited for this purpose, because it
is optimized to search for exactly one character in a string. In
addition, if strchr() is used, the compiler doesn't have to generate a
terminating \0 byte for the substring, producing slightly smaller code.
I'm attaching the patch.Regards,
Dmitry
Seems like a simple-enough change, not a huge win but probably worth doing.
Using ripgrep to search for 'strstr(.*".")' turns up two similar situations
in contrib/fuzzystrmatch/dmetaphone.c, so perhaps we include those.
There's also a match in src/bin/pg_rewind/filemap.c, but that one is a
false positive.
Corey Huinker писал(а) 2025-07-22 22:42:
On Sun, Jul 20, 2025 at 6:21 PM Dmitry Mityugov
<d.mityugov@postgrespro.ru> wrote:Code in src/port/pgmkdirp.c uses strstr() to find a single character
in
a string, but strstr() seems to be too generic for this job. Anotherfunction, strchr(), might be better suited for this purpose, because
it
is optimized to search for exactly one character in a string. In
addition, if strchr() is used, the compiler doesn't have to generate
a
terminating \0 byte for the substring, producing slightly smaller
code.
I'm attaching the patch.Regards,
DmitrySeems like a simple-enough change, not a huge win but probably worth
doing.Using ripgrep to search for 'strstr(.*".")' turns up two similar
situations in contrib/fuzzystrmatch/dmetaphone.c, so perhaps we
include those.There's also a match in src/bin/pg_rewind/filemap.c, but that one is a
false positive.
Thank you for your attention to this problem. The code in
contrib/fuzzystrmatch/dmetaphone.c indeed uses several calls to strstr()
to search for a single character, but it also uses strstr() to search
for strings that consist of more than a single character on adjacent
lines, and replacing half of those strstr()s with strchr()s would make
the code less consistent in my opinion.
What's more important, it seems that this code in
contrib/fuzzystrmatch/dmetaphone.c contains a bug. Statement `else if
(strstr(s->str, "WITZ"))` at line 317 will never be executed, because if
the string contains substring “W”, it will be handled at line 311, `if
(strstr(s->str, "W"))`. Probably this bug should be fixed in a separate
commit.
On Wed, 23 Jul 2025 at 09:34, Dmitry Mityugov <d.mityugov@postgrespro.ru> wrote:
Thank you for your attention to this problem. The code in
contrib/fuzzystrmatch/dmetaphone.c indeed uses several calls to strstr()
to search for a single character, but it also uses strstr() to search
for strings that consist of more than a single character on adjacent
lines, and replacing half of those strstr()s with strchr()s would make
the code less consistent in my opinion.
That depends on what you're making consistent. If the consistency is
that we always use strchr() when the search is for a single char, then
it's not consistent to ignore that one.
Looking at [1]https://godbolt.org/z/q1xcKdzd7, it seems even ancient versions of gcc and clang
rewrite the strstr() into a strchr() call when the search term is a
single char string. So it might not be worth doing to any trouble
here.
[1]: https://godbolt.org/z/q1xcKdzd7
David
David Rowley <dgrowleyml@gmail.com> writes:
Looking at [1], it seems even ancient versions of gcc and clang
rewrite the strstr() into a strchr() call when the search term is a
single char string. So it might not be worth doing to any trouble
here.
I was wondering if that might be true. However, your godbolt results
show that MSVC doesn't do this optimization, and the usage in
pgmkdirp.c is inside "#ifdef WIN32", so maybe it's worth fixing there.
I can't get excited about dmetaphone --- that's about as old and
crufty as anything in our tree. If we were going to touch it the
first thing I'd want to do is get rid of its non-multibyte-safe
upcasing logic (MakeUpper). The "WITZ" business is kind of
silly-looking, but the question it brings to my mind is whether
the algorithm was mistranscribed; so rather than just delete that
line we should do some research. If we care, that is.
regards, tom lane
On Wed, 23 Jul 2025 at 10:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
Looking at [1], it seems even ancient versions of gcc and clang
rewrite the strstr() into a strchr() call when the search term is a
single char string. So it might not be worth doing to any trouble
here.I was wondering if that might be true. However, your godbolt results
show that MSVC doesn't do this optimization, and the usage in
pgmkdirp.c is inside "#ifdef WIN32", so maybe it's worth fixing there.
Yeah, I noticed MSVC not doing the rewrite. I didn't notice the
mentioned use case was within an #ifdef WIN32.
I'm currently thinking we should just fix the pgmkdirp.c instance and
call it good.
David
David Rowley <dgrowleyml@gmail.com> writes:
I'm currently thinking we should just fix the pgmkdirp.c instance and
call it good.
+1
regards, tom lane