pg17.3 PQescapeIdentifier() ignores len

Started by Justin Pryzby11 months ago13 messages
#1Justin Pryzby
pryzby@telsasoft.com

I found errors in our sql log after upgrading to 17.3.

error_severity | ERROR
message | schema "rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" does not exist
query | copy "rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"."44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" from stdin

The copy command is from pygresql's inserttable(), which does:

do {
t = strchr(s, '.');
if (!t)
t = s + strlen(s);
table = PQescapeIdentifier(self->cnx, s, (size_t)(t - s));
fprintf(stderr, "table %s len %ld => %s\n", s, t-s, table);
if (bufpt < bufmax)
bufpt += snprintf(bufpt, (size_t)(bufmax - bufpt), "%s", table);
PQfreemem(table);
s = t;
if (*s && bufpt < bufmax)
*bufpt++ = *s++;
} while (*s);

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

python3 -c "import pg; db=pg.DB('postgres'); db.inserttable('child.a000000000000', [1])")
table child.a000000000000 len 5 => "child.a000000000000"
table a000000000000 len 13 => "a000000000000"

--
Justin

#2Ranier Vilela
ranier.vf@gmail.com
In reply to: Justin Pryzby (#1)
Re: pg17.3 PQescapeIdentifier() ignores len

Em qui., 13 de fev. de 2025 às 13:51, Justin Pryzby <pryzby@telsasoft.com>
escreveu:

I found errors in our sql log after upgrading to 17.3.

error_severity | ERROR
message | schema
"rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214" does not exist
query | copy
"rptcache.44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"."44e3955c33bb79f55750897da0c5ab1fa2004af1_20250214"
from stdin

The copy command is from pygresql's inserttable(), which does:

do {
t = strchr(s, '.');
if (!t)
t = s + strlen(s);
table = PQescapeIdentifier(self->cnx, s, (size_t)(t - s));
fprintf(stderr, "table %s len %ld => %s\n", s, t-s, table);
if (bufpt < bufmax)
bufpt += snprintf(bufpt, (size_t)(bufmax - bufpt), "%s",
table);
PQfreemem(table);
s = t;
if (*s && bufpt < bufmax)
*bufpt++ = *s++;
} while (*s);

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its
len.

Interesting, Coverity has some new reports regarding PQescapeIdentifier.

CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
2. alloc_strlen: Allocating insufficient memory for the terminating null of
the string. [Note: The source code implementation of the function has been
overridden by a builtin model.]

Until now, I was in disbelief.

best regards,
Ranier Vilela

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Justin Pryzby (#1)
1 attachment(s)
Re: pg17.3 PQescapeIdentifier() ignores len

Justin Pryzby <pryzby@telsasoft.com> writes:

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes. Need something like the attached.

FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
pressure. I wonder if they have any other issues. More eyes on those
patches would be welcome, now that they are public.

regards, tom lane

Attachments:

fix-PQescapeInternal-ignores-len-parameter.patchtext/x-diff; charset=us-ascii; name=fix-PQescapeInternal-ignores-len-parameter.patchDownload
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index e97ad02542..120d4d032e 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -4224,7 +4224,7 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
 	char	   *rp;
 	int			num_quotes = 0; /* single or double, depending on as_ident */
 	int			num_backslashes = 0;
-	size_t		input_len = strlen(str);
+	size_t		input_len = strnlen(str, len);
 	size_t		result_size;
 	char		quote_char = as_ident ? '"' : '\'';
 	bool		validated_mb = false;
@@ -4274,7 +4274,7 @@ PQescapeInternal(PGconn *conn, const char *str, size_t len, bool as_ident)
 			if (!validated_mb)
 			{
 				if (pg_encoding_verifymbstr(conn->client_encoding, s, remaining)
-					!= strlen(s))
+					!= remaining)
 				{
 					libpq_append_conn_error(conn, "invalid multibyte character");
 					return NULL;
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ranier Vilela (#2)
Re: pg17.3 PQescapeIdentifier() ignores len

Ranier Vilela <ranier.vf@gmail.com> writes:

Interesting, Coverity has some new reports regarding PQescapeIdentifier.

CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
2. alloc_strlen: Allocating insufficient memory for the terminating null of
the string. [Note: The source code implementation of the function has been
overridden by a builtin model.]

That's not new, we've been seeing those for awhile. I've been
ignoring them on the grounds that (a) if the code actually had such a
problem, valgrind testing would have found it, and (b) the message is
saying in so many words that they're ignoring our code in favor of
somebody's apparently-inaccurate model of said code.

regards, tom lane

#5Ranier Vilela
ranier.vf@gmail.com
In reply to: Tom Lane (#4)
Re: pg17.3 PQescapeIdentifier() ignores len

Em qui., 13 de fev. de 2025 às 16:05, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

Interesting, Coverity has some new reports regarding PQescapeIdentifier.

CID 1591290: (#1 of 1): Out-of-bounds access (OVERRUN)
2. alloc_strlen: Allocating insufficient memory for the terminating null

of

the string. [Note: The source code implementation of the function has

been

overridden by a builtin model.]

That's not new, we've been seeing those for awhile. I've been
ignoring them on the grounds that (a) if the code actually had such a
problem, valgrind testing would have found it, and (b) the message is
saying in so many words that they're ignoring our code in favor of
somebody's apparently-inaccurate model of said code.

Thanks Tom, extra care is needed when analyzing these reports.

best regards,
Ranier Vilela

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#3)
Re: pg17.3 PQescapeIdentifier() ignores len

On Thu, Feb 13, 2025 at 02:00:09PM -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes. Need something like the attached.

Your patch looks right to me.

--
nathan

#7Ranier Vilela
ranier.vf@gmail.com
In reply to: Tom Lane (#3)
Re: pg17.3 PQescapeIdentifier() ignores len

Em qui., 13 de fev. de 2025 às 16:00, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Justin Pryzby <pryzby@telsasoft.com> writes:

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its

len.

Ugh, yes. Need something like the attached.

FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
pressure. I wonder if they have any other issues. More eyes on those
patches would be welcome, now that they are public.

Passes on standard tests at Windows 64 bits, msvc 2022 64 bits.

best regards,
Ranier Vilela

#8Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#3)
Re: pg17.3 PQescapeIdentifier() ignores len

Hi,

On 2025-02-13 14:00:09 -0500, Tom Lane wrote:

Justin Pryzby <pryzby@telsasoft.com> writes:

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes. Need something like the attached.

I just pushed this fix, together with an expansion of test_escape.c. With the
expanded test both uses of strlen() are detected.

FTR, 5dc1e42b4 et al were quite subtle patches done under extreme time
pressure. I wonder if they have any other issues. More eyes on those
patches would be welcome, now that they are public.

Indeed.

Greetings,

Andres Freund

#9Christoph Berg
myon@debian.org
In reply to: Andres Freund (#8)
Re: pg17.3 PQescapeIdentifier() ignores len

Re: Andres Freund

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes. Need something like the attached.

I just pushed this fix, together with an expansion of test_escape.c. With the
expanded test both uses of strlen() are detected.

FTR, this is also caught by pygresql's regression tests:

test_inserttable_with_dotted_table_name (tests.test_classic_connection.TestInserttable.test_inserttable_with_dotted_table_name) ... ERROR

https://ci.debian.net/packages/p/pygresql/testing/amd64/57838998/
https://qa.debian.org/excuses.php?package=postgresql-17

What's missing in the PG regression tests to see that problem?

Christoph

#10Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#9)
Re: pg17.3 PQescapeIdentifier() ignores len

Hi,

On 2025-02-15 13:33:54 +0100, Christoph Berg wrote:

Re: Andres Freund

The fprintf suggests that since 5dc1e42b4 PQescapeIdentifier ignores its len.

Ugh, yes. Need something like the attached.

I just pushed this fix, together with an expansion of test_escape.c. With the
expanded test both uses of strlen() are detected.

FTR, this is also caught by pygresql's regression tests:

test_inserttable_with_dotted_table_name (tests.test_classic_connection.TestInserttable.test_inserttable_with_dotted_table_name) ... ERROR

https://ci.debian.net/packages/p/pygresql/testing/amd64/57838998/
https://qa.debian.org/excuses.php?package=postgresql-17

What's missing in the PG regression tests to see that problem?

Well, the expanded tests added as part of the fix would catch it, but I agree,
it's a problem this wasn't caught beforehand.

I don't think that common uses of PQescapeIdentifier/Literal are likely to
catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
I guess, shows that we really need more explicit edge-case coverage of at
least the most crucial APIs (we barely have any). There's pretty much no way
that pg_regress or TAP test style tests are going to catch a problem like
this.

Greetings,

Andres Freund

#11Christoph Berg
myon@debian.org
In reply to: Andres Freund (#10)
Re: pg17.3 PQescapeIdentifier() ignores len

Re: Andres Freund

What's missing in the PG regression tests to see that problem?

Well, the expanded tests added as part of the fix would catch it, but I agree,
it's a problem this wasn't caught beforehand.

Oh sorry, I was actually skimming the git log to see if there is a
test, but then failed to realize there is one. Thanks!

I don't think that common uses of PQescapeIdentifier/Literal are likely to
catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
I guess, shows that we really need more explicit edge-case coverage of at
least the most crucial APIs (we barely have any). There's pretty much no way
that pg_regress or TAP test style tests are going to catch a problem like
this.

What I can do is to trigger regression tests on all packages on
apt.postgresql.org after the minor releases have been built and then
raise any flags before the release goes out.

Except that pygresql isn't yet a package on apt.pg.o... will fix that
now. This time, the problem was caught by Debian's CI machinery.

Christoph

#12Andres Freund
andres@anarazel.de
In reply to: Christoph Berg (#11)
Re: pg17.3 PQescapeIdentifier() ignores len

Hi,

On 2025-02-15 17:55:12 +0100, Christoph Berg wrote:

Re: Andres Freund

I don't think that common uses of PQescapeIdentifier/Literal are likely to
catch the problem, so it's perhaps not too surprising it wasn't caught. Which,
I guess, shows that we really need more explicit edge-case coverage of at
least the most crucial APIs (we barely have any). There's pretty much no way
that pg_regress or TAP test style tests are going to catch a problem like
this.

What I can do is to trigger regression tests on all packages on
apt.postgresql.org after the minor releases have been built and then
raise any flags before the release goes out.

I think that'd be *really* helpful. Of course that does require somebody
watching and raising an alarm...

Do you have ongoing package builds for sid or such?

Except that pygresql isn't yet a package on apt.pg.o... will fix that
now. This time, the problem was caught by Debian's CI machinery.

Are there regular outomated rebuilds that could tell us of such a problem
between the release being stamped and actually made?

Greetings,

Andres Freund

#13Christoph Berg
myon@debian.org
In reply to: Andres Freund (#12)
Re: pg17.3 PQescapeIdentifier() ignores len

Re: Andres Freund

I think that'd be *really* helpful. Of course that does require somebody
watching and raising an alarm...

Do you have ongoing package builds for sid or such?

What I am doing anyway is to trigger the regression test of each
package once a month (randomly distributed over the month so I'm not
getting swamped with failures) and then try to keep this page as
"green" as possible:

https://jengus.postgresql.org/view/Testsuite/

Are there regular outomated rebuilds that could tell us of such a problem
between the release being stamped and actually made?

The current proposal would be to simply build the minor releases from
the Monday tarballs, do the extended testing on Tuesday/Wednesday by
triggering all tests from that Jenkins page, and hopefully complain
before Thursday.

There are daily builds of all PG server branches, but
1) there are too many temporary failures so I don't have enough time to chase them
2) these packages are not used as basis for regression testing the other packages

https://jengus.postgresql.org/view/Snapshot/

1) might improve once the networking (hello arm64) and IO (hello
s390x) issues of some of the build machines are fixed.

2) might be something we should give a try. Either it works, or it
turns out there are too many weird minor-version-skew problems that
aren't actual bugs.

It would still not catch any last-minute regression from security
patches, though.

Christoph