[PATCH] clarify palloc comment on quote_literal_cstr

Started by Steve Chavez9 months ago9 messages
#1Steve Chavez
steve@supabase.io
1 attachment(s)

Hello hackers,

I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
added a comment clarifying what they mean. The change is small:

        /* We make a worst-case result area; wasting a little space is OK */
-       result = palloc(len * 2 + 3 + 1);
+       result = palloc(
+               (len * 2)    /* worst-case doubling for every character if
each one is a quote */
+               + 3          /* two outer quotes + possibly 'E' if needed */
+               + 1          /* null terminator */
+       );

Best regards,
Steve

Attachments:

0001-clarify-palloc-comment-on-quote_literal_cstr.patchtext/x-patch; charset=US-ASCII; name=0001-clarify-palloc-comment-on-quote_literal_cstr.patchDownload
From 1eb0a72ea819ee9217d84abb8112e72bf0df61d8 Mon Sep 17 00:00:00 2001
From: steve-chavez <stevechavezast@gmail.com>
Date: Sun, 6 Apr 2025 12:33:55 -0500
Subject: [PATCH] clarify palloc comment on quote_literal_cstr

---
 src/backend/utils/adt/quote.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/quote.c b/src/backend/utils/adt/quote.c
index 677efb93e8d..4b9c52e6d9e 100644
--- a/src/backend/utils/adt/quote.c
+++ b/src/backend/utils/adt/quote.c
@@ -108,7 +108,11 @@ quote_literal_cstr(const char *rawstr)
 
 	len = strlen(rawstr);
 	/* We make a worst-case result area; wasting a little space is OK */
-	result = palloc(len * 2 + 3 + 1);
+	result = palloc(
+		(len * 2)    /* worst-case doubling for every character if each one is a quote */
+		+ 3          /* two outer quotes + possibly 'E' if needed */
+		+ 1          /* null terminator */
+	);
 
 	newlen = quote_literal_internal(result, rawstr, len);
 	result[newlen] = '\0';
-- 
2.40.1

#2Michael Paquier
michael@paquier.xyz
In reply to: Steve Chavez (#1)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:

I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
added a comment clarifying what they mean. The change is small:

/* We make a worst-case result area; wasting a little space is OK */
-       result = palloc(len * 2 + 3 + 1);
+       result = palloc(
+               (len * 2)    /* worst-case doubling for every character if
each one is a quote */
+               + 3          /* two outer quotes + possibly 'E' if needed */
+               + 1          /* null terminator */
+       );

Agreed that this is a good idea to have a better documentation here if
someone decides to tweak this area in the future, rather than have
them guess the numbers.

Looking at what quote_literal_internal() does, it seems to me that you
are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
extra backslashes added to the destination buffer.
--
Michael

#3Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Michael Paquier (#2)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz> wrote:

On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:

I found the numbers in `quote_literal_cstr` palloc quite magical. So I've
added a comment clarifying what they mean. The change is small:

/* We make a worst-case result area; wasting a little space is OK */
-       result = palloc(len * 2 + 3 + 1);
+       result = palloc(
+               (len * 2)    /* worst-case doubling for every character if
each one is a quote */
+               + 3          /* two outer quotes + possibly 'E' if needed */
+               + 1          /* null terminator */
+       );

Agreed that this is a good idea to have a better documentation here if
someone decides to tweak this area in the future, rather than have
them guess the numbers.

Looking at what quote_literal_internal() does, it seems to me that you
are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
extra backslashes added to the destination buffer.

This is an improvement in readability. However, I have not seen this
style of commenting arguments of a function. If there's a precedence,
please let me know. Usually we explain it in a comment before the
call. For example, something like
/* Allocate memory for worst case result factoring in a. double the
length number of bytes, in case every character is a quote, b. two
bytes for two outer quotes c. extra byte for E' d. one byte for null
terminator. */

--
Best Wishes,
Ashutosh Bapat

#4Steve Chavez
steve@supabase.io
In reply to: Ashutosh Bapat (#3)
1 attachment(s)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

I haven't found a similar style of comment on any other function call.

I've attached a new patch using the style you suggest.

That being said, I do find the first form much more readable, but I
understand this is a debatable subject.

Best regards,
Steve Chavez

On Mon, 7 Apr 2025 at 06:33, Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
wrote:

Show quoted text

On Mon, Apr 7, 2025 at 3:19 AM Michael Paquier <michael@paquier.xyz>
wrote:

On Sun, Apr 06, 2025 at 12:37:24PM -0500, Steve Chavez wrote:

I found the numbers in `quote_literal_cstr` palloc quite magical. So

I've

added a comment clarifying what they mean. The change is small:

/* We make a worst-case result area; wasting a little space is

OK */

-       result = palloc(len * 2 + 3 + 1);
+       result = palloc(
+               (len * 2)    /* worst-case doubling for every

character if

each one is a quote */
+ + 3 /* two outer quotes + possibly 'E' if

needed */

+               + 1          /* null terminator */
+       );

Agreed that this is a good idea to have a better documentation here if
someone decides to tweak this area in the future, rather than have
them guess the numbers.

Looking at what quote_literal_internal() does, it seems to me that you
are right based on ESCAPE_STRING_SYNTAX, SQL_STR_DOUBLE(), and the
extra backslashes added to the destination buffer.

This is an improvement in readability. However, I have not seen this
style of commenting arguments of a function. If there's a precedence,
please let me know. Usually we explain it in a comment before the
call. For example, something like
/* Allocate memory for worst case result factoring in a. double the
length number of bytes, in case every character is a quote, b. two
bytes for two outer quotes c. extra byte for E' d. one byte for null
terminator. */

--
Best Wishes,
Ashutosh Bapat

Attachments:

0001-clarify-palloc-comment-on-quote_literal_cstr.patchtext/x-patch; charset=US-ASCII; name=0001-clarify-palloc-comment-on-quote_literal_cstr.patchDownload
From 99f64dfc687273ca1ef90199b8c6a9addb6578a9 Mon Sep 17 00:00:00 2001
From: steve-chavez <stevechavezast@gmail.com>
Date: Mon, 7 Apr 2025 13:34:55 -0500
Subject: [PATCH] clarify palloc comment on quote_literal_cstr

---
 src/backend/utils/adt/quote.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/quote.c b/src/backend/utils/adt/quote.c
index 677efb93e8d..60bda20606d 100644
--- a/src/backend/utils/adt/quote.c
+++ b/src/backend/utils/adt/quote.c
@@ -107,7 +107,10 @@ quote_literal_cstr(const char *rawstr)
 	int			newlen;
 
 	len = strlen(rawstr);
-	/* We make a worst-case result area; wasting a little space is OK */
+	/* Allocate memory for worst case result factoring in a. double the
+	length number of bytes, in case every character is a quote, b. two
+	bytes for two outer quotes c. extra byte for E' d. one byte for null
+	terminator. */
 	result = palloc(len * 2 + 3 + 1);
 
 	newlen = quote_literal_internal(result, rawstr, len);
-- 
2.40.1

#5Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Steve Chavez (#4)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Tue, Apr 8, 2025 at 12:08 AM Steve Chavez <steve@supabase.io> wrote:

I haven't found a similar style of comment on any other function call.

I've attached a new patch using the style you suggest.

That being said, I do find the first form much more readable, but I understand this is a debatable subject.

Thanks for addressing the comment.

In PG code, we start a multiline comment with just /* on the first
line and end with just */ on the last line. All the lines in-between
start with * . Please check other comments in the file.

I would write a., b. c. d. on separate lines.

--
Best Wishes,
Ashutosh Bapat

#6Michael Paquier
michael@paquier.xyz
In reply to: Ashutosh Bapat (#5)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote:

Thanks for addressing the comment.

In PG code, we start a multiline comment with just /* on the first
line and end with just */ on the last line. All the lines in-between
start with * . Please check other comments in the file.

I would write a., b. c. d. on separate lines.

Hmm. I don't think that grouping all these details in the same
comment block is an improvement in this case compared to the first
version, where it is clear which part of the calculation applies to
what. In short, I'm OK with how things are on HEAD.
--
Michael

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#6)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

Michael Paquier <michael@paquier.xyz> writes:

Hmm. I don't think that grouping all these details in the same
comment block is an improvement in this case compared to the first
version, where it is clear which part of the calculation applies to
what. In short, I'm OK with how things are on HEAD.

+1. When I saw the patch I was mainly afraid that pgindent would
make a hash of it. But it didn't, so I'm cool with it as-is.

regards, tom lane

#8Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#7)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Tue, Apr 08, 2025 at 01:26:59AM -0400, Tom Lane wrote:

+1. When I saw the patch I was mainly afraid that pgindent would
make a hash of it.

Yes. I was actually surprised to see that it did not mess up with the
code generated.
--
Michael

#9Ashutosh Bapat
ashutosh.bapat.oss@gmail.com
In reply to: Michael Paquier (#6)
Re: [PATCH] clarify palloc comment on quote_literal_cstr

On Tue, Apr 8, 2025 at 10:49 AM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Apr 08, 2025 at 08:47:53AM +0530, Ashutosh Bapat wrote:

Thanks for addressing the comment.

In PG code, we start a multiline comment with just /* on the first
line and end with just */ on the last line. All the lines in-between
start with * . Please check other comments in the file.

I would write a., b. c. d. on separate lines.

Hmm. I don't think that grouping all these details in the same
comment block is an improvement in this case compared to the first
version, where it is clear which part of the calculation applies to
what. In short, I'm OK with how things are on HEAD.

Ah! I didn't realize it was committed already.

--
Best Wishes,
Ashutosh Bapat