[PATCH] Add hints for invalid binary encoding names in encode/decode functions
Hi,
When users pass an invalid encoding name to the built-in functions
`encode()` or `decode()`, PostgreSQL currently reports only:
ERROR: unrecognized encoding: "xxx"
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.
Example:
SELECT encode('\x01'::bytea, 'invalid');
ERROR: unrecognized encoding: "invalid"
HINT: Valid binary encodings are: "hex", "base64", "base64url",
"escape".
This change applies to both `binary_encode()` and `binary_decode()` in
`encode.c`,
and adds regression tests under `src/test/regress/sql/strings.sql`.
Although this is a small change, let me share a few considerations behind
it:
- I extracted the valid encodings from the hint messages and used a format
specifier like
`Valid binary encodings are: %s`, so that we avoid scattering those fixed
strings
across translation files.
- I briefly considered adding a small helper function to build the hint
string,
but decided against it since keeping the codebase simple felt preferable,
and
new binary encodings are not added very often.
I’d be happy to hear any opinions or suggestions.
Patch attached.
Best regards,
Shinya Sugamoto
Attachments:
0001-Added-error-hints-for-invalid-binary-encoding-names-.patchapplication/octet-stream; name=0001-Added-error-hints-for-invalid-binary-encoding-names-.patchDownload
From 18963479937bee58570a3a20329e5f7f573ffd84 Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Fri, 7 Nov 2025 23:04:03 +0900
Subject: [PATCH] Added error hints for invalid binary encoding names so users
immediately see the supported options
---
src/backend/utils/adt/encode.c | 8 ++++++--
src/test/regress/expected/strings.out | 7 +++++++
src/test/regress/sql/strings.sql | 4 ++++
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index aabe9913eee..7f15705e69e 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -64,7 +64,9 @@ binary_encode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid binary encodings are: %s",
+ "\"hex\", \"base64\", \"base64url\", \"escape\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
@@ -112,7 +114,9 @@ binary_decode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid binary encodings are: %s",
+ "\"hex\", \"base64\", \"base64url\", \"escape\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b9dc08d5f61..c5f04aa0440 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2575,6 +2575,13 @@ SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
\x1234567890abcdef00
(1 row)
+-- invalid encoding names should report the valid options
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid binary encodings are: "hex", "base64", "base64url", "escape".
+SELECT decode('00', 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid binary encodings are: "hex", "base64", "base64url", "escape".
--
-- base64url encoding/decoding
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index a2a91523404..e70bc37f00e 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -815,6 +815,10 @@ SELECT decode(encode(('\x' || repeat('1234567890abcdef0001', 7))::bytea,
SELECT encode('\x1234567890abcdef00', 'escape');
SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
+-- invalid encoding names should report the valid options
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+SELECT decode('00', 'invalid'); -- error
+
--
-- base64url encoding/decoding
--
--
2.50.1 (Apple Git-155)
On Sat, Nov 8, 2025 at 3:26 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
Hi,
When users pass an invalid encoding name to the built-in functions
`encode()` or `decode()`, PostgreSQL currently reports only:ERROR: unrecognized encoding: "xxx"
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.Example:
SELECT encode('\x01'::bytea, 'invalid');
ERROR: unrecognized encoding: "invalid"
HINT: Valid binary encodings are: "hex", "base64", "base64url", "escape".This change applies to both `binary_encode()` and `binary_decode()` in `encode.c`,
and adds regression tests under `src/test/regress/sql/strings.sql`.
+1
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid binary encodings are: %s",
+ "\"hex\", \"base64\", \"base64url\", \"escape\".")));
Since neither encode.c nor the documentation for encode() use
the term "binary encoding", I think just "encodings" is sufficient
in the hint message.
Also, the colon after "encodings”"doesn't seem necessary.
+-- invalid encoding names should report the valid options
+SELECT encode('\x01'::bytea, 'invalid'); -- error
I'd like to improve the comment like:
"report an error with a hint listing valid encodings when an invalid
encoding is specified".
Although this is a small change, let me share a few considerations behind it:
- I extracted the valid encodings from the hint messages and used a format specifier like
`Valid binary encodings are: %s`, so that we avoid scattering those fixed strings
across translation files.
I understand your point. But since new encodings are rarely added as you told,
I'm fine to list them directly in the hint instead of using %s,
similar to other hints like:
src/backend/commands/dbcommands.c:
errhint("Valid strategies are \"wal_log\" and \"file_copy\".")));
Regards,
--
Fujii Masao
On Nov 8, 2025, at 14:25, Sugamoto Shinya <shinya34892@gmail.com> wrote:
Hi,
When users pass an invalid encoding name to the built-in functions
`encode()` or `decode()`, PostgreSQL currently reports only:ERROR: unrecognized encoding: "xxx"
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.Example:
SELECT encode('\x01'::bytea, 'invalid');
ERROR: unrecognized encoding: "invalid"
HINT: Valid binary encodings are: "hex", "base64", "base64url", "escape".This change applies to both `binary_encode()` and `binary_decode()` in `encode.c`,
and adds regression tests under `src/test/regress/sql/strings.sql`.Although this is a small change, let me share a few considerations behind it:
- I extracted the valid encodings from the hint messages and used a format specifier like
`Valid binary encodings are: %s`, so that we avoid scattering those fixed strings
across translation files.
- I briefly considered adding a small helper function to build the hint string,
but decided against it since keeping the codebase simple felt preferable, and
new binary encodings are not added very often.I’d be happy to hear any opinions or suggestions.
Patch attached.
Best regards,
Shinya Sugamoto
<0001-Added-error-hints-for-invalid-binary-encoding-names-.patch>
1.
```
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid binary encodings are: %s",
+ "\"hex\", \"base64\", \"base64url\", \"escape\".")));
```
I think hardcoding the encoding list is fragile. AFAIK, “base64url” was newly added just a couple of months ago.
The list is defined in encode.c (search for enclist in the file), I guess we can add a function to return a string with a encoding names.
2
```
+ errhint("Valid binary encodings are: %s",
+ "\"hex\", \"base64\", \"base64url\", \"escape\".")));
```
Looks like putting punctuation inside %s is not normal. By looking at other hint messages, they usually do something like:
```
errhint("Perhaps you meant the option \"%s\".”, value)
```
But I think if you address comment 1, then this one will be addressed as well.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On 10 Nov 2025, at 10:06, Chao Li <li.evan.chao@gmail.com> wrote:
On Nov 8, 2025, at 14:25, Sugamoto Shinya <shinya34892@gmail.com> wrote:
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.
+1.
I think hardcoding the encoding list is fragile. AFAIK, “base64url” was newly added just a couple of months ago.
The list is defined in encode.c (search for enclist in the file), I guess we can add a function to return a string with a encoding names.
New encodings are added very infrequently, we can revisit this if that changes
but till then I think the simplicity of a hardcoded string is preferrable.
--
Daniel Gustafsson
On Nov 10, 2025, at 22:43, Daniel Gustafsson <daniel@yesql.se> wrote:
On 10 Nov 2025, at 10:06, Chao Li <li.evan.chao@gmail.com> wrote:
On Nov 8, 2025, at 14:25, Sugamoto Shinya <shinya34892@gmail.com> wrote:
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.+1.
I think hardcoding the encoding list is fragile. AFAIK, “base64url” was newly added just a couple of months ago.
The list is defined in encode.c (search for enclist in the file), I guess we can add a function to return a string with a encoding names.
New encodings are added very infrequently, we can revisit this if that changes
but till then I think the simplicity of a hardcoded string is preferrable.
I’m not convinced that a hardcoded string is actually better. In fact, the less often something changes, the easier it is to miss updating it when we need to.
If we add a helper function to build the list of supported encodings:
* The function is small and straightforward — trivial to implement.
* The function doesn’t run in any performance-critical paths, so no meaningful cost there.
* The encoding names are plain ASCII identifiers and effectively the same across languages, so there’s no real benefit to hardcoding them for i18n purposes either.
So I still think the helper function seems cleaner and less error-prone than a hardcoded string.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
Thanks everyone for checking my proposal.
Other than how to make the hint string, every comment looks good to me.
I'll take those into my patch. I appreciate it.
Given the comments you all made, I looked into the source code to see how
we build `errhint` with values, and found the following.
// contrib/pg_prewarm/pg_prewarm.c:102
errhint("Valid prewarm types are \"prefetch\", \"read\", and \"buffer\".")
// contrib/amcheck/verify_heapam.c:303
errhint("Valid skip options are \"all-visible\", \"all-frozen\", and
\"none\".")
// src/backend/replication/logical/launcher.c:458
errhint("You might need to increase \"%s\".",
"max_logical_replication_workers")
// src/backend/commands/opclasscmds.c:1240
errhint("Valid signature of operator class options parsing function is
%s.",
"(internal) RETURNS void")));
// src/backend/commands/dbcommands.c:1044
errhint("Valid strategies are \"wal_log\" and \"file_copy\".")));
Regarding the following comment,
In fact, the less often something changes, the easier it is to miss
updating it when we need to.
If we add a helper function to build the list of supported encodings:
* The function is small and straightforward — trivial to implement.
* The function doesn’t run in any performance-critical paths, so no
meaningful cost there.
* The encoding names are plain ASCII identifiers and effectively the same
across languages, so there’s no real benefit to hardcoding them for i18n
purposes either.
While I partly agree with the idea of adding a helper function, introducing
one just for constructing a simple hint string would lead to more code of
the kind that I would prefer to avoid, since this is not our main concern.
So, although it is true that having such a helper function would improve
maintainability in some sense,
considering that:
- there are already several existing cases where values are simply embedded
into hint texts (as Fujii pointed out), and
- this approach does not violate our coding rules or make the code harder
to understand,
I tend to think that it’s acceptable to hard-code the list of possible
encodings in the hint this time.
Thoughts?
Regards,
On Tue, Nov 11, 2025 at 3:14 PM Chao Li <li.evan.chao@gmail.com> wrote:
Show quoted text
On Nov 10, 2025, at 22:43, Daniel Gustafsson <daniel@yesql.se> wrote:
On 10 Nov 2025, at 10:06, Chao Li <li.evan.chao@gmail.com> wrote:
On Nov 8, 2025, at 14:25, Sugamoto Shinya <shinya34892@gmail.com>
wrote:
This patch adds an error hint listing the valid encoding names,
so users can immediately see the supported options.+1.
I think hardcoding the encoding list is fragile. AFAIK, “base64url” was
newly added just a couple of months ago.
The list is defined in encode.c (search for enclist in the file), I
guess we can add a function to return a string with a encoding names.
New encodings are added very infrequently, we can revisit this if that
changes
but till then I think the simplicity of a hardcoded string is
preferrable.
I’m not convinced that a hardcoded string is actually better. In fact, the
less often something changes, the easier it is to miss updating it when we
need to.If we add a helper function to build the list of supported encodings:
* The function is small and straightforward — trivial to implement.
* The function doesn’t run in any performance-critical paths, so no
meaningful cost there.
* The encoding names are plain ASCII identifiers and effectively the same
across languages, so there’s no real benefit to hardcoding them for i18n
purposes either.So I still think the helper function seems cleaner and less error-prone
than a hardcoded string.Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
On Tue, Nov 11, 2025 at 7:14 AM Sugamoto Shinya <shinya34892@gmail.com> wrote:
Thanks everyone for checking my proposal.
Other than how to make the hint string, every comment looks good to me. I'll take those into my patch. I appreciate it.Given the comments you all made, I looked into the source code to see how we build `errhint` with values, and found the following.
// contrib/pg_prewarm/pg_prewarm.c:102
errhint("Valid prewarm types are \"prefetch\", \"read\", and \"buffer\".")// contrib/amcheck/verify_heapam.c:303
errhint("Valid skip options are \"all-visible\", \"all-frozen\", and \"none\".")// src/backend/replication/logical/launcher.c:458
errhint("You might need to increase \"%s\".", "max_logical_replication_workers")// src/backend/commands/opclasscmds.c:1240
errhint("Valid signature of operator class options parsing function is %s.",
"(internal) RETURNS void")));// src/backend/commands/dbcommands.c:1044
errhint("Valid strategies are \"wal_log\" and \"file_copy\".")));Regarding the following comment,
In fact, the less often something changes, the easier it is to miss updating it when we need to.
If we add a helper function to build the list of supported encodings:
* The function is small and straightforward — trivial to implement.
* The function doesn’t run in any performance-critical paths, so no meaningful cost there.
* The encoding names are plain ASCII identifiers and effectively the same across languages, so there’s no real benefit to hardcoding them for i18n purposes either.While I partly agree with the idea of adding a helper function, introducing one just for constructing a simple hint string would lead to more code of the kind that I would prefer to avoid, since this is not our main concern.
So, although it is true that having such a helper function would improve maintainability in some sense,
considering that:
- there are already several existing cases where values are simply embedded into hint texts (as Fujii pointed out), and
- this approach does not violate our coding rules or make the code harder to understand,
I tend to think that it’s acceptable to hard-code the list of possible encodings in the hint this time.Thoughts?
+1 for hard-coding the supported encoding list. I think it's a good
start point. We can revisit the idea of dynamically constructing the
encoding list if we're concerned about its maintenance cost.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Wed, Nov 12, 2025 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
+1 for hard-coding the supported encoding list. I think it's a good
start point. We can revisit the idea of dynamically constructing the
encoding list if we're concerned about its maintenance cost.
+1
Regards,
--
Fujii Masao
On Wed, Nov 12, 2025 at 10:56 AM Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Nov 12, 2025 at 10:10 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:+1 for hard-coding the supported encoding list. I think it's a good
start point. We can revisit the idea of dynamically constructing the
encoding list if we're concerned about its maintenance cost.+1
Regards,
--
Fujii Masao
Thanks everyone for reviewing my proposal.
I've hard-coded the valid list of encoding names.
Also, I modified the hint string into like "Valid encodings are \"hex\",
\"base64\", \"base64url\", and \"escape\"."
by adding "and" before "escape".
I attached my v2 patch to this message. Please let me know freely if you
have any additional questions.
Regards,
Attachments:
v2-0001-Added-error-hints-for-invalid-binary-encoding-names-.patchapplication/octet-stream; name=v2-0001-Added-error-hints-for-invalid-binary-encoding-names-.patchDownload
From dbd3398b41b38dbf6b7062110bef57c2092e143e Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Fri, 7 Nov 2025 23:04:03 +0900
Subject: [PATCH] Added error hints for invalid binary encoding names so users
immediately see the supported options
---
src/backend/utils/adt/encode.c | 6 ++++--
src/test/regress/expected/strings.out | 7 +++++++
src/test/regress/sql/strings.sql | 4 ++++
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index aabe9913eee..738cf5bd577 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -64,7 +64,8 @@ binary_encode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"hex\", \"base64\", \"base64url\", and \"escape\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
@@ -112,7 +113,8 @@ binary_decode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"hex\", \"base64\", \"base64url\", and \"escape\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b9dc08d5f61..2bfbff6a5ff 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2575,6 +2575,13 @@ SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
\x1234567890abcdef00
(1 row)
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "hex", "base64", "base64url", and "escape".
+SELECT decode('00', 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "hex", "base64", "base64url", and "escape".
--
-- base64url encoding/decoding
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index a2a91523404..88aa4c2983b 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -815,6 +815,10 @@ SELECT decode(encode(('\x' || repeat('1234567890abcdef0001', 7))::bytea,
SELECT encode('\x1234567890abcdef00', 'escape');
SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+SELECT decode('00', 'invalid'); -- error
+
--
-- base64url encoding/decoding
--
--
2.50.1 (Apple Git-155)
On 10.11.25 10:06, Chao Li wrote:
The list is defined in encode.c (search for enclist in the file), I guess we can add a function to return a string with a encoding names.
2 ``` + errhint("Valid binary encodings are: %s", + "\"hex\", \"base64\", \"base64url\", \"escape\"."))); ```Looks like putting punctuation inside %s is not normal. By looking at other hint messages, they usually do something like:
```
errhint("Perhaps you meant the option \"%s\".”, value)
```
Yes, this is because the punctuation is itself subject to translation.
(Most obviously, different languages use different quotation marks.) So
the preferred style is something like
errhint("Valid encodings are: \"%s\", \"%s\", and \"%s\", "hex", ...)
or however many you need.
See also
</messages/by-id/202511070936.jj4o4ktd4b6l@alvherre.pgsql>
for a related discussion.
On Wed, Nov 12, 2025 at 10:23 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
Thanks everyone for reviewing my proposal.
I've hard-coded the valid list of encoding names.Also, I modified the hint string into like "Valid encodings are \"hex\", \"base64\", \"base64url\", and \"escape\"."
by adding "and" before "escape".I attached my v2 patch to this message. Please let me know freely if you have any additional questions.
Thanks for the updated patch! LGTM.
One minor comment: in v2 patch, it seems the encodings in the HINT message are
listed in the order they appear in the enclist struct. Wouldn't it be
clearer to list them
alphabetically, matching the order shown in the docs [1]https://www.postgresql.org/docs/devel/functions-binarystring.html#ENCODE-FORMAT-BASE64: base64,
base64url, escape, and hex?
Regards,
[1]: https://www.postgresql.org/docs/devel/functions-binarystring.html#ENCODE-FORMAT-BASE64
--
Fujii Masao
On Thu, Nov 13, 2025 at 9:58 AM Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Nov 12, 2025 at 10:23 PM Sugamoto Shinya <shinya34892@gmail.com>
wrote:Thanks everyone for reviewing my proposal.
I've hard-coded the valid list of encoding names.Also, I modified the hint string into like "Valid encodings are \"hex\",
\"base64\", \"base64url\", and \"escape\"."
by adding "and" before "escape".
I attached my v2 patch to this message. Please let me know freely if you
have any additional questions.
Thanks for the updated patch! LGTM.
One minor comment: in v2 patch, it seems the encodings in the HINT message
are
listed in the order they appear in the enclist struct. Wouldn't it be
clearer to list them
alphabetically, matching the order shown in the docs [1]: base64,
base64url, escape, and hex?Regards,
[1]
https://www.postgresql.org/docs/devel/functions-binarystring.html#ENCODE-FORMAT-BASE64--
Fujii Masao
Thanks, Fujii and Peter.
I ordered the valid encodings alphabetically and extracted those into
separate parameters.
Here is my new patch.
Regards,
Attachments:
v3-0001-Add-error-hints-for-invalid-binary-encoding-names.patchapplication/octet-stream; name=v3-0001-Add-error-hints-for-invalid-binary-encoding-names.patchDownload
From 7eaba1e603a3b6c45e4cb6f3ae25516bec8d3c7d Mon Sep 17 00:00:00 2001
From: Shinya Sugamoto <sugamoto@me.com>
Date: Mon, 17 Nov 2025 22:56:07 +0900
Subject: [PATCH] Add error hints for invalid binary encoding names
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When users provide an invalid encoding name to encode() or decode()
functions, display a helpful error hint listing the valid options:
"hex", "base64", "base64url", and "escape".
This improvement helps users immediately see the supported encoding
options without needing to consult documentation.
---
src/backend/utils/adt/encode.c | 8 ++++++--
src/test/regress/expected/strings.out | 7 +++++++
src/test/regress/sql/strings.sql | 4 ++++
3 files changed, 17 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index aabe9913eee..c813ee1258b 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -64,7 +64,9 @@ binary_encode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"%s\", \"%s\", \"%s\", and \"%s\".",
+ "base64", "base64url", "escape", "hex")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
@@ -112,7 +114,9 @@ binary_decode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"%s\", \"%s\", \"%s\", and \"%s\".",
+ "base64", "base64url", "escape", "hex")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b9dc08d5f61..727304f60e7 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2575,6 +2575,13 @@ SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
\x1234567890abcdef00
(1 row)
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "base64", "base64url", "escape", and "hex".
+SELECT decode('00', 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "base64", "base64url", "escape", and "hex".
--
-- base64url encoding/decoding
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index a2a91523404..88aa4c2983b 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -815,6 +815,10 @@ SELECT decode(encode(('\x' || repeat('1234567890abcdef0001', 7))::bytea,
SELECT encode('\x1234567890abcdef00', 'escape');
SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+SELECT decode('00', 'invalid'); -- error
+
--
-- base64url encoding/decoding
--
--
2.50.1 (Apple Git-155)
On 17 Nov 2025, at 15:08, Sugamoto Shinya <shinya34892@gmail.com> wrote:
I ordered the valid encodings alphabetically and extracted those into separate parameters.
LGTM. Per project style I don't think we should quote the names of encodings
since they are visibly not English words (see commit a243569bf65c5 which deals
with variable names, but I think it applies here as well), but no need to send
another version of the patch just for that small adjustment.
--
Daniel Gustafsson
On Mon, Nov 17, 2025 at 11:30 PM Daniel Gustafsson <daniel@yesql.se> wrote:
On 17 Nov 2025, at 15:08, Sugamoto Shinya <shinya34892@gmail.com> wrote:
I ordered the valid encodings alphabetically and extracted those into separate parameters.
LGTM. Per project style I don't think we should quote the names of encodings
since they are visibly not English words (see commit a243569bf65c5 which deals
with variable names, but I think it applies here as well)
It looks like the message-style rule added in commit a243569bf65c5
applies only to
configuration parameter names, right? And it seems that rule was later revised
by commit 17974ec2594. Is that correct?
Regards,
--
Fujii Masao
On 17 Nov 2025, at 15:59, Fujii Masao <masao.fujii@gmail.com> wrote:
On Mon, Nov 17, 2025 at 11:30 PM Daniel Gustafsson <daniel@yesql.se> wrote:
On 17 Nov 2025, at 15:08, Sugamoto Shinya <shinya34892@gmail.com> wrote:
I ordered the valid encodings alphabetically and extracted those into separate parameters.
LGTM. Per project style I don't think we should quote the names of encodings
since they are visibly not English words (see commit a243569bf65c5 which deals
with variable names, but I think it applies here as well)It looks like the message-style rule added in commit a243569bf65c5
applies only to
configuration parameter names, right? And it seems that rule was later revised
by commit 17974ec2594. Is that correct?
Oh right, I had forgotten about that, thanks for the reminder. The patch as
proposed is correct then.
--
Daniel Gustafsson
On Mon, Nov 17, 2025 at 04:04:17PM +0100, Daniel Gustafsson wrote:
Oh right, I had forgotten about that, thanks for the reminder. The patch as
proposed is correct then.
The patch is probably fine as-is, but this seems like a candidate for the
ClosestMatch stuff added by commit 5ac51c8.
--
nathan
On Mon, Nov 17, 2025 at 8:44 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Nov 17, 2025 at 04:04:17PM +0100, Daniel Gustafsson wrote:
Oh right, I had forgotten about that, thanks for the reminder. The patch as
proposed is correct then.The patch is probably fine as-is, but this seems like a candidate for the
ClosestMatch stuff added by commit 5ac51c8.
+1
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
On Tue, Nov 18, 2025 at 5:34 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
On Mon, Nov 17, 2025 at 8:44 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Mon, Nov 17, 2025 at 04:04:17PM +0100, Daniel Gustafsson wrote:
Oh right, I had forgotten about that, thanks for the reminder. The patch as
proposed is correct then.The patch is probably fine as-is, but this seems like a candidate for the
ClosestMatch stuff added by commit 5ac51c8.+1
Is there a guideline on when we should use the ClosestMatch machinery
instead of explicitly listing valid values? From the discussion in [1]/messages/by-id/b1f9f399-3a1a-b554-283f-4ae7f34608e2@enterprisedb.com,
it seems appropriate when there are many possible values.
But should we generally replace all hints that list valid values
with ClosestMatch, or only in specific cases?
Regards,
[1]: /messages/by-id/b1f9f399-3a1a-b554-283f-4ae7f34608e2@enterprisedb.com
--
Fujii Masao
On Tue, Nov 18, 2025 at 1:07 PM Fujii Masao <masao.fujii@gmail.com> wrote:
On Tue, Nov 18, 2025 at 5:34 AM Masahiko Sawada <sawada.mshk@gmail.com>
wrote:On Mon, Nov 17, 2025 at 8:44 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:
On Mon, Nov 17, 2025 at 04:04:17PM +0100, Daniel Gustafsson wrote:
Oh right, I had forgotten about that, thanks for the reminder. The
patch as
proposed is correct then.
The patch is probably fine as-is, but this seems like a candidate for
the
ClosestMatch stuff added by commit 5ac51c8.
+1
Is there a guideline on when we should use the ClosestMatch machinery
instead of explicitly listing valid values? From the discussion in [1],
it seems appropriate when there are many possible values.
But should we generally replace all hints that list valid values
with ClosestMatch, or only in specific cases?Regards,
[1]
/messages/by-id/b1f9f399-3a1a-b554-283f-4ae7f34608e2@enterprisedb.com--
Fujii Masao
Is there a guideline on when we should use the ClosestMatch machinery
instead of explicitly listing valid values?
As far as I checked the source code, I couldn't find any related
documentation except the discussions.
Maybe we should add some explanations about this to
`doc/src/sgml/sources.sgml`,
especially in the Error Message Style Guide section.
From the discussion in [1], it seems appropriate when there are many
possible values
Personally I agree with this. Since there are only the four options in this
case, it wouldn't be difficult
for users to find a correct encoding from the hint message even without
showing a possible encoding.
Also, using ClosestMatch here to show a possible encoding would encourage
similar additions later
which might bloat our error handling. Keeping code simple by notifying the
user of the valid encodings
should be sufficient for this part.
Regards,
On Wed, Nov 19, 2025 at 12:20:34AM +0900, Sugamoto Shinya wrote:
On Tue, Nov 18, 2025 at 1:07 PM Fujii Masao <masao.fujii@gmail.com> wrote:
Is there a guideline on when we should use the ClosestMatch machinery
instead of explicitly listing valid values? From the discussion in [1],
it seems appropriate when there are many possible values.
But should we generally replace all hints that list valid values
with ClosestMatch, or only in specific cases?
I think it makes sense to reserve it for cases with several possible
values.
Personally I agree with this. Since there are only the four options in
this case, it wouldn't be difficult for users to find a correct encoding
from the hint message even without showing a possible encoding. Also,
using ClosestMatch here to show a possible encoding would encourage
similar additions later which might bloat our error handling. Keeping
code simple by notifying the user of the valid encodings should be
sufficient for this part.
WFM
--
nathan
Thanks everyone for reviewing my proposal. Please let me know if you have
something to discuss here.
2025年11月19日(水) 0:31 Nathan Bossart <nathandbossart@gmail.com>:
Show quoted text
On Wed, Nov 19, 2025 at 12:20:34AM +0900, Sugamoto Shinya wrote:
On Tue, Nov 18, 2025 at 1:07 PM Fujii Masao <masao.fujii@gmail.com>
wrote:
Is there a guideline on when we should use the ClosestMatch machinery
instead of explicitly listing valid values? From the discussion in [1],
it seems appropriate when there are many possible values.
But should we generally replace all hints that list valid values
with ClosestMatch, or only in specific cases?I think it makes sense to reserve it for cases with several possible
values.Personally I agree with this. Since there are only the four options in
this case, it wouldn't be difficult for users to find a correct encoding
from the hint message even without showing a possible encoding. Also,
using ClosestMatch here to show a possible encoding would encourage
similar additions later which might bloat our error handling. Keeping
code simple by notifying the user of the valid encodings should be
sufficient for this part.WFM
--
nathan
On Wed, Nov 19, 2025 at 8:08 PM Sugamoto Shinya <shinya34892@gmail.com> wrote:
Thanks everyone for reviewing my proposal. Please let me know if you have something to discuss here.
+ errhint("Valid encodings are \"%s\",
\"%s\", \"%s\", and \"%s\".",
+ "base64",
"base64url", "escape", "hex")));
You changed the HINT message to use format arguments,
but I still think it's better to embed the values directly (e.g.,
Valid encodings are "base64", "base64url", "escape", and "hex".)
for consistency with similar HINT messages. If this message is
used in many places a parameterized style would help translators,
but that benefit doesn't apply here.
Attached is an updated version of the patch. I switched
the HINT message to the embedded form and updated the commit message.
Barring any objections, I'm thinking to commit this.
Regards,
--
Fujii Masao
Attachments:
v4-0001-Add-HINT-listing-valid-encodings-to-encode-and-de.patchapplication/octet-stream; name=v4-0001-Add-HINT-listing-valid-encodings-to-encode-and-de.patchDownload
From de30aa93b23ba8be0b28817ff242d99051cb3913 Mon Sep 17 00:00:00 2001
From: Fujii Masao <fujii@postgresql.org>
Date: Wed, 19 Nov 2025 20:59:51 +0900
Subject: [PATCH v4] Add HINT listing valid encodings to encode() and decode()
errors.
This commit updates encode() and decode() so that when an invalid encoding
is specified, their error message includes a HINT listing all valid encodings.
This helps users quickly see which encodings are supported without needing
to consult the documentation.
Author: Shinya Sugamoto <shinya34892@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAAe3y+99sfPv8UDF1VM-rC1i5HBdqxUh=2HrbJJFm2+i=1OwOw@mail.gmail.com
---
src/backend/utils/adt/encode.c | 6 ++++--
src/test/regress/expected/strings.out | 7 +++++++
src/test/regress/sql/strings.sql | 4 ++++
3 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/src/backend/utils/adt/encode.c b/src/backend/utils/adt/encode.c
index aabe9913eee..f4a05d99737 100644
--- a/src/backend/utils/adt/encode.c
+++ b/src/backend/utils/adt/encode.c
@@ -64,7 +64,8 @@ binary_encode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"base64\", \"base64url\", \"escape\", and \"hex\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
@@ -112,7 +113,8 @@ binary_decode(PG_FUNCTION_ARGS)
if (enc == NULL)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("unrecognized encoding: \"%s\"", namebuf)));
+ errmsg("unrecognized encoding: \"%s\"", namebuf),
+ errhint("Valid encodings are \"base64\", \"base64url\", \"escape\", and \"hex\".")));
dataptr = VARDATA_ANY(data);
datalen = VARSIZE_ANY_EXHDR(data);
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index b9dc08d5f61..727304f60e7 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -2575,6 +2575,13 @@ SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
\x1234567890abcdef00
(1 row)
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "base64", "base64url", "escape", and "hex".
+SELECT decode('00', 'invalid'); -- error
+ERROR: unrecognized encoding: "invalid"
+HINT: Valid encodings are "base64", "base64url", "escape", and "hex".
--
-- base64url encoding/decoding
--
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index a2a91523404..88aa4c2983b 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -815,6 +815,10 @@ SELECT decode(encode(('\x' || repeat('1234567890abcdef0001', 7))::bytea,
SELECT encode('\x1234567890abcdef00', 'escape');
SELECT decode(encode('\x1234567890abcdef00', 'escape'), 'escape');
+-- report an error with a hint listing valid encodings when an invalid encoding is specified
+SELECT encode('\x01'::bytea, 'invalid'); -- error
+SELECT decode('00', 'invalid'); -- error
+
--
-- base64url encoding/decoding
--
--
2.51.2
On 19 Nov 2025, at 14:01, Fujii Masao <masao.fujii@gmail.com> wrote:
You changed the HINT message to use format arguments,
but I still think it's better to embed the values directly (e.g.,
Valid encodings are "base64", "base64url", "escape", and "hex".)
for consistency with similar HINT messages. If this message is
used in many places a parameterized style would help translators,
but that benefit doesn't apply here.Attached is an updated version of the patch. I switched
the HINT message to the embedded form and updated the commit message.
Barring any objections, I'm thinking to commit this.
My intepreration was that this version was already objected to by Peter
upthread in c5c937dc-de8e-4284-be25-5d5eaf089d00@eisentraut.org. The point
there being that punctuation in lists is subject to translation.
--
Daniel Gustafsson
On Wed, Nov 19, 2025 at 10:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:
On 19 Nov 2025, at 14:01, Fujii Masao <masao.fujii@gmail.com> wrote:
You changed the HINT message to use format arguments,
but I still think it's better to embed the values directly (e.g.,
Valid encodings are "base64", "base64url", "escape", and "hex".)
for consistency with similar HINT messages. If this message is
used in many places a parameterized style would help translators,
but that benefit doesn't apply here.Attached is an updated version of the patch. I switched
the HINT message to the embedded form and updated the commit message.
Barring any objections, I'm thinking to commit this.My intepreration was that this version was already objected to by Peter
upthread in c5c937dc-de8e-4284-be25-5d5eaf089d00@eisentraut.org. The point
there being that punctuation in lists is subject to translation.
Yes, so I was thinking a message like "Valid encodings are %s" isn't acceptable
since the punctuation included in the %s value is not translated. In contrast,
a message like "Valid encodings are \"base64\", …" is fine because all
punctuation
is part of the translatable string itself, I thought.
But maybe I'm missing something?
Regards,
--
Fujii Masao
On 19.11.25 14:47, Fujii Masao wrote:
On Wed, Nov 19, 2025 at 10:04 PM Daniel Gustafsson <daniel@yesql.se> wrote:
On 19 Nov 2025, at 14:01, Fujii Masao <masao.fujii@gmail.com> wrote:
You changed the HINT message to use format arguments,
but I still think it's better to embed the values directly (e.g.,
Valid encodings are "base64", "base64url", "escape", and "hex".)
for consistency with similar HINT messages. If this message is
used in many places a parameterized style would help translators,
but that benefit doesn't apply here.Attached is an updated version of the patch. I switched
the HINT message to the embedded form and updated the commit message.
Barring any objections, I'm thinking to commit this.My intepreration was that this version was already objected to by Peter
upthread in c5c937dc-de8e-4284-be25-5d5eaf089d00@eisentraut.org. The point
there being that punctuation in lists is subject to translation.Yes, so I was thinking a message like "Valid encodings are %s" isn't acceptable
since the punctuation included in the %s value is not translated. In contrast,
a message like "Valid encodings are \"base64\", …" is fine because all
punctuation
is part of the translatable string itself, I thought.
My point was a different one. It is generally preferable to separate
translatable and untranslated things. So, as an example from elsewhere,
instead of
errhint("Use ALTER DOMAIN instead.");
it would be better to use
errhint("Use % instead.", "ALTER DOMAIN");
Similarly, here I would prefer something along the lines of
errhint("Valid values are \"%s\", \"%s\", and \"%s\".", "foo", "bar",
"baz");
So like it was done in patch v3 looks good to me.
On Wed, Nov 19, 2025 at 10:57 PM Peter Eisentraut <peter@eisentraut.org> wrote:
My point was a different one. It is generally preferable to separate
translatable and untranslated things. So, as an example from elsewhere,
instead oferrhint("Use ALTER DOMAIN instead.");
it would be better to use
errhint("Use % instead.", "ALTER DOMAIN");
Similarly, here I would prefer something along the lines of
errhint("Valid values are \"%s\", \"%s\", and \"%s\".", "foo", "bar",
"baz");So like it was done in patch v3 looks good to me.
Thanks for the explanation! I misunderstood your point.
So I will commit v3 patch.
Regards,
--
Fujii Masao
On 19 Nov 2025, at 15:17, Fujii Masao <masao.fujii@gmail.com> wrote:
So I will commit v3 patch.
+1
--
Daniel Gustafsson
On Wed, Nov 19, 2025 at 11:19 PM Daniel Gustafsson <daniel@yesql.se> wrote:
On 19 Nov 2025, at 15:17, Fujii Masao <masao.fujii@gmail.com> wrote:
So I will commit v3 patch.
+1
I've pushed the patch. Thanks!
Regards,
--
Fujii Masao