[PATCH] docs: document N'...' national character string literal syntax

Started by Hoda Salim2 months ago6 messagesdocs
Jump to latest
#1Hoda Salim
hoda.s.salim@gmail.com

Hi,

This patch documents the N'...' national character string literal
syntax, which has been supported by PostgreSQL but was previously
undocumented.

The documentation explains:
- What the syntax is (N'hello')
- What the SQL standard specifies (implementation-defined national
character set)
- What PostgreSQL actually does (treats it as a cast to character type)
- Why it exists (compatibility with SQL migrated from other databases)

I verified the documentation builds without errors.

--
Hoda Salim

Attachments:

0001-docs-document-N-national-character-string-literal.patchapplication/octet-stream; name=0001-docs-document-N-national-character-string-literal.patchDownload+43-1
#2Erik Wienhold
ewie@ewie.name
In reply to: Hoda Salim (#1)
Re: [PATCH] docs: document N'...' national character string literal syntax

On 2026-02-02 16:05 +0100, Hoda Salim wrote:

This patch documents the N'...' national character string literal
syntax, which has been supported by PostgreSQL but was previously
undocumented.

The documentation explains:
- What the syntax is (N'hello')
- What the SQL standard specifies (implementation-defined national
character set)
- What PostgreSQL actually does (treats it as a cast to character type)
- Why it exists (compatibility with SQL migrated from other databases)

I verified the documentation builds without errors.

+1

I brought up the missing documentation before [1]/messages/by-id/om3g7p7u3ztlrdp4tfswgulavljgn2fe6u2agk34mrr65dffuu@cpzlzuv6flko, but wasn't sure at
the time if Postgres conforms to the SQL standard (mainly because of
[2]: /messages/by-id/1275895438.1849.1.camel@fsopti579.F-Secure.com
(F421). That entry was commented with "syntax accepted" until commit
35223af0579. I read that as "fully supported" now.

+     <productname>PostgreSQL</productname> does not implement a separate
+     national character set; it treats <literal>N'...'</literal> as
+     equivalent to a regular string constant cast to the
+     <type>character</type> type, that is, <literal>'...'::character</literal>,
+     using the database's character set.

nchar is an alias of bpchar. There's no cast to char behind the scenes
since that would truncate the string:

select n'foo', 'foo'::character;
bpchar | bpchar
--------+--------
foo | f
(1 row)

Should we also mention the nchar alias in [4]https://www.postgresql.org/docs/current/datatype-character.html?

[1]: /messages/by-id/om3g7p7u3ztlrdp4tfswgulavljgn2fe6u2agk34mrr65dffuu@cpzlzuv6flko
[2]: /messages/by-id/1275895438.1849.1.camel@fsopti579.F-Secure.com
[3]: https://www.postgresql.org/docs/current/features-sql-standard.html
[4]: https://www.postgresql.org/docs/current/datatype-character.html

--
Erik Wienhold

#3Hoda Salim
hoda.s.salim@gmail.com
In reply to: Erik Wienhold (#2)
Re: [PATCH] docs: document N'...' national character string literal syntax

nchar is an alias of bpchar. There's no cast to char behind the scenes
since that would truncate the string:

select n'foo', 'foo'::character;
bpchar | bpchar
--------+--------
foo | f
(1 row)

Thank you for catching this! I verified the behavior and updated the
documentation to correctly state that N'...' is equivalent to a bpchar
literal.

Should we also mention the nchar alias in [4]?
[4] https://www.postgresql.org/docs/current/datatype-character.html

I'm happy to add that in a v3 if you think it belongs in this patch.
I wasn't sure if it would be preferred separately or together.

Updated patch attached.

--
Hoda Salim

Show quoted text

On Mon, Feb 2, 2026 at 7:09 PM Erik Wienhold <ewie@ewie.name> wrote:

On 2026-02-02 16:05 +0100, Hoda Salim wrote:

This patch documents the N'...' national character string literal
syntax, which has been supported by PostgreSQL but was previously
undocumented.

The documentation explains:
- What the syntax is (N'hello')
- What the SQL standard specifies (implementation-defined national
character set)
- What PostgreSQL actually does (treats it as a cast to character type)
- Why it exists (compatibility with SQL migrated from other databases)

I verified the documentation builds without errors.

+1

I brought up the missing documentation before [1], but wasn't sure at
the time if Postgres conforms to the SQL standard (mainly because of
[2]). Now I see that [3] already claims to support national character
(F421). That entry was commented with "syntax accepted" until commit
35223af0579. I read that as "fully supported" now.

+     <productname>PostgreSQL</productname> does not implement a separate
+     national character set; it treats <literal>N'...'</literal> as
+     equivalent to a regular string constant cast to the
+     <type>character</type> type, that is, <literal>'...'::character</literal>,
+     using the database's character set.

nchar is an alias of bpchar. There's no cast to char behind the scenes
since that would truncate the string:

select n'foo', 'foo'::character;
bpchar | bpchar
--------+--------
foo | f
(1 row)

Should we also mention the nchar alias in [4]?

[1] /messages/by-id/om3g7p7u3ztlrdp4tfswgulavljgn2fe6u2agk34mrr65dffuu@cpzlzuv6flko
[2] /messages/by-id/1275895438.1849.1.camel@fsopti579.F-Secure.com
[3] https://www.postgresql.org/docs/current/features-sql-standard.html
[4] https://www.postgresql.org/docs/current/datatype-character.html

--
Erik Wienhold

Attachments:

v2-0001-docs-document-N-.-national-character-string-liter.patchapplication/octet-stream; name=v2-0001-docs-document-N-.-national-character-string-liter.patchDownload+41-1
#4Erik Wienhold
ewie@ewie.name
In reply to: Hoda Salim (#3)
Re: [PATCH] docs: document N'...' national character string literal syntax

On 2026-02-02 21:04 +0100, Hoda Salim wrote:

nchar is an alias of bpchar. There's no cast to char behind the scenes
since that would truncate the string:

select n'foo', 'foo'::character;
bpchar | bpchar
--------+--------
foo | f
(1 row)

Thank you for catching this! I verified the behavior and updated the
documentation to correctly state that N'...' is equivalent to a bpchar
literal.

Thanks! The text looks good now. But please fix the indentation of the
second paragraph (should be one space per level). Your first patch was
correct in that regard.

Should we also mention the nchar alias in [4]?
[4] https://www.postgresql.org/docs/current/datatype-character.html

I'm happy to add that in a v3 if you think it belongs in this patch.
I wasn't sure if it would be preferred separately or together.

I'd prefer a single patch with both changes to have an atomic change.
But in the end it's up to the committer, not me.

--
Erik Wienhold

#5David G. Johnston
david.g.johnston@gmail.com
In reply to: Erik Wienhold (#4)
Re: [PATCH] docs: document N'...' national character string literal syntax

On Tue, Feb 3, 2026 at 5:24 PM Erik Wienhold <ewie@ewie.name> wrote:

On 2026-02-02 21:04 +0100, Hoda Salim wrote:

nchar is an alias of bpchar. There's no cast to char behind the scenes
since that would truncate the string:

I'm seeing things differently:

postgres=# select '123 '::nchar, '123 '::bpchar;
bpchar | bpchar
--------+--------
1 | 123
(1 row)

Not a huge fan of the proposed wording but partly because the rest of the
section doesn't mention data types at all so this is going to move the bar
forward, leaving the others behind.

nit: I don't like saying N'...' is equivalent to a data type; something
more like N'...' produces a value of type bpchar.

Also, any reason not to just say:

"This syntax is accepted for compatibility with the SQL standard." and
move on? Repeating "uses a single character set/does not implement a
separate national character set" seems unnecessary. If there is at least
one secondary consideration for accepting this syntax we should state what
it is.

I'd copy the E'...' wording in the first paragraph:
... just before the opening single quote, e.g., N'foo'.
I'd suggest going even further in the emulation by leading with the title
of the thing being described, then the syntax. i.e., flip the ordering of
the first two sentences and rework for flow.

My thought:

For compatibility with the SQL standard, PostgreSQL accepts national
character string constants. A national character string constant is
specified by writing the letter N (upper or lower case) just before the
opening single quote, e.g., N'foo'. (When continuing a national character
string constant across lines, write N only before the first opening
quote.) PostgreSQL's implementation requires that characters comprising
the literal be encoded using the database encoding, just like all other
string constants. In fact, the concept of national character strings is
implemented purely at the SQL syntax layer (including data type names nchar
and nchar varying). Within the database, the bpchar and bpchar(n) data
types are used.

A similar note would be added to Data Types. I'd add after Example 8.1:

The SQL standard defines two additional data types pertaining to national
character strings. PostgreSQL only accommodates a single, database-wide,
character set via its database encoding, and so gains no practical benefit
from these distinct data types. However, as a compatibility shim,
PostgreSQL does implement SQL syntax to accept the nchar and nchar varying
data types. These get mapped onto bpchar and bpchar(n) (and thus
character) data types respectively.

David J.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#5)
Re: [PATCH] docs: document N'...' national character string literal syntax

"David G. Johnston" <david.g.johnston@gmail.com> writes:

Also, any reason not to just say:
"This syntax is accepted for compatibility with the SQL standard." and
move on?

+1. The syntax does not in fact do anything that anyone would call
useful. And I'm also unsure that it really satisfies what the
standard suggests it should do.

On the whole, leaving it in undocumented obscurity seems fine to
me. But if we must mention it, the less said the better. Nobody
will read a paragraph about this and think their time was well
spent.

regards, tom lane