Mention invalid null byte sequence

Started by PG Bug reporting formover 5 years ago6 messagesdocs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/datatype-character.html
Description:

I discovered accidentally that PostgreSQL doesn't accept null byte in text
type. It seems that Oracle does (see
/messages/by-id/de752e01-f36c-821e-9181-cfba78c0fbc8@propaas.com)
and SQLite does it too.

So it should written in the character type that null byte are not accepted,
it would make like easier to migrate to PostgreSQL :)

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: PG Bug reporting form (#1)
Re: Mention invalid null byte sequence

On Sat, 2020-12-05 at 21:58 +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/datatype-character.html
Description:

I discovered accidentally that PostgreSQL doesn't accept null byte in text
type. It seems that Oracle does (see
/messages/by-id/de752e01-f36c-821e-9181-cfba78c0fbc8@propaas.com)
and SQLite does it too.

So it should written in the character type that null byte are not accepted,
it would make like easier to migrate to PostgreSQL :)

+1; how about the attached patch?

Yours,
Laurenz Albe

Attachments:

Document-that-NUL-characters-are-not-supported.patchtext/x-patch; charset=UTF-8; name=Document-that-NUL-characters-are-not-supported.patchDownload+5-0
#3Adrien CLERC
bugs-postgresql@antipoul.fr
In reply to: Laurenz Albe (#2)
Re: Mention invalid null byte sequence

Le 07/12/2020 à 10:02, Laurenz Albe a écrit :

On Sat, 2020-12-05 at 21:58 +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/datatype-character.html
Description:

I discovered accidentally that PostgreSQL doesn't accept null byte in text
type. It seems that Oracle does (see
/messages/by-id/de752e01-f36c-821e-9181-cfba78c0fbc8@propaas.com)
and SQLite does it too.

So it should written in the character type that null byte are not accepted,
it would make like easier to migrate to PostgreSQL :)

+1; how about the attached patch?

That would be a good start indeed.

I don't know the policy for documentation redundancy in PostgreSQL, but
it should be good to mention that also in
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-STRINGS-ESCAPE
since the basic "SELECT E'la\x00la';" will fail while "SELECT
E'la\x01la';" will not.

And, as a lazy person, I also would like to see it in the general
datatype page, since it's a common behavior.

Anyway, merging the first patch will enable the search for "NUL
character" to return a result, and that will be definitively a nice
improvement!

Have a nice day!

Adrien

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#2)
Re: Mention invalid null byte sequence

Laurenz Albe <laurenz.albe@cybertec.at> writes:

On Sat, 2020-12-05 at 21:58 +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/13/datatype-character.html
Description:

So it should written in the character type that null byte are not accepted,
it would make like easier to migrate to PostgreSQL :)

+1; how about the attached patch?

I had thought that this was already documented, but after digging around
I can only find it mentioned in the contexts of saying that literal
strings and quoted identifiers can't contain \0. So yeah, we need to
improve that.

I agree with the submitter that the place one would expect to read about
this is in datatype-character.html. So I'd propose the attached.
Maybe there's reason to repeat the info in charset.sgml, but it seems
like more of a datatype limitation than a character set issue.

regards, tom lane

Attachments:

document-that-you-cant-store-NUL.patchtext/x-diff; charset=us-ascii; name=document-that-you-cant-store-NUL.patchDownload+9-4
#5Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Tom Lane (#4)
Re: Mention invalid null byte sequence

On Mon, 2020-12-07 at 15:27 -0500, Tom Lane wrote:

Laurenz Albe <laurenz.albe@cybertec.at> writes:

On Sat, 2020-12-05 at 21:58 +0000, PG Doc comments form wrote:

The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/13/datatype-character.html
Description:

So it should written in the character type that null byte are not accepted,
it would make like easier to migrate to PostgreSQL :)

+1; how about the attached patch?

I had thought that this was already documented, but after digging around
I can only find it mentioned in the contexts of saying that literal
strings and quoted identifiers can't contain \0. So yeah, we need to
improve that.

I agree with the submitter that the place one would expect to read about
this is in datatype-character.html. So I'd propose the attached.
Maybe there's reason to repeat the info in charset.sgml, but it seems
like more of a datatype limitation than a character set issue.

+1 on your patch.

Yours,
Laurenz Albe

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Laurenz Albe (#5)
Re: Mention invalid null byte sequence

Laurenz Albe <laurenz.albe@cybertec.at> writes:

On Mon, 2020-12-07 at 15:27 -0500, Tom Lane wrote:

I agree with the submitter that the place one would expect to read about
this is in datatype-character.html. So I'd propose the attached.
Maybe there's reason to repeat the info in charset.sgml, but it seems
like more of a datatype limitation than a character set issue.

+1 on your patch.

Pushed, thanks for looking it over.

regards, tom lane