char 0x00

Started by Brett Okkenabout 6 years ago6 messagesdocs
Jump to latest
#1Brett Okken
brett.okken.os@gmail.com

There was a lot of a discussion recently on a pgjdbc issue[1]https://github.com/pgjdbc/pgjdbc/issues/1738 started by a
user who was experiencing errors trying to insert string values containing
the 0x00 char into text columns. We ultimately concluded that while this
could be forced to work for SQL_ASCII, postgresql does not intend to
support that character in text. I think it could be helpful to make that a
bit clearer in the documentation.
For example, in the note[2]https://www.postgresql.org/docs/current/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED on SQL_ASCII, it currently states:

[T]he server interprets byte values 0-127 according to the ASCII standard,
while byte values 128-255 are taken as uninterpreted characters...

It might be clearer to state:

[T]he server interprets byte values 1-127 according to the ASCII standard,
while byte values 0 and 128-255 are taken as uninterpreted characters...

It might also be helpful to state in the opening paragraph about character
set support that the NUL character is not supported in any character set
(except SQL_ASCII).

The one place I found documentation on the 0x00 character is around string
constants[3]https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS. At the very bottom of section 4.1.2.2 String Constants with
C-Style escapes it states:

The character with the code zero cannot be in a string constant.

I think it would be helpful to move that statement up to 4.1.2.1 String
Constants, making it clearer that applies to all String Constants, not just
c-style escapes.

Finally, I think it would be helpful to add a similar note to the Character
Types[4]https://www.postgresql.org/docs/current/datatype-character.html page so that it is clear that the String Constant limitation
applies to the character data types as well.

While I have contributed some to the pgjdbc project, this is my first
interaction with the postgresql community. Please let me know if there is
another forum/mechansim I should be using to make these suggestions.

Thanks,

Brett

[1]: https://github.com/pgjdbc/pgjdbc/issues/1738
[2]: https://www.postgresql.org/docs/current/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED
https://www.postgresql.org/docs/current/multibyte.html#MULTIBYTE-CHARSET-SUPPORTED
[3]: https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
https://www.postgresql.org/docs/current/sql-syntax-lexical.html#SQL-SYNTAX-CONSTANTS
[4]: https://www.postgresql.org/docs/current/datatype-character.html

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brett Okken (#1)
Re: char 0x00

Brett Okken <brett.okken.os@gmail.com> writes:

It might also be helpful to state in the opening paragraph about character
set support that the NUL character is not supported in any character set
(except SQL_ASCII).

Uh, it's not supported *anywhere*, period. If you managed to get the
server to accept a text string with an embedded NUL, I'd like to know
how exactly, because that's a missed validation.

As far as documentation goes, I'd be inclined to address the point
in the page about string data types,

https://www.postgresql.org/docs/current/datatype-character.html

with text along the lines of "The contents of a string value must be
validly encoded according to the database's encoding, and cannot include
the character with code zero". I'd sort of thought we said that there
already, but I don't see it.

regards, tom lane

#3Brett Okken
brett.okken.os@gmail.com
In reply to: Tom Lane (#2)
Re: char 0x00

Using a client and server encoding of SQL_ASCII makes it possible to get
0x00 into a text value column when using a bind variable.

Thanks,

Brett

On Thu, Mar 26, 2020 at 9:22 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Brett Okken <brett.okken.os@gmail.com> writes:

It might also be helpful to state in the opening paragraph about

character

set support that the NUL character is not supported in any character set
(except SQL_ASCII).

Uh, it's not supported *anywhere*, period. If you managed to get the
server to accept a text string with an embedded NUL, I'd like to know
how exactly, because that's a missed validation.

As far as documentation goes, I'd be inclined to address the point
in the page about string data types,

https://www.postgresql.org/docs/current/datatype-character.html

with text along the lines of "The contents of a string value must be
validly encoded according to the database's encoding, and cannot include
the character with code zero". I'd sort of thought we said that there
already, but I don't see it.

regards, tom lane

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Brett Okken (#3)
Re: char 0x00

Brett Okken <brett.okken.os@gmail.com> writes:

Using a client and server encoding of SQL_ASCII makes it possible to get
0x00 into a text value column when using a bind variable.

Having looked at the code again, I flat out don't believe you.
textin is certainly not going to read past a nul character,
and textrecv goes through pg_client_to_server (via pq_getmsgtext),
which AFAICS is careful in all code paths to reject nuls.

If I'm missing something, I'd really like to see a concrete example,
because this would be a bug, and it'd suggest that somebody's managed
to reopen CVE-2006-2313. If we're missing nul rejection in some code
path, then we're probably not doing encoding validation at all.

regards, tom lane

#5Brett Okken
brett.okken.os@gmail.com
In reply to: Tom Lane (#4)
Re: char 0x00

Dave, any thoughts on best way to reproduce Vladimir’s described workflow
in a way that is consumable by the postgresql team?

On Thu, Mar 26, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

Brett Okken <brett.okken.os@gmail.com> writes:

Using a client and server encoding of SQL_ASCII makes it possible to get
0x00 into a text value column when using a bind variable.

Having looked at the code again, I flat out don't believe you.
textin is certainly not going to read past a nul character,
and textrecv goes through pg_client_to_server (via pq_getmsgtext),
which AFAICS is careful in all code paths to reject nuls.

If I'm missing something, I'd really like to see a concrete example,
because this would be a bug, and it'd suggest that somebody's managed
to reopen CVE-2006-2313. If we're missing nul rejection in some code
path, then we're probably not doing encoding validation at all.

regards, tom lane

#6Dave Cramer
pg@fastcrypt.com
In reply to: Brett Okken (#5)
Re: char 0x00

We don't have to make it consumable. If we can use his code and reproduce
it in the JDBC driver that is enough.

Dave Cramer

On Sat, 28 Mar 2020 at 11:31, Brett Okken <brett.okken.os@gmail.com> wrote:

Show quoted text

Dave, any thoughts on best way to reproduce Vladimir’s described workflow
in a way that is consumable by the postgresql team?

On Thu, Mar 26, 2020 at 10:21 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Brett Okken <brett.okken.os@gmail.com> writes:

Using a client and server encoding of SQL_ASCII makes it possible to get
0x00 into a text value column when using a bind variable.

Having looked at the code again, I flat out don't believe you.
textin is certainly not going to read past a nul character,
and textrecv goes through pg_client_to_server (via pq_getmsgtext),
which AFAICS is careful in all code paths to reject nuls.

If I'm missing something, I'd really like to see a concrete example,
because this would be a bug, and it'd suggest that somebody's managed
to reopen CVE-2006-2313. If we're missing nul rejection in some code
path, then we're probably not doing encoding validation at all.

regards, tom lane