Bug in jsonb_in function (versions 14 and 15 are affected)
Hi!
I found a bug in the jsonb_in function (it converts JSON from its string representation
into the jsonb internal representation).
To reproduce this bug the way I found it, you need a Postgres instance with an 8-bit encoding:
1. Add the en_US locale (dpkg-reconfigure locales on Debian).
2. Run initdb with LATIN1 encoding:
LANG=en_US ./initdb --encoding=LATIN1 -D my_pg_data
3. Start the database and execute the query:
SELECT E'{\x0a"\x5cb\x5c"\x5c\x5c\x5c/\x5cb\x5cf\x5cn\x5cr\x5ct\x5c"\x5c\x5c\x5c\x5crZt\x5c"\x5c\x5c\x5c/\x5cb\x5c"\x5c\x5c\x5c/\x5cb\x5c"\x5cu000f0\x5cu000f0000000000000000000000000000000000000000000000000000000\x5cuDFFF000000000000000000000000000000000000000000000000000000000000"0000000000000000000000000000000\x5cu0000000000000000000\xb4\x5cuDBFF\x5cuDFFF00000000000000000002000000000000000000000000000000000000000000000000000000000000000\x5cuDBFF'::jsonb;
In Postgres 14 and 15, the backend crashes.
The backtrace produced with ASan is in the attached file.
This bug was found while fuzzing Postgres input functions with AFL++.
For now we are using a lightweight wrapper around the input functions that
creates a minimal environment for them to run the conversion, and we run them in the fuzzer.
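For readers unfamiliar with this kind of wrapper, here is a minimal sketch of what such a harness can look like. It is not our actual wrapper: toy_input_function is a hypothetical stand-in for the input function under test, and the libFuzzer-style LLVMFuzzerTestOneInput entry point is an assumption about the harness shape (AFL++ can drive harnesses written this way).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for the input function under test. A real harness
 * would call the actual conversion routine here.
 */
static bool
toy_input_function(const char *str)
{
    size_t len = strlen(str);

    return len >= 2 && str[0] == '{' && str[len - 1] == '}';
}

/*
 * Fuzzer entry point: the fuzzer hands over raw bytes, while input functions
 * expect a NUL-terminated C string, so copy the bytes and terminate them.
 */
int
LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    char *buf;

    assert(data != NULL || size == 0);

    buf = malloc(size + 1);
    if (buf == NULL)
        return 0;               /* out of memory: skip this input */
    if (size > 0)
        memcpy(buf, data, size);
    buf[size] = '\0';

    /* Only crashes and sanitizer reports matter; the result is ignored. */
    (void) toy_input_function(buf);

    free(buf);
    return 0;
}
```

Building such a harness with ASan enabled is what turns a silent out-of-bounds access into a reportable crash.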
My colleagues (they will join this thread shortly) have narrowed the query down to
SELECT E'\n"\\u00000"'::jsonb;
and say that it crashes even in a UTF8 locale.
They also have a preliminary version of a patch to fix it. They will describe it soon, I hope.
--
Nikolay Shaplov aka Nataraj
Fuzzing Engineer at Postgres Professional
Matrix IM: @dhyan:nataraj.su
Attachments:
backtrace.txt (text/plain; charset=utf-8)
Nikolay Shaplov <dhyan@nataraj.su> writes:
> I found a bug in jsonb_in function (it converts json from string representation
> into jsonb internal representation).
Yeah. Looks like json_lex_string is failing to honor the invariant
that it needs to set token_terminator ... although the documentation
of the function certainly isn't helping. I think we need the attached.
A nice side benefit is that the error context reports get a lot more
useful --- somebody should have inquired before as to why they were
so bogus.
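
To illustrate the invariant, here is a toy sketch. It is not the jsonapi.c code: ToyLexContext, toy_lex_string, and toy_fail are made-up names, and the escape handling is deliberately simplified. The point is that every exit path, including the error ones, sets token_terminator, so an error reporter can compute a sane token length.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { TOY_OK, TOY_BAD_ESCAPE, TOY_UNTERMINATED } ToyResult;

typedef struct
{
    const char *input;
    size_t      input_length;
    const char *token_start;
    const char *token_terminator;
} ToyLexContext;

/* Record where the failing token ends before returning the error code. */
static ToyResult
toy_fail(ToyLexContext *lex, const char *end, ToyResult code)
{
    lex->token_terminator = end;
    return code;
}

/* Lex a double-quoted string; s must point at the opening quote. */
static ToyResult
toy_lex_string(ToyLexContext *lex, const char *s)
{
    const char *end = lex->input + lex->input_length;

    lex->token_start = s;
    for (s++; s < end; s++)
    {
        if (*s == '"')
        {
            lex->token_terminator = s + 1;
            return TOY_OK;
        }
        if (*s == '\\')
        {
            if (s + 1 >= end)
                return toy_fail(lex, end, TOY_UNTERMINATED);
            s++;
            /* Simplified: only single-character escapes are accepted. */
            if (*s == '\0' || strchr("\"\\/bfnrt", *s) == NULL)
                return toy_fail(lex, s + 1, TOY_BAD_ESCAPE);
        }
    }
    return toy_fail(lex, end, TOY_UNTERMINATED);
}

/* Length of the text an error message may safely quote. */
static size_t
toy_error_token_length(const ToyLexContext *lex)
{
    /* Holds only because every exit path above sets token_terminator. */
    assert(lex->token_terminator >= lex->token_start);
    return (size_t) (lex->token_terminator - lex->token_start);
}
```

If an error path returned without going through toy_fail, token_terminator would keep whatever value the previous token left behind, and the length computed by the reporter would be meaningless.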
regards, tom lane
Attachments:
fix-json-lex-string-error-cases.patch (text/x-diff; charset=us-ascii; +78 -59)
On Mon, 13/03/2023 at 13:58 -0400, Tom Lane wrote:
> Nikolay Shaplov <dhyan@nataraj.su> writes:
> > I found a bug in jsonb_in function (it converts json from string representation
> > into jsonb internal representation).
>
> Yeah. Looks like json_lex_string is failing to honor the invariant
> that it needs to set token_terminator ... although the documentation
> of the function certainly isn't helping. I think we need the attached.
>
> A nice side benefit is that the error context reports get a lot more
> useful --- somebody should have inquired before as to why they were
> so bogus.
>
> regards, tom lane
Good day, Tom and all.
The merged patch looks like the start of a refactoring.
A colleague (Nikita Glukhov) proposes further refactoring of jsonapi.c:
- use inline functions instead of macros,
- use them more uniformly when reporting token success or errors,
- simplify json_lex_number and its usage a bit.
He also added tests for the fixed bug.
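
To show the general idea behind the first point, here is a small sketch. It is not taken from the proposed patch: DemoLex, DEMO_FAIL_AT, and demo_fail_at are hypothetical names. A macro that hides a return statement is compared with an inline function whose result the caller returns explicitly.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_SUCCESS, DEMO_INVALID } DemoStatus;

typedef struct
{
    const char *token_start;
    const char *token_terminator;
} DemoLex;

/* Macro form: the return is hidden inside the macro body. */
#define DEMO_FAIL_AT(lex, ptr, code) \
    do { (lex)->token_terminator = (ptr); return (code); } while (0)

/* Inline form: type-checked arguments, control flow stays in the caller. */
static inline DemoStatus
demo_fail_at(DemoLex *lex, const char *ptr, DemoStatus code)
{
    assert(lex != NULL);
    lex->token_terminator = ptr;
    return code;
}

static DemoStatus
demo_expect_digit_macro(DemoLex *lex, const char *s)
{
    lex->token_start = s;
    if (*s < '0' || *s > '9')
        DEMO_FAIL_AT(lex, s + 1, DEMO_INVALID);     /* returns from here */
    lex->token_terminator = s + 1;
    return DEMO_SUCCESS;
}

static DemoStatus
demo_expect_digit_inline(DemoLex *lex, const char *s)
{
    lex->token_start = s;
    if (*s < '0' || *s > '9')
        return demo_fail_at(lex, s + 1, DEMO_INVALID);
    return demo_fail_at(lex, s + 1, DEMO_SUCCESS);  /* same helper for success */
}
```

Both variants behave the same, but in the inline version the return is visible at the call site, and the same helper serves the success and error paths.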
-----
Regards,
Yura Sokolov.