Insufficient memory access checks in pglz_decompress

Started by Flavien GUEDEZover 2 years ago5 messagesbugs
Jump to latest
#1Flavien GUEDEZ
flav.pg@oopacity.net

Hi,

After some investigations about very corrupted toast data in one
postgres instance, I found that the pglz_decompress function (in
common/pg_lzcompress.c) does not check correctly where it copies data
from using memcpy(), which could result in segfault.
In this function, there are other checks to ensure that we do not copy
after the destination end, but not if we copy data from "before the
beginning".

Apologize, I am not a C developer and I am not used to submitting patches.
Though I have tried and attached kind of PoC with a relatively random
corrupted payload (it was beginning with those bytes in my storage for
obscure reasons).
Also attached a simple patch of what could be done just before the
memcpy calls.

Regards,

Flavien

Attachments:

check_for_corrupted.patchtext/x-patch; charset=UTF-8; name=check_for_corrupted.patchDownload+12-0
poc.ctext/x-csrc; charset=UTF-8; name=poc.cDownload
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Flavien GUEDEZ (#1)
Re: Insufficient memory access checks in pglz_decompress

Flavien GUEDEZ <flav.pg@oopacity.net> writes:

After some investigations about very corrupted toast data in one
postgres instance, I found that the pglz_decompress function (in
common/pg_lzcompress.c) does not check correctly where it copies data
from using memcpy(), which could result in segfault.
In this function, there are other checks to ensure that we do not copy
after the destination end, but not if we copy data from "before the
beginning".

Hmm, would it not be better to add this check to the existing "Check for
corrupt data" a bit further up? Then you'd only need one instance of
the test, and only need to do it once per tag (note the comment pointing
out that dp - off stays the same), and overall it'd be less surprising IMO.

regards, tom lane

#3Flavien GUEDEZ
flav.pg@oopacity.net
In reply to: Tom Lane (#2)
Re: Insufficient memory access checks in pglz_decompress

Thanks for your feedback, you are definitely right, I did not notice
that (dp - off) was staying the same in the while loop.
Here is another much smaller patch.

Flavien

Le 18/10/2023 à 17:14, Tom Lane a écrit :

Show quoted text

Flavien GUEDEZ <flav.pg@oopacity.net> writes:

After some investigations about very corrupted toast data in one
postgres instance, I found that the pglz_decompress function (in
common/pg_lzcompress.c) does not check correctly where it copies data
from using memcpy(), which could result in segfault.
In this function, there are other checks to ensure that we do not copy
after the destination end, but not if we copy data from "before the
beginning".

Hmm, would it not be better to add this check to the existing "Check for
corrupt data" a bit further up? Then you'd only need one instance of
the test, and only need to do it once per tag (note the comment pointing
out that dp - off stays the same), and overall it'd be less surprising IMO.

regards, tom lane

Attachments:

v2-pglz_decompress_check_for_corrupted.patchtext/x-patch; charset=UTF-8; name=v2-pglz_decompress_check_for_corrupted.patchDownload+3-1
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Flavien GUEDEZ (#3)
Re: Insufficient memory access checks in pglz_decompress

Flavien GUEDEZ <flav.pg@oopacity.net> writes:

Thanks for your feedback, you are definitely right, I did not notice
that (dp - off) was staying the same in the while loop.
Here is another much smaller patch.

I thought of another thing we should change: it's better to perform
the test as "off > (dp - dest)" than the way you formulated it.
"dp - dest" is certainly computable, since it's the number of bytes
we've written to the output buffer so far. But "dp - off" could,
with bad luck and a buffer near the start of memory, wrap around
to look like it's after "dest".

Pushed with that change and a little fiddling with the comment.
Thanks for the report!

regards, tom lane

#5Flavien GUEDEZ
flav.pg@oopacity.net
In reply to: Tom Lane (#4)
Re: Insufficient memory access checks in pglz_decompress

Le 19/10/2023 à 02:48, Tom Lane a écrit :

I thought of another thing we should change: it's better to perform
the test as "off > (dp - dest)" than the way you formulated it.
"dp - dest" is certainly computable, since it's the number of bytes
we've written to the output buffer so far. But "dp - off" could,
with bad luck and a buffer near the start of memory, wrap around
to look like it's after "dest".

Pushed with that change and a little fiddling with the comment.
Thanks for the report!

regards, tom lane

Thank you for the details !
Best,
Flavien