Fuzz testing COPY FROM parsing
Hi,
I've been mucking around with COPY FROM lately, and to test it, I wrote
some tools to generate input files and load them with COPY FROM:
https://github.com/hlinnaka/pgcopyfuzz
I used a fuzz testing tool called honggfuzz [1] to generate test inputs
for COPY FROM. At first I tried to use afl and libfuzzer, but honggfuzz
was much easier to use with PostgreSQL. It has a "persistent fuzzing
mode", which allows starting the server normally (well, in single-user
mode), and calling a function to get the next input. With the other
fuzzers I tried, you have to provide a callback function that the fuzzer
calls for each test iteration, and that was hard to integrate into the
PostgreSQL main processing loop.
I ran it for about 2 h on my laptop with the patch I was working on [2].
It didn't find any crashes, but it generated about 1300 input files that
it considered "interesting" based on code coverage analysis. When I took
those generated inputs, and ran them against unpatched and patched
server, some inputs produced different results. So that revealed a
couple of bugs in the patch. (I'll post a fixed patched version on that
thread soon.)
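The comparison step described above (run every generated input against both servers and flag the ones whose results differ) can be sketched roughly as follows. This is only an illustration, not the pgcopyfuzz tooling itself: old_parse and new_parse are hypothetical stand-ins for "load this file with COPY FROM on the unpatched server" and "on the patched server", and load_corpus assumes the fuzzer's inputs sit as plain files in one directory.

```python
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

Outcome = Tuple[str, object]  # ("ok", parsed rows) or ("error", exception name)


def load_corpus(directory: str) -> List[Tuple[str, bytes]]:
    """Read every file in a corpus directory (assumed layout: one input per file)."""
    return [(p.name, p.read_bytes()) for p in sorted(Path(directory).iterdir()) if p.is_file()]


def run_safely(parse: Callable[[bytes], object], data: bytes) -> Outcome:
    """Run one parser on one input; errors become comparable outcomes, not crashes."""
    try:
        return ("ok", parse(data))
    except Exception as exc:
        return ("error", type(exc).__name__)


def find_differences(
    inputs: Iterable[Tuple[str, bytes]],
    old_parse: Callable[[bytes], object],
    new_parse: Callable[[bytes], object],
) -> List[Tuple[str, Outcome, Outcome]]:
    """Return (name, old outcome, new outcome) for inputs where the two parsers disagree."""
    diffs = []
    for name, data in inputs:
        old = run_safely(old_parse, data)
        new = run_safely(new_parse, data)
        if old != new:
            diffs.append((name, old, new))
    return diffs


# Hypothetical stand-ins for the unpatched and patched parsers.
def old_parse(data: bytes) -> List[List[str]]:
    text = data.decode("utf-8")
    return [line.split("\t") for line in text.split("\n") if line]


def new_parse(data: bytes) -> List[List[str]]:
    text = data.decode("utf-8")
    # Deliberate behaviour change: splitlines() also breaks lines on "\r".
    return [line.split("\t") for line in text.splitlines() if line]


if __name__ == "__main__":
    corpus = [("a.tsv", b"1\t2\n3\t4\n"), ("b.tsv", b"1\r2\n"), ("c.bin", b"\xff")]
    for name, old, new in find_differences(corpus, old_parse, new_parse):
        print(f"{name}: old={old!r} new={new!r}")
```

Treating an error as just another outcome matters here: an input that both versions reject with the same error is not a regression, while one that is accepted by one version and rejected by the other is.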
I hope others find this useful, too.
[1]: https://github.com/google/honggfuzz
[2]: /messages/by-id/11d39e63-b80a-5f8d-8043-fff04201fadc@iki.fi
- Heikki
Greetings,
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
I've been mucking around with COPY FROM lately, and to test it, I wrote some
tools to generate input files and load them with COPY FROM:
Neat!
I used a fuzz testing tool called honggfuzz [1] to generate test inputs for
COPY FROM. At first I tried to use afl and libfuzzer, but honggfuzz was much
easier to use with PostgreSQL. It has a "persistent fuzzing mode", which
allows starting the server normally (well, in single-user mode), and calling
a function to get the next input. With the other fuzzers I tried, you have
to provide a callback function that the fuzzer calls for each test
iteration, and that was hard to integrate into the PostgreSQL main
processing loop.
Yeah, that's been one of the challenges with fuzzers I've played with
too.
I ran it for about 2 h on my laptop with the patch I was working on [2]. It
didn't find any crashes, but it generated about 1300 input files that it
considered "interesting" based on code coverage analysis. When I took those
generated inputs, and ran them against unpatched and patched server, some
inputs produced different results. So that revealed a couple of bugs in the
patch. (I'll post a fixed patched version on that thread soon.)
I hope others find this useful, too.
Nice! I wonder if there's a way to have a buildfarm member or other
system doing this automatically on new commits and perhaps adding
coverage for other things like the JSON code..
Thanks!
Stephen
On 2/5/21 10:54 AM, Stephen Frost wrote:
Greetings,
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
I've been mucking around with COPY FROM lately, and to test it, I wrote some
tools to generate input files and load them with COPY FROM:
Neat!
I used a fuzz testing tool called honggfuzz [1] to generate test inputs for
COPY FROM. At first I tried to use afl and libfuzzer, but honggfuzz was much
easier to use with PostgreSQL. It has a "persistent fuzzing mode", which
allows starting the server normally (well, in single-user mode), and calling
a function to get the next input. With the other fuzzers I tried, you have
to provide a callback function that the fuzzer calls for each test
iteration, and that was hard to integrate into the PostgreSQL main
processing loop.
Yeah, that's been one of the challenges with fuzzers I've played with
too.
I ran it for about 2 h on my laptop with the patch I was working on [2]. It
didn't find any crashes, but it generated about 1300 input files that it
considered "interesting" based on code coverage analysis. When I took those
generated inputs, and ran them against unpatched and patched server, some
inputs produced different results. So that revealed a couple of bugs in the
patch. (I'll post a fixed patched version on that thread soon.)
I hope others find this useful, too.
Nice! I wonder if there's a way to have a buildfarm member or other
system doing this automatically on new commits and perhaps adding
coverage for other things like the JSON code..
Not easily in the buildfarm as it is today. We can easily create modules
for extensions and other things that don't require modification of core
code, but things that require patching core code are a whole different
story.
That's not to say it couldn't be done, a SMOP. But using something like
Appveyor or Cirrus might be a lot simpler.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On 05/02/2021 21:16, Andrew Dunstan wrote:
On 2/5/21 10:54 AM, Stephen Frost wrote:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
I ran it for about 2 h on my laptop with the patch I was working on [2]. It
didn't find any crashes, but it generated about 1300 input files that it
considered "interesting" based on code coverage analysis. When I took those
generated inputs, and ran them against unpatched and patched server, some
inputs produced different results. So that revealed a couple of bugs in the
patch. (I'll post a fixed patched version on that thread soon.)
I hope others find this useful, too.
Nice! I wonder if there's a way to have a buildfarm member or other
system doing this automatically on new commits and perhaps adding
coverage for other things like the JSON code..
Not easily in the buildfarm as it is today. We can easily create modules
for extensions and other things that don't require modification of core
code, but things that require patching core code are a whole different
story.
It might be possible to call the fuzzer's HF_ITER() function from a C
extension instead. So you would run a query like "SELECT
next_fuzz_iter()" in a loop, and next_fuzz_iter() would be a C function
that calls HF_ITER(), and executes the actual query with SPI.
That said, I don't think it's important to run the fuzzer in the
buildfarm. It should be enough to do that every once in a while, when
you modify the COPY FROM code (or something else that you want to fuzz
test). But we could easily include the test inputs generated by the
fuzzer in the regular tests. We've usually been very frugal in adding
tests, though, to keep the time it takes to run all the tests short.
- Heikki
Greetings,
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
On 05/02/2021 21:16, Andrew Dunstan wrote:
On 2/5/21 10:54 AM, Stephen Frost wrote:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
I ran it for about 2 h on my laptop with the patch I was working on [2]. It
didn't find any crashes, but it generated about 1300 input files that it
considered "interesting" based on code coverage analysis. When I took those
generated inputs, and ran them against unpatched and patched server, some
inputs produced different results. So that revealed a couple of bugs in the
patch. (I'll post a fixed patched version on that thread soon.)
I hope others find this useful, too.
Nice! I wonder if there's a way to have a buildfarm member or other
system doing this automatically on new commits and perhaps adding
coverage for other things like the JSON code..
Not easily in the buildfarm as it is today. We can easily create modules
for extensions and other things that don't require modification of core
code, but things that require patching core code are a whole different
story.
It might be possible to call the fuzzer's HF_ITER() function from a C
extension instead. So you would run a query like "SELECT next_fuzz_iter()"
in a loop, and next_fuzz_iter() would be a C function that calls HF_ITER(),
and executes the actual query with SPI.
I wonder how much we could fuzz with that approach...
That said, I don't think it's important to run the fuzzer in the buildfarm.
It should be enough to do that every once in a while, when you modify the
COPY FROM code (or something else that you want to fuzz test). But we could
easily include the test inputs generated by the fuzzer in the regular tests.
We've usually been very frugal in adding tests, though, to keep the time it
takes to run all the tests short.
If we could be sure that everyone who might ever modify the COPY FROM or
JSON parser, or other code that we arrange to get fuzz testing on with
this approach, would actually run the fuzzer, that would be great, but I
wouldn't bet on that happening, which is why having it done (however it's
done) in an
automated fashion would be good. Also, doing it on the buildfarm, or
using a CI tool, means we can allow it to run longer since it won't be
directly impacting developers. I'd love to see us do more of that in
general. It's great that we have good regression tests that can be run
fast and catch some things, but it seems likely that there'll always be
things that just take longer to test and having that done in an
automated fashion essentially 'in the background' would be great, so we
can get reports back and fix anything they find before release.
Thanks,
Stephen
Heikki Linnakangas <hlinnaka@iki.fi> writes:
That said, I don't think it's important to run the fuzzer in the
buildfarm. It should be enough to do that every once in a while, when
you modify the COPY FROM code (or something else that you want to fuzz
test). But we could easily include the test inputs generated by the
fuzzer in the regular tests. We've usually been very frugal in adding
tests, though, to keep the time it takes to run all the tests short.
Yeah, I think there's a lot of value in the fact that it doesn't
take too long to run the core regression tests, or even check-world.
Also, given you mentioned that this fuzzer bases its work partly
on code examination, it seems like the right procedure would be to
re-invoke the fuzzer after changes, not just blindly re-use the
test cases it made for the old code. So it seems like the thing
we want here is documentation or a test harness for using the
fuzzer, but not direct incorporation of test cases.
regards, tom lane
On Fri, Feb 05, 2021 at 12:45:30PM +0200, Heikki Linnakangas wrote:
Hi,
I've been mucking around with COPY FROM lately, and to test it, I wrote some
tools to generate input files and load them with COPY FROM:
Neat!
The way it's already produced results is impressive.
Looking at honggfuzz, I see it's been used for wire protocols, of
which we have several. Does testing our wire protocols seem like a
big lift?
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2/5/21 2:50 PM, Heikki Linnakangas wrote:
On 05/02/2021 21:16, Andrew Dunstan wrote:
On 2/5/21 10:54 AM, Stephen Frost wrote:
* Heikki Linnakangas (hlinnaka@iki.fi) wrote:
I ran it for about 2 h on my laptop with the patch I was working on [2]. It
didn't find any crashes, but it generated about 1300 input files that it
considered "interesting" based on code coverage analysis. When I took those
generated inputs, and ran them against unpatched and patched server, some
inputs produced different results. So that revealed a couple of bugs in the
patch. (I'll post a fixed patched version on that thread soon.)
I hope others find this useful, too.
Nice! I wonder if there's a way to have a buildfarm member or other
system doing this automatically on new commits and perhaps adding
coverage for other things like the JSON code..
Not easily in the buildfarm as it is today. We can easily create modules
for extensions and other things that don't require modification of core
code, but things that require patching core code are a whole different
story.
It might be possible to call the fuzzer's HF_ITER() function from a C
extension instead. So you would run a query like "SELECT
next_fuzz_iter()" in a loop, and next_fuzz_iter() would be a C
function that calls HF_ITER(), and executes the actual query with SPI.
That said, I don't think it's important to run the fuzzer in the
buildfarm. It should be enough to do that every once in a while, when
you modify the COPY FROM code (or something else that you want to fuzz
test). But we could easily include the test inputs generated by the
fuzzer in the regular tests. We've usually been very frugal in adding
tests, though, to keep the time it takes to run all the tests short.
This strikes me as a better design in any case.
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com