Hi George,
I'm still trying to work out exactly how to go about fuzzing Tor. So far, I've defined an initial problem space, configured a test environment, and explored some fuzzing libraries / tools.
Fuzzing Problem Space
We started by looking at fuzzing Tor directory download requests over HTTP. This seemed like a manageable initial chunk of work.
I sketched an early draft design[0], and developed an annotated grammar for Tor directory requests[1] based on the Tor directory spec[2].
Configuring a Test Environment
I created scripts and configuration files to build, run and monitor a local-only tor directory cache.[3] I configured hardened, typical, and feature-full builds, all with dmalloc to catch memory access errors. I plan to start testing on the feature-full build, then re-test failures against the builds with smaller surfaces.
Exploring Fuzzing Tools / Libraries
JBroFuzz[4] is an HTTP / browser fuzzer written in Java. (tor-research-framework is also written in Java.) It focuses on exhaustive iteration through a set of alternatives.
I anticipate using JBroFuzz as a library to generate correct and near-correct requests to a tor instance. It may also be useful interactively, although the included analysis / graphing tools didn't work in my build. (And it's listed as "inactive".)
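To make "exhaustive iteration through a set of alternatives" concrete, here is a minimal Python sketch of the idea. It is not JBroFuzz's API, and the methods, paths, suffixes and versions listed are illustrative assumptions that would need checking against the directory spec[2]. Some combinations (for example HTTP/1.1 with no Host header) are deliberately only near-correct.

```python
from itertools import product

# Hypothetical alternatives for each slot of a directory request line.
# These lists are assumptions for illustration, not a complete grammar.
METHODS = ["GET", "HEAD"]
PATHS = ["/tor/status-vote/current/consensus", "/tor/server/all"]
SUFFIXES = ["", ".z"]
VERSIONS = ["HTTP/1.0", "HTTP/1.1"]


def iter_requests():
    """Yield every combination of the alternatives as raw request bytes."""
    for method, path, suffix, version in product(
        METHODS, PATHS, SUFFIXES, VERSIONS
    ):
        yield f"{method} {path}{suffix} {version}\r\n\r\n".encode("ascii")


requests = list(iter_requests())
```

Each extra slot multiplies the number of requests (2 x 2 x 2 x 2 = 16 here), so the alternative lists probably need to stay small for the run to finish in reasonable time.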
Thanks for the pointer to radamsa[5]. It looks much more structure-aware than many of the other black-box fuzzers I've looked at. I could imagine using radamsa to mutate correct requests on their way to tor. (JBroFuzz isn't designed to do random mutation.)
I've also looked at zzuf[6], which appears unloved, but is incorporated into the CERT Basic Fuzzing Framework[7]. The set of zzuf/BFF mutations is limited to bit-level flips, and isn't syntax-aware. Hooking into all the target process's file descriptors is neat, but it requires a local(-only) process to fuzz.
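For comparison, a bit-level flip mutation with no syntax awareness can be sketched in a few lines of Python. This is only an illustration of the technique, not zzuf or BFF code. The function name, the ratio parameter and the sample request are assumptions made for the example.

```python
import random


def flip_bits(data: bytes, ratio: float, seed: int) -> bytes:
    """Flip a fraction `ratio` of the bits in `data`, chosen at random.

    The seed makes a mutation reproducible, so a failing input can be
    regenerated later from (data, ratio, seed).
    """
    rng = random.Random(seed)
    out = bytearray(data)
    total_bits = len(out) * 8
    if total_bits == 0:
        return bytes(out)
    # Flip at least one bit, and never more bits than exist.
    n_flips = min(total_bits, max(1, int(total_bits * ratio)))
    for pos in rng.sample(range(total_bits), n_flips):
        out[pos // 8] ^= 1 << (pos % 8)
    return bytes(out)


# Hypothetical request used only to exercise the function.
original = b"GET /tor/server/all HTTP/1.0\r\n\r\n"
mutated = flip_bits(original, ratio=0.05, seed=1)
```

Because the flips ignore the request grammar, many mutated inputs will likely be rejected early by the HTTP parser. That is the limitation a structure-aware mutator is meant to address.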
Results
Certain parts of directory request URLs get written to the tor debug log unescaped.[8] But the effect is severely limited: it simply makes reading the log in a terminal irritating (including BEL, DEL, and CR effects - the BEL effect was how I discovered the issue).
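The usual remedy for this class of issue is to escape non-printable bytes before they reach the log. The Python sketch below illustrates the idea only; it is not Tor's code (Tor is written in C), and the function name and escape format are assumptions.

```python
def escape_for_log(raw: bytes) -> str:
    """Return a printable-ASCII rendering of `raw` that is safe to log.

    Control characters (including BEL, CR and DEL), non-ASCII bytes and
    the backslash itself are written as \\xNN, so the output cannot
    ring the terminal bell or rewrite earlier log text.
    """
    parts = []
    for b in raw:
        if 0x20 <= b <= 0x7E and b != 0x5C:  # printable, not backslash
            parts.append(chr(b))
        else:
            parts.append(f"\\x{b:02x}")
    return "".join(parts)


# Hypothetical URL fragment containing BEL, CR and DEL.
logged = escape_for_log(b"/tor/\x07a\r\x7f")
```

Escaping the backslash as well keeps the encoding unambiguous, so a reader of the log can tell a literal `\x07` in the request from an escaped BEL byte.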
Further Work
These are the tasks I can think of:
1. Write JBroFuzz iterators for valid Tor directory requests
2. Mutate valid requests with radamsa / BFF / ?
3. Define "failure" of tor to process a request "correctly" (crashes? memory access? more?)
4. Configure dmalloc on fuzzing target builds (to crash? on "failure")
5. Automate request logging and failure identification
6. Work out how to confirm and report failures responsibly
Hi George
Thanks for your reply and the information and links. Tim (cc-ed) is leading the work on the fuzzer and is looking at a couple of different frameworks. I've set up an example that can do port-forwarding to a BEGIN_DIR service, so you can just point a fuzzer at the local port. This opens up a wider range of potential targets (some paths on the directory service are over Tor only).
The framework implements the tor protocol, so it should be easy to modify to fuzz the actual protocol, but I'm skeptical about how successful this would be: I can only think of a couple of places that could be error prone.
Looking through the source, I agree that there's a very large surface area, and also a lot of manual string manipulation, which is potentially error prone. It's reassuring that you've already found bugs this way; it suggests the route isn't a complete dead-end.
I've cc-ed Tim, so he might pick your brains!
Thanks
Gareth