[tor-dev] Potential projects for SponsorR (Hidden Services)

Paul Syverson paul.syverson at nrl.navy.mil
Thu Oct 23 22:28:37 UTC 2014

Hi all,

NRL is effectively partnered with the Tor Project Inc. for the
SponsorR efforts.  Our (NRL's) tasking is largely overlapping and
somewhat complementary to that of TPI. As such I thought it would be
good to mention the basics of what we are working on to better inform
and coordinate the planning George et al. have begun discussing
in this thread.

Our tasks are

1. to identify which statistics about hidden services can be collected
and reported without harming user security.

This is also directly part of TPI's tasking, and I expect we will be
collaborating on this directly. We will probably start working on this
in about a month.

2. to develop passive measurement techniques to measure information
about hidden services. This would allow, for example, the collection
of information about the relative popularity of different types of hidden
services, such as what fraction of hidden service connections are
highly interactive vs. large data downloads, etc. This task also
includes developing techniques to infer global activity from local
observations.

Some of this has already begun. A month ago Roger deployed a test on a
few relays to see whether a connection was for HSes vs. something else.
And we did some initial analysis on the global projection based on
an estimate of how much bandwidth those relays saw, which varied wildly,
although there are lots of potential explanations for that.
Roger has also already touched in this thread on some statistics that
are interesting but require thought before deciding how/if to collect
them.
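
A minimal sketch of the kind of local-to-global projection mentioned
above, under loudly stated assumptions: each relay is assumed to see a
known fraction of the relevant traffic, and a local count is simply
divided by that fraction. The function names and all numbers are
hypothetical illustrations, not the analysis that was actually run and
not real measurements.

```python
def extrapolate(local_count, fraction_seen):
    """Scale a locally observed count up to a network-wide estimate.

    Assumption: the relay sees `fraction_seen` of the relevant traffic,
    so the global total is roughly local_count / fraction_seen.
    """
    if not 0.0 < fraction_seen <= 1.0:
        raise ValueError("fraction_seen must be in (0, 1]")
    return local_count / fraction_seen


def pooled_estimate(observations):
    """Combine several relays by summing counts and summing fractions.

    Assumption: the relays' traffic shares do not overlap.
    """
    total_count = sum(count for count, _ in observations)
    total_fraction = sum(fraction for _, fraction in observations)
    return extrapolate(total_count, total_fraction)


# Hypothetical data: (HS-related units observed, assumed traffic share).
observations = [(120.0, 0.002), (900.0, 0.004), (50.0, 0.001)]

per_relay = [extrapolate(count, fraction) for count, fraction in observations]
pooled = pooled_estimate(observations)
# Ratio between the largest and smallest single-relay projection.
spread = max(per_relay) / min(per_relay)

print("per-relay projections:", per_relay)
print("pooled projection:", pooled)
print("max/min spread:", spread)
```

With made-up inputs like these, single-relay projections can disagree
by a large factor, which is one way the "varied wildly" effect could
show up; pooling over more relays is one simple way to damp it.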

A primary focus of NRL's work between now and the end of the year has
been and will be on devising a secure and accurate relay bandwidth
measurement scheme, with an emphasis on something that should be much
better than what is now available but also practical and compatible
enough that it could be rolled out in Tor within about a year (we'll
also be considering designs that are less directly implementable but
more theoretically solid). Bandwidth measurement is one of Tor's biggest
current vulnerabilities. It is pretty easy to get fake inflated BW
numbers so as to have a consensus weight that allows you to observe
amounts of traffic quite disproportionate to the amount you have
actually been carrying in the past. There have been many published
attacks based on bandwidth inflation, and Tor's current torflow design
was not intended to be secure, and could use some accuracy attention as
well. This
also becomes important in the context of gathering HS statistics. If
we are going to be deploying statistic gathering code in a way that is
safe for users and hidden services, it is not enough to say what
statistics are safe to honestly collect. We also need to make Tor's
system of data gathering for those statistics robust to abuse. And one
of the easiest ways to abuse statistics gathering to undermine user
and service security is to manipulate BW attribution to increase the
raw data available to malicious entities. Of course any statistics
that rely on accurate BW measurement will benefit from this work as
well.
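
To make the inflation concern concrete, here is a deliberately
simplified sketch. It assumes a relay's share of observed traffic is
just its weight divided by the total weight; real Tor path selection
also applies position-dependent weighting and flags, which are ignored
here. The relay names and weights are hypothetical.

```python
def selection_share(weights, relay):
    """Fraction of selections going to `relay` under purely
    weight-proportional selection (a simplifying assumption)."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return weights[relay] / total


# Hypothetical network of four relays with equal honest weights.
honest = {"relayA": 100, "relayB": 100, "relayC": 100, "mallory": 100}

# Same network, but mallory's weight is inflated sevenfold without
# any matching increase in the traffic it can actually carry.
inflated = dict(honest, mallory=700)

share_before = selection_share(honest, "mallory")
share_after = selection_share(inflated, "mallory")

print("share with honest weight:", share_before)
print("share with inflated weight:", share_after)
```

In this toy model the inflated relay goes from a quarter of selections
to well over half, so any statistics it gathers cover a correspondingly
larger slice of the raw data.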

3. to design and test HS performance improvements, particularly as
they affect the crawling and measuring activities on HSes that SponsorR
is interested in.

Again we expect lots of collaboration in this area, although our focus
will be on the above first.

4. to evaluate planned and future changes to HSes for security and
performance, particularly to see how intended SponsorR measuring,
crawling, and indexing techniques for HSes may be affected. For
example, a technique that assumed directories could know when a new HS
is listed would be affected by design changes in proposal 224.

Same comment as for task 3.
