[metrics-team] M.Sc projects at Edinburgh

Karsten Loesing karsten at torproject.org
Wed Jan 13 16:15:57 UTC 2016

On 13/01/16 15:24, William Waites wrote:
> On Wed, 13 Jan 2016 15:13:16 +0100, Karsten Loesing
> <karsten at torproject.org> said:
>>>> Historical analysis of gaming the HSDir hash ring
>> But did you already turn this idea into a proposal, or do you 
>> need anything else from David or somebody else from the metrics 
>> team?
> It was already nearly a fully formed proposal, so nothing in 
> particular needed other than to know if David or someone else wants
> to co-supervise.

Sounds good!

>>>> 3. Exposed bad relays
>> Cool!  Same question as above, did you already write a proposal 
>> using the text I gave you, or do you need anything else from Tor 
>> people?
> I put it as-is into the system, with the intention of working on
> the text a bit, but anything you can do to make it better would be 
> appreciated if you have time.

There's some text below that you might be able to use.

All the best,

Analysis of Exposed Bad Relays in the Tor Network


Self-Proposed: No

Supervisor: TBD

Other Suggested Supervisors: Karsten Loesing, karsten at torproject.org

Subject Areas:

Principal goal of the project:

Analyze how bad relays are excluded from the Tor network

Description of the project:

The Tor network is the largest deployed anonymity network to date.
There are currently nine Tor directory authorities which exchange
votes once per hour and which form a consensus on which relays are
part of the network.  If a relay is found to be bad and is to be
rejected from the network, at least five of these directory
authorities need to reject it.  This usually doesn't happen instantly,
because it requires five human beings in up to five different time
zones to log into their servers, edit a config file, and reload the
tor process.  Once they do, those directory authorities stop listing
the relay in their votes.  Other than that, there is no public record
that a relay was rejected on purpose.
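
The majority rule described above can be written down as a small
check.  This is only a sketch, assuming that all nine authorities vote
and that a simple majority decides; the function names are made up for
illustration and are not part of tor.

```python
NUM_AUTHORITIES = 9  # number of directory authorities given in the text


def rejections_needed(num_authorities=NUM_AUTHORITIES):
    """Smallest number of authorities that forms a simple majority."""
    return num_authorities // 2 + 1


def is_rejected(num_rejecting, num_authorities=NUM_AUTHORITIES):
    """True if enough authorities stopped listing the relay (assumed rule)."""
    return num_rejecting >= rejections_needed(num_authorities)


if __name__ == "__main__":
    for n in range(NUM_AUTHORITIES + 1):
        print(n, "rejecting ->", "rejected" if is_rejected(n) else "still listed")
```

With nine authorities this yields the threshold of five mentioned
above; with fewer votes in a given hour the threshold would differ.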

There exist early visualizations of relays that were supposedly
rejected on purpose.  The first example visualization linked below can
be interpreted as follows.  The image visualizes when a relay was
first listed in at least 2/3 of votes and then, over a time period of
at least 6 hours, dropped to being listed in at most 1/3 of votes.
These criteria are somewhat arbitrary and not foolproof: they may miss
some rejected relays, and they may include relays that were not
rejected on purpose.  The example shows relays in 50.7/16 that were
running in the first half of 2014 as part of a larger Sybil attack on
the Tor network before being rejected by most directory authorities in
July 2014.  The example shows that this rejection
caused some minor collateral damage with relay minisausage being
rejected even though it was likely not part of the Sybil group: it was
started before the Unnamed relays, it kept running a bit longer after
being rejected, and it used a nickname that was not Unnamed.  This
visualization provides a first insight into how relays are being
rejected from the Tor network, and it's an attempt to make the
operation of directory authorities more transparent to the community.
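
The 2/3 and 1/3 criteria above could be sketched roughly as follows.
This is one possible reading of them, not the code behind the
visualizations: it assumes one entry per hour giving the number of
votes that listed the relay, assumes nine votes per hour, and reads
"at least 6 hours" as six consecutive hours at or below 1/3.  The
function name and input format are hypothetical.

```python
from fractions import Fraction

HIGH = Fraction(2, 3)   # "listed in at least 2/3 of votes"
LOW = Fraction(1, 3)    # "listed in at most 1/3 of votes"
MIN_LOW_HOURS = 6       # "a time period of at least 6 hours"


def find_rejection_hour(listed_counts, total_votes=9):
    """Return the index of the first hour of a suspected rejection, or None.

    listed_counts: per-hour number of votes listing the relay, oldest first.
    """
    seen_high = False   # has the relay ever reached the 2/3 threshold?
    low_run = 0         # length of the current run of "low" hours
    for hour, count in enumerate(listed_counts):
        share = Fraction(count, total_votes)
        if not seen_high:
            seen_high = share >= HIGH
            continue
        if share <= LOW:
            low_run += 1
            if low_run >= MIN_LOW_HOURS:
                return hour - MIN_LOW_HOURS + 1
        else:
            low_run = 0
    return None


if __name__ == "__main__":
    counts = [0, 9, 9, 9, 2, 2, 2, 2, 2, 2]
    print(find_rejection_hour(counts))
```

Exact fractions are used so that a relay listed in 6 of 9 votes counts
as meeting the 2/3 threshold without floating-point rounding issues.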

The other examples cover different forms of treating bad relays, which
include a) relays that got the BadExit flag assigned, b) relays that
got the Valid flag removed, and c) relays that got outright rejected
(same as above).  The goal in making all those visualizations was to
only use publicly available data.
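
To illustrate the three cases, here is a minimal sketch that compares
two observations of the same relay.  The representation is an
assumption made for this example: an observation is either None (the
relay is not listed) or a set of flag names.  Only the flag names
BadExit and Valid come from the text above; the function name and the
returned labels are hypothetical.

```python
def classify_change(before, after):
    """Classify how a relay's treatment changed between two observations.

    before, after: None if the relay was not listed, else a set of flags.
    Returns a label for cases a), b), c) above, or None if none applies.
    """
    if before is None:
        return None                      # nothing to compare against
    if after is None:
        return "rejected"                # c) no longer listed at all
    if "BadExit" in after and "BadExit" not in before:
        return "badexit-assigned"        # a) BadExit flag newly assigned
    if "Valid" in before and "Valid" not in after:
        return "valid-removed"           # b) Valid flag removed
    return None


if __name__ == "__main__":
    print(classify_change({"Valid", "Running"}, None))
    print(classify_change({"Valid", "Exit"}, {"Valid", "Exit", "BadExit"}))
    print(classify_change({"Valid", "Running"}, {"Running"}))
```

A relay that merely went offline would also show up as "rejected"
here, which is why such a comparison alone is not sufficient.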

All these examples show that the Tor directory authorities sometimes
disagree more than one would expect.  It's possible that one can
derive robust criteria for saying when a relay was exposed as a bad
relay, or it might be that machine learning would be necessary to say
this with high enough confidence.

Resources Required:

Degree of Difficulty:

Suitable to be undertaken by more than one student independently? No

Completion Criteria: A plausible algorithm for detecting relays that
have been excluded or otherwise treated as bad relays by the Tor
directory authorities, a free-software implementation of that
algorithm, and a technical report with an evaluation of the algorithm.

Essential Skills and Knowledge: Data analysis

Desirable Skills and Knowledge: Machine learning; one of Java,
Python, or Go

Ethical issues involved in this project? No

Plan for ethical issues:

Does this project have any potential health and safety issues? No

Does this project have any additional costs? No



> Cheers, -w
> -- William Waites <wwaites at tardis.ed.ac.uk>  |  School of
> Informatics https://tardis.ed.ac.uk/~wwaites/      | University of
> Edinburgh https://hubs.net.uk/             |      HUBS AS60241
> The University of Edinburgh is a charitable body, registered in 
> Scotland, with registration number SC005336.
