[tor-reports] Philipp's March 2015

Philipp Winter phw at torproject.org
Tue Mar 31 13:41:14 UTC 2015

- I filed bug 15178 for Atlas [0].

- I put more code and thought into the Sybil detector [1].  One aspect
  of it is to find a distance metric that can quantify the similarity
  between two given relay descriptors.  Once we have that, we can use
  nearest neighbour search algorithms to find the k relays that are the
  most similar to a given relay descriptor.

  So far, I have experimented with the Levenshtein distance [2] as the
  distance metric and a vantage point tree [3] as the nearest neighbour
  search data structure.  My current application of the Levenshtein
  distance is not very smart: it simply takes two raw relay descriptors
  as input and determines their distance.  The distance is the minimum
  number of single-character insertions, deletions, and substitutions
  necessary to turn descriptor A into descriptor B.
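
  To make the idea concrete, here is a minimal sketch of the
  Levenshtein distance together with a naive k-nearest-neighbour
  search.  It is not the code from [1]: the descriptor strings are
  made up and heavily shortened, the names levenshtein() and
  k_nearest() are hypothetical, and a linear scan stands in for the
  vantage point tree.

```python
import heapq


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    if len(a) < len(b):
        a, b = b, a  # the distance is symmetric; keep the rows short
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def k_nearest(query: str, descriptors: dict, k: int) -> list:
    """Return the k smallest (distance, name) pairs for query.

    This is a linear scan over all descriptors; a vantage point tree
    would avoid computing most of these distances.
    """
    scored = ((levenshtein(query, text), name)
              for name, text in descriptors.items())
    return heapq.nsmallest(k, scored)


# Made-up descriptor strings, for illustration only.
descriptors = {
    "relayA": "nickname foo1 platform Tor 0.2.5.10 bandwidth 1000",
    "relayB": "nickname foo2 platform Tor 0.2.5.10 bandwidth 1000",
    "relayC": "nickname unrelated platform Tor 0.2.4.23 bandwidth 52000",
}
print(k_nearest(descriptors["relayA"], descriptors, k=2))
```

  The query itself comes back at distance 0, followed by relayB, which
  differs from it in a single character.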

  I'm currently looking at ways to preprocess the relay descriptors so
  we can incorporate our experience with past Sybils instead of just
  treating them as opaque string blobs.  The challenging part is that
  the result still has to be a metric in the mathematical sense: it
  has to be non-negative and symmetric, be zero only for identical
  inputs, and satisfy the triangle inequality.  The vantage point tree
  relies on these properties to prune its search.
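
  One way to sanity-check a candidate distance function is to test the
  metric properties by brute force on a small sample.  The sketch
  below assumes, purely for illustration, that preprocessing reduces a
  descriptor to a tuple of two numbers (uptime and bandwidth); the
  helper check_metric() and both distance functions are hypothetical.
  An empty result only means that no counterexample was found in the
  sample, not that the function is a metric.

```python
from itertools import product


def check_metric(dist, points):
    """Return the metric-axiom violations of dist found on points."""
    violations = []
    for x, y in product(points, repeat=2):
        d = dist(x, y)
        if d < 0:
            violations.append(("non-negativity", x, y))
        if (d == 0) != (x == y):
            violations.append(("identity", x, y))
        if d != dist(y, x):
            violations.append(("symmetry", x, y))
    for x, y, z in product(points, repeat=3):
        if dist(x, z) > dist(x, y) + dist(y, z):
            violations.append(("triangle inequality", x, y, z))
    return violations


# Hypothetical preprocessed descriptors: (uptime in hours, bandwidth in KB/s).
relays = [(10, 100), (12, 100), (50, 400), (300, 5000)]


def manhattan(a, b):
    """Sum of absolute per-field differences; a true metric."""
    return sum(abs(p - q) for p, q in zip(a, b))


def squared(a, b):
    """Squared Euclidean distance; not a metric, because it can
    violate the triangle inequality."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


print(check_metric(manhattan, relays))
print({v[0] for v in check_metric(squared, relays)})
```

  On this sample the Manhattan distance passes every check, while the
  squared distance is flagged for the triangle inequality only.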

[0] <https://bugs.torproject.org/15178>
[1] <http://notebooks.nymity.ch/detecting_sybils.html>
[2] <https://en.wikipedia.org/w/index.php?title=Levenshtein_distance&oldid=654093637>
[3] <https://en.wikipedia.org/w/index.php?title=Vantage-point_tree&oldid=643855307>


