[tor-bugs] #2718 [Metrics]: Analyze Tor usage data for ways to automatically detect country-wide blockings

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Fri Mar 25 12:46:57 UTC 2011


#2718: Analyze Tor usage data for ways to automatically detect country-wide
blockings
---------------------+------------------------------------------------------
 Reporter:  karsten  |          Owner:          
     Type:  task     |         Status:  assigned
 Priority:  normal   |      Milestone:          
Component:  Metrics  |        Version:          
 Keywords:           |         Parent:          
   Points:           |   Actualpoints:          
---------------------+------------------------------------------------------

Comment(by karsten):

 Replying to [comment:4 George Danezis]:
 > [Individual relays] Roger mentioned to me that you have data from
 individual relays that is quantised to the closest power of 2. I have not
 yet started thinking about using those. We could pretend indeed that they
 each are a "separate tor" and run the detector on them -- this is still
 likely to miss instances of censorship if the numbers are very low. It is
 something we will have to investigate after seeing the data feeds.

 I'm attaching a CSV file of the directory requests that relays see and
 report.  The format is as follows:

  - fingerprint: Hex-formatted SHA-1 hash of identity fingerprint
  - statsend: ISO-formatted time when the stats interval ends
  - seconds: Stats interval length in seconds, typically 24 hours
  - ??: Directory requests that could not be resolved
  - a1: Directory requests from anonymous proxies
  - a2: Directory requests from satellite providers
  - ad: Directory requests from Andorra
  - ae: Directory requests from the United Arab Emirates
  - [...] See ISO 3166-1 alpha-2 country codes
  - zw: Directory requests from Zimbabwe
  - zy: Total directory requests from all countries
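
 As a rough illustration of how the columns above might be read, here is a
 minimal Python sketch. It assumes the CSV has a header row using exactly the
 column names listed, and that missing values appear as empty strings or
 "NA"; both are assumptions, not something the file format guarantees. The
 function name read_country_requests and the scaling to a 24-hour interval
 are hypothetical choices for this example only.

```python
import csv
import io


def read_country_requests(stream, country):
    """Yield (fingerprint, statsend, requests_per_day) for one country column.

    Assumes a header row with the column names from the format description.
    Rows with a missing or "NA" value for the country are skipped (assumption).
    """
    for row in csv.DictReader(stream):
        raw = row.get(country)
        if raw is None or raw in ("", "NA"):
            continue
        seconds = int(row["seconds"])
        if seconds <= 0:
            continue
        # Scale to a 24-hour interval so relays with shorter or longer
        # stats intervals are comparable (illustrative choice).
        yield row["fingerprint"], row["statsend"], int(raw) * 86400 / seconds


if __name__ == "__main__":
    # Made-up sample rows, for demonstration only.
    sample = (
        "fingerprint,statsend,seconds,??,ad,zy\n"
        "ABCD,2011-03-24 12:00:00,86400,4,12,20\n"
        "EF01,2011-03-24 13:00:00,43200,NA,4,4\n"
    )
    for record in read_country_requests(io.StringIO(sample), "ad"):
        print(record)
```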

 The request numbers are rounded up to the next multiple of 8, minus 4.
 That is, 1 to 8 requests == "4", 9 to 16 requests == "12", 17 to 24
 requests == "20", etc.
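
 The rounding rule above can be sketched in Python as follows. The function
 names are hypothetical, and the handling of zero requests is an assumption,
 since the description only covers counts of 1 and up. The second function
 shows the range of true counts that a reported value could stand for, which
 may matter when the numbers per relay are very low.

```python
def bin_request_count(n):
    """Round n up to the next multiple of 8, then subtract 4.

    1..8 -> 4, 9..16 -> 12, 17..24 -> 20, and so on.
    Returning 0 for n <= 0 is an assumption; the source does not say.
    """
    if n <= 0:
        return 0
    return ((n + 7) // 8) * 8 - 4


def possible_range(reported):
    """Return the (lowest, highest) true count consistent with a reported value."""
    if reported <= 0:
        return (0, 0)
    return (reported - 3, reported + 4)


if __name__ == "__main__":
    for n in (1, 8, 9, 16, 17, 24):
        print(n, bin_request_count(n))
```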

 > [Better model] You are right that we can make a more complex model that
 combines observations from the past 1, 7 or 28 days -- I am happy to work
 on that next.

 Yes, that would be interesting.  I don't think that whatever happened 28
 days ago can have a significant influence on the prediction, but I don't
 really know.

 > [Code] I attach the python code after the most mild clean up. Right now
 it has no interface and you have to manually edit the source to get to the
 variables of interest:

 I added your code to the metrics-tasks repository
 [https://gitweb.torproject.org/metrics-tasks.git/tree/HEAD:/task-2718
 here].

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2718#comment:6>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

