[tor-bugs] #2718 [Analysis]: Analyze Tor usage data for ways to automatically detect country-wide blockings

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Fri Jul 8 09:08:11 UTC 2011


#2718: Analyze Tor usage data for ways to automatically detect country-wide
blockings
----------------------+-----------------------------------------------------
 Reporter:  karsten   |          Owner:          
     Type:  task      |         Status:  assigned
 Priority:  normal    |      Milestone:          
Component:  Analysis  |        Version:          
 Keywords:            |         Parent:          
   Points:            |   Actualpoints:          
----------------------+-----------------------------------------------------

Comment(by karsten):

 I finally had a closer look at George's code and graphs.  This looks like
 a great start!

 I wonder how we can move this forward.  My suggestion would be to describe
 George's approach in 2--3 pages of LaTeX and put daily updated graphs on
 the metrics website.  This would allow people to compare possible
 censorship events in our results to real-world events and give us some
 feedback.  Here's what this plan involves:

  * Write a tiny tech report describing George's censorship detector.  This
 report would briefly motivate the problem of detecting censorship based on
 our daily user number estimates and then dive into the math behind
 `detector.py`.  I could write this report, but it wouldn't be as accurate
 as if George wrote it.  Or I could make a start and George could correct
 or rewrite the parts that I get wrong.  Or George could write it himself.
 George?

  * Run `detector.py` on the metrics server.  It should be straightforward
 to make cron grab the latest `direct-users.csv` and run the script once
 per day.  I can take care of this.

  * Generate graphs using our own graphing engine.  We should use R and our
 own graphing engine to integrate the results more closely into our
 website.  I could imagine adding a checkbox "[ ] Show possible censorship
 events if available (BETA)" below the Source drop-down box on
 [https://metrics.torproject.org/users.html#direct-users].  The graph would
 then contain not only the estimated user number, but also a gray ribbon
 for the expected range and green/yellow points for upturns and downturns.
 This would require disabling graph generation in `detector.py` and writing
 the expected user range per country and day to a file.  I hope my Python
 will be sufficient to do this.

  * Show a table of recent possible censorship events.  We could add a
 short table of countries with possible censorship events in recent months
 to the website.  This table would go on the same page below the "Update
 graph" button and have a BETA label, too.  There would also be a sentence
 above the table linking to the report mentioned above.  The table content
 would be similar to the `summary.txt` file generated by the Python script.
 I can write the necessary code to parse this file and put the content on
 the website.
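
 To make the "expected user range per country and day" file from the
 graphing item more concrete, here is a minimal sketch of what writing it
 could look like.  Everything in it is an assumption for illustration: the
 function name `write_expected_ranges`, the column names, and the example
 numbers are made up and are not taken from `detector.py`.

```python
import csv
import io


def write_expected_ranges(rows, out):
    """Write (date, country, lower, upper) tuples as CSV to a text file object.

    Hypothetical helper: the column layout is an assumption, not the
    format that detector.py actually uses.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date", "country", "lower", "upper"])
    # Sort by date, then country, so daily runs produce stable output.
    for date, country, lower, upper in sorted(rows):
        writer.writerow([date, country, lower, upper])


# Made-up example values, only to show the resulting layout.
example = [
    ("2011-07-07", "de", 1000, 1500),
    ("2011-07-07", "cn", 200, 900),
]
buf = io.StringIO()
write_expected_ranges(example, buf)
csv_text = buf.getvalue()
```

 A flat file with one row per country and day would be easy to read from R
 and to join with the existing user number estimates on date and country.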

 Is this plan reasonable?

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2718#comment:8>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

