[tor-bugs] #2718 [Metrics]: Analyze Tor usage data for ways to automatically detect country-wide blockings

Thu Mar 24 16:11:19 UTC 2011

#2718: Analyze Tor usage data for ways to automatically detect country-wide
blockings
---------------------+------------------------------------------------------
 Reporter:  karsten  |          Owner:          
     Type:  task     |         Status:  assigned
 Priority:  normal   |      Milestone:          
Component:  Metrics  |        Version:          
 Keywords:           |         Parent:          
   Points:           |   Actualpoints:          
---------------------+------------------------------------------------------

Comment(by karsten):

 George Danezis and I discussed the idea of a censorship detector in
 private mail.  I'm moving that discussion here with his permission:

 I spent the past few days looking at the .csv with the numbers of users of
 Tor from different domains. I think I have a first algorithm that will
 allow you to analyse drops in usage, and raise automatic alarms if they
 are "unexpected". For example, I attach the graphs for China (an obvious
 example) and also Tunisia (which is more subtle):
 http://ephemer.sec.cl.cam.ac.uk/~gd216/tor_detect/img/019-cn-censor.png
 http://ephemer.sec.cl.cam.ac.uk/~gd216/tor_detect/img/003-tn-censor.png

 You can find all the graphs in which I detected possible censorship in
 the past 6 months here:
 http://ephemer.sec.cl.cam.ac.uk/~gd216/tor_detect/img/

 How does this work? Here are a few notes:

 - [Wide Variation] It seems that the Tor network is not very stable
 generally -- numbers of users fluctuate from one day to another, sometimes
 with no obvious reason and across jurisdictions. This may be due to relays
 coming and going or the capacity of the network going up or down for other
 reasons. For this reason it is not reliable to look at the numbers of
 users from each jurisdiction and infer if they are lower than normal --
 since normal varies widely.

 - [Difficult to study each series separately] Given the above we define
 "normal" from one day to the next, as following a global trend i.e.
 changing in a similar fashion with the rest of the network, and other
 jurisdictions. This means the current censorship detection algorithm will
 fail if Tor numbers start going down globally at the same time as a
 censorship event, or if an attack somehow cripples the network globally.

 - [Modelling inter-domain trends] It turns out there is significant
 variance between jurisdictions as to whether the trends of users are going
 up and down from one time to another. Therefore we look at the trends of
 the 50 jurisdictions with the most users, eliminate outliers, and define a
 normal distribution fitting the rest of the trends (as a percentage
 change). This is part of our model for predicting the number of users at
 any specific jurisdiction: from one time period to the next the numbers
 should fluctuate within the plausible range of the global trend -- i.e.
 within the most likely values (probability 0.9999) of this normal
 distribution.
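
 The trend model above can be sketched roughly as follows. This is only an
 illustration, not the actual script: the function name, the dict inputs,
 and the 1.5 * IQR outlier rule are assumptions (the notes do not say how
 outliers are eliminated), and it uses the standard library instead of
 scipy. Changes are expressed as ratios, i.e. 1 + percentage change / 100.

```python
from statistics import NormalDist, mean, quantiles, stdev


def trend_interval(prev_counts, curr_counts, top_n=50, coverage=0.9999):
    """Plausible range of the ratio curr/prev, fitted on the largest jurisdictions.

    prev_counts, curr_counts: dicts mapping country code -> user estimate.
    Returns (low, high) bounds covering `coverage` of the fitted normal.
    """
    # Keep the top_n jurisdictions by user count in the earlier period.
    largest = sorted(prev_counts, key=prev_counts.get, reverse=True)[:top_n]
    ratios = [curr_counts[c] / prev_counts[c]
              for c in largest
              if c in curr_counts and prev_counts[c] > 0]

    # Assumed outlier rule: drop ratios more than 1.5 * IQR outside the quartiles.
    q1, _, q3 = quantiles(ratios, n=4)
    iqr = q3 - q1
    kept = [r for r in ratios if q1 - 1.5 * iqr <= r <= q3 + 1.5 * iqr]

    # Fit a normal distribution to the remaining ratios and take its central region.
    fitted = NormalDist(mean(kept), stdev(kept))
    tail = (1.0 - coverage) / 2.0
    return fitted.inv_cdf(tail), fitted.inv_cdf(1.0 - tail)


# Made-up example: 20 jurisdictions changing by -5% to +4.5%, one collapsing by 90%.
prev = {"c%02d" % i: 1000 for i in range(20)}
curr = {"c%02d" % i: 1000 * (0.95 + 0.005 * i) for i in range(20)}
prev["xx"], curr["xx"] = 1000, 100
low, high = trend_interval(prev, curr)
```

 In this made-up example the collapsing jurisdiction is discarded as an
 outlier, so it does not widen the range used to judge everyone else.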

 - [Small-sample random variation] Some jurisdictions have very few users,
 and this will create false alarms -- a drop of 50% when you have 5000
 users is a significant event, whereas the same drop when you see 10 users
 might just be due to random fluctuation. Therefore we model the number of
 users at any point in time as a Poisson distribution with mean the
 observed number of users. This takes care of the random fluctuation. This
 is the second key part of our prediction model: instead of taking into
 account past numbers at face value, we instead consider that the true
 number of users lies somewhere in the most likely region of the Poisson
 distribution (again the region of 0.9999 probability).
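
 To illustrate why the Poisson step matters for small jurisdictions, here
 is a minimal sketch that computes the central 0.9999 region for a given
 observed count. The function name is hypothetical and the code uses only
 the standard library; scipy.stats.poisson offers the same quantiles
 directly.

```python
import math


def poisson_interval(observed, coverage=0.9999):
    """Central `coverage` region of a Poisson distribution with mean `observed`.

    Returns (low, high): low is the smallest k with CDF(k) >= tail, high is
    the smallest k with CDF(k) >= 1 - tail, where tail = (1 - coverage) / 2.
    """
    if observed <= 0:
        return 0, 0
    tail = (1.0 - coverage) / 2.0
    log_mean = math.log(observed)
    cdf, low, k = 0.0, None, 0
    while True:
        # Poisson pmf evaluated in log space so large means do not overflow.
        cdf += math.exp(k * log_mean - observed - math.lgamma(k + 1))
        if low is None and cdf >= tail:
            low = k
        if cdf >= 1.0 - tail:
            return low, k
        k += 1


small = poisson_interval(10)     # a jurisdiction with about 10 users
large = poisson_interval(5000)   # a jurisdiction with about 5000 users
```

 Relative to the mean, the region for 10 users is far wider than the one
 for 5000 users, which is why a 50% drop is unremarkable in the first case
 and significant in the second.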

 - [Full Model] Given the Poisson model for numbers of users at a given
 time, and the model for trends, we can predict the region within which we
 expect the next value of users to be (with probability about 0.999). In
 the graphs this is the grey area. When the next observation falls within
 the prediction, we do not raise an alarm; when it falls outside the
 prediction, we raise an alarm. Green dots are unexpected up-trends, and
 orange dots are unexpected down-trends (possible censorship).
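
 One way the two intervals could be combined into a prediction is sketched
 below. How the script actually combines them is not spelled out in these
 notes, so multiplying the interval endpoints is an assumption, as are the
 function name and the example numbers.

```python
def classify(observation, count_interval, ratio_interval):
    """Compare a new observation with the predicted range.

    count_interval: (low, high) plausible true user count one window earlier.
    ratio_interval: (low, high) plausible ratio of change from the global trend.
    Returns (label, (predicted_low, predicted_high)) where label is
    "down", "up", or "ok".
    """
    count_low, count_high = count_interval
    ratio_low, ratio_high = ratio_interval
    # Assumed combination: the smallest plausible count shrinking at the lowest
    # plausible rate, and the largest plausible count growing at the highest.
    predicted = (max(0.0, count_low * ratio_low), count_high * ratio_high)
    if observation < predicted[0]:
        return "down", predicted   # unexpected drop (possible censorship)
    if observation > predicted[1]:
        return "up", predicted     # unexpected rise
    return "ok", predicted


# Hypothetical intervals for a jurisdiction that had about 5000 users a week ago.
counts = (4725, 5275)
trend = (0.88, 1.11)
label_drop, _ = classify(400, counts, trend)
label_normal, _ = classify(5000, counts, trend)
label_rise, _ = classify(7000, counts, trend)
```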

 - [Results] You can see the graphs for all jurisdictions in which I
 detected a potential censorship event in the past 6 months. They are
 lexicographically ordered in terms of number of possible censorship
 events. The top ones are ("down" are potential censorship events, "up"
 unexpected rises, and "affected" the number of users on the last day
 observed in the jurisdiction):

 =======================
 Report for 2010-09-11 to 2011-03-15
 =======================

 cn -- down: 19 (up: 28 affected: 728)
 mm -- down: 18 (up: 14 affected: 50)
 si -- down: 16 (up: 17 affected: 254)
 ir -- down: 13 (up: 23 affected: 8168)
 ph -- down: 13 (up: 32 affected: 4265)
 hr -- down: 13 (up: 15 affected: 284)
 eg -- down: 10 (up:  7 affected: 673)
 kr -- down:  9 (up: 12 affected: 22434)
 pk -- down:  8 (up:  9 affected: 385)
 zw -- down:  8 (up:  8 affected: 18)
 tw -- down:  7 (up:  7 affected: 2178)
 ba -- down:  7 (up:  7 affected: 63)
 ly -- down:  7 (up: 11 affected: 10)
 cm -- down:  5 (up:  2 affected: 23)
 tz -- down:  5 (up:  8 affected: 18)
 ga -- down:  5 (up:  3 affected: 4)
 rs -- down:  4 (up:  3 affected: 250)
 et -- down:  4 (up:  5 affected: 150)
 mk -- down:  4 (up:  5 affected: 49)
 tn -- down:  3 (up:  4 affected: 517)
 lb -- down:  3 (up:  6 affected: 92)
 dj -- down:  3 (up:  1 affected: 19)
 vc -- down:  3 (up:  3 affected: 6)
 fo -- down:  3 (up:  3 affected: 2)
 vn -- down:  2 (up:  3 affected: 1549)
 sy -- down:  2 (up:  0 affected: 569)
 bd -- down:  2 (up:  4 affected: 457)
 aw -- down:  2 (up:  2 affected: 14)
 zm -- down:  2 (up:  2 affected: 7)
 gy -- down:  2 (up:  4 affected: 4)
 ls -- down:  2 (up:  1 affected: 2)
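
 For reference, a report in this layout could be produced along the
 following lines. The function names and the tuple layout are hypothetical.
 The tie-break by "affected" is inferred from the ordering of the rows
 above, not stated anywhere. The three rows are copied from the report.

```python
def format_report_line(code, down, up, affected):
    """Format one row in the same layout as the report above."""
    return f"{code} -- down: {down:2d} (up: {up:2d} affected: {affected})"


def format_report(rows):
    """rows: iterable of (code, down, up, affected) tuples.

    Sorted by number of "down" events, then by "affected", both descending.
    """
    ordered = sorted(rows, key=lambda r: (r[1], r[3]), reverse=True)
    return "\n".join(format_report_line(*r) for r in ordered)


rows = [("hr", 13, 15, 284), ("cn", 19, 28, 728), ("ir", 13, 23, 8168)]
report = format_report(rows)
```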

 - [Estimation delay window] One parameter of the model is the length of
 the time periods. In other words: are we trying to model from today's
 numbers of users what is going on tomorrow, or what is going on next week?
 Using the previous day gives nice tight predictions, BUT some jurisdictions
 show a really freakish weekly pattern -- thus I settled for a 7 day
 window. This means that the value for today is used to predict the value
 of the same day a week in the future.
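
 The effect of the window length can be seen in a small sketch. The helper
 name and the series are made up: the series mimics a strong weekday and
 weekend pattern, and each value is paired with the value one window
 earlier.

```python
def lagged_pairs(series, window=7):
    """Pair each value with the value `window` days earlier: (earlier, later)."""
    return [(series[i - window], series[i]) for i in range(window, len(series))]


# Made-up series: 20000 users on five weekdays, 5000 on two weekend days, 3 weeks.
weekly = ([20000] * 5 + [5000] * 2) * 3

# Day-over-day ratios swing wildly at the weekend boundaries ...
daily_ratios = [later / earlier for earlier, later in lagged_pairs(weekly, window=1)]
# ... while same-weekday ratios a week apart stay flat.
weekly_ratios = [later / earlier for earlier, later in lagged_pairs(weekly, window=7)]
```

 With a 1-day window the weekend drop looks like a 75% loss of users. With
 a 7-day window the same series shows no change at all.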

 - [Freakish weekly patterns] Some jurisdictions show a very strange weekly
 pattern that even the 7-day window detector sometimes mistakes (?) for an
 attack. Have a look at the series for "kr" (South Korea): there is a
 weekly variation between 5K and 20K users -- high on weekdays and low at
 the weekend. This is not typical -- it is the only jurisdiction where
 such a variation is observed. Do you know why that is? Other jurisdictions
 with similarly pronounced weekly patterns are: tw (Taiwan), et (Ethiopia),
 id (Indonesia). What is going on there?

 - [Blind spots] The detector looks for differences in the numbers of users
 within a jurisdiction as well as across them, and detects any anomalies.
 This means that the alerts are raised when there is a change -- if you
 have been under censorship forever there is no unexpected drop and no
 alert is raised. Similarly if for the time window chosen the rate of
 change falls within the expected range (which can be significant) no alert
 is raised. A cunning censor (with time on their hands) will lower the
 numbers slowly enough to evade detection given a short window -- I
 recommend you run the algorithm with multiple windows to detect such
 strategies. It is also difficult to tell if Tor is blocked or the whole
 country is offline (see Libya over the past few weeks).
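
 The multiple-window recommendation can be illustrated with a toy example.
 The function name, the per-window bands, and the 3% daily decline are all
 hypothetical. In the real detector each band would come from the trend
 model for that window length, not from a fixed guess.

```python
def flagged_windows(series, bands):
    """Return the window lengths whose most recent ratio falls below its band.

    series: daily user estimates, oldest first.
    bands: dict mapping window length in days -> (low, high) plausible ratio.
    """
    flagged = []
    for window, (low, _high) in sorted(bands.items()):
        if len(series) <= window or series[-1 - window] <= 0:
            continue  # not enough usable history for this window
        ratio = series[-1] / series[-1 - window]
        if ratio < low:
            flagged.append(window)
    return flagged


# A censor lowering usage by 3% per day for four weeks (made-up numbers).
slow_decline = [1000 * 0.97 ** day for day in range(29)]
# Same hypothetical band for every window, purely for illustration.
bands = {1: (0.7, 1.3), 7: (0.7, 1.3), 28: (0.7, 1.3)}
suspicious = flagged_windows(slow_decline, bands)
```

 Here the 1-day and 7-day comparisons stay inside the band, and only the
 28-day comparison exposes the decline.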

 - [Validation] Needless to say I have no labelled data to validate the
 results I get. They vaguely "make sense", but of course how do we know
 whether some of the reported alerts are in fact artefacts of a poor
 prediction model? (See the jurisdictions with weekly trends, for
 example.) In some respects
 this does not really matter: in practice the algorithm gives at most a
 handful of reports every day, so it is small enough for a human to "keep
 an eye" on the reports and make a call about their severity (given input
 from other sources as well -- like news for example).

 - [Early warnings] Even events that do not follow a trend might give you
 "early warning" -- Burma for example in April 2010 showed an alert followed
 by a rise in users, then followed by a massive crash and censorship. Iran
 (not shown) also gives a couple of alerts more than a year ago, that may
 have been tests of the system they now use all the time?

 - [Code] All of the above is implemented using an ugly 300-line Python
 script with dependencies on scipy, numpy and matplotlib. I am cleaning it
 up and will be happy to pass it on once it is stable and pretty.

 - [Model refinement] This is a first, rough-and-ready model that I plan on
 refining further: (a) automatically select the time-window (b) learn the
 traffic self-similarity (c) offer a full Bayesian model + a particle
 filter based sampler for whether an unexpected event is occurring. I would
 be most happy for any feedback on this initial model -- what is useful,
 what is useless, do you want more / less sensitivity, do you know of
 events not detected, other sources of information for prediction, etc.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2718#comment:2>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online