[tor-dev] Getting visuals out of bridge bandwidth statistics

George Kadianakis desnacked at riseup.net
Wed Mar 25 23:10:46 UTC 2015


Hello,

I recently discovered that bridges report detailed bandwidth histories
in their extrainfo descriptors.

We should think about the privacy implications of these statistics,
since they are quite fine-grained (multiple measurements per day) and
some bridges don't have many clients (hence a small anonymity set for
them).

In the meantime, I decided to visualize those bandwidth histories a
bit, to better understand how much bridges are used. This took a bit
longer than I expected, so I'm just going to show you some preliminary
results. Here is a graph!

https://people.torproject.org/~asn/bridges_bureau/bridges_daily_bandwidth.png

To generate this graph, I summed up the reported bandwidth histories
of each bridge descriptor to get that bridge's daily consumption. I
used the number of *read* bytes, and ignored the number of written
bytes. I discarded multiple descriptors of the same bridge and bridges
which did not report 24 hours' worth of bandwidth history. I started
with about 7603 descriptors; in the end, the graphs include about 4k
bridges.
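
For anyone who wants to reproduce the aggregation, here is a minimal
sketch of the idea. It is not my actual script. It assumes the
"read-history" line format from the directory spec (end timestamp,
interval length in seconds, then comma-separated byte counts), and it
assumes that "24 hours' worth" means enough intervals to cover 86400
seconds. Keeping the most recently published descriptor per bridge is
also an assumption, and the function names and the (fingerprint,
published, line) tuples are hypothetical.

```python
import re

# Matches e.g. "read-history 2015-03-24 22:10:46 (900 s) 1024,2048,..."
READ_HISTORY = re.compile(
    r"^read-history (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \((\d+) s\) ([\d,]*)$"
)

DAY_SECONDS = 24 * 60 * 60


def daily_read_bytes(read_history_line):
    """Sum the read bytes of the most recent 24 hours in one line.

    Returns None if the line is malformed or covers less than 24 hours.
    """
    match = READ_HISTORY.match(read_history_line.strip())
    if match is None:
        return None
    interval = int(match.group(2))
    if interval <= 0:
        return None
    values = [int(v) for v in match.group(3).split(",") if v]
    needed = DAY_SECONDS // interval  # number of intervals in one day
    if needed == 0 or len(values) < needed:
        return None
    return sum(values[-needed:])


def per_bridge_daily_bytes(descriptors):
    """Map bridge fingerprint -> daily read bytes.

    `descriptors` is an iterable of (fingerprint, published, line)
    tuples, where `published` is a "YYYY-MM-DD HH:MM:SS" string.
    Assumption: when a bridge has several descriptors, only the most
    recently published one is kept.
    """
    latest = {}
    for fingerprint, published, line in descriptors:
        if fingerprint not in latest or published > latest[fingerprint][0]:
            latest[fingerprint] = (published, line)

    totals = {}
    for fingerprint, (_, line) in latest.items():
        total = daily_read_bytes(line)
        if total is not None:  # skip bridges without a full day of history
            totals[fingerprint] = total
    return totals
```

Extracting the fingerprint, publication time and "read-history" line
from each descriptor is left out here; a descriptor parsing library
can provide those.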

Basically, this graph tells us that only very few bridges see large
amounts of traffic. Most bridges are concentrated on the left side of
the graph. If you can't see how many bridges contribute a lot of
bandwidth because their columns are so small, look at the spikes that
rise from the bottom of the graph. Each such spike is one bridge.

The graph is in megabytes, and its left side is too crowded. Here is
another graph that might be cleaner:

https://people.torproject.org/~asn/bridges_bureau/bridges_daily_bandwidth_pruned.png

This graph was created the same way, but I discarded the 200 highest
observations. This makes the left-hand side of the graph cleaner. Here
we can see that most bridges see less than 50 MB of traffic per day,
and there are even a few hundred of them that see none or almost none.
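
The pruning step can be sketched roughly like this. Again, this is an
illustration and not my actual script: the function names are
hypothetical, the 10 MB bin width is an arbitrary choice, and 1 MB is
assumed to mean 10^6 bytes.

```python
def prune_highest(values, n_drop=200):
    """Return the values sorted ascending, without the n_drop largest."""
    ordered = sorted(values)
    keep = max(len(ordered) - n_drop, 0)
    return ordered[:keep]


def histogram_mb(byte_totals, bin_width_mb=10):
    """Count bridges per bin; keys are the lower bin edge in MB.

    Assumption: 1 MB = 1,000,000 bytes.
    """
    counts = {}
    for total in byte_totals:
        megabytes = total / 1_000_000
        bucket = int(megabytes // bin_width_mb) * bin_width_mb
        counts[bucket] = counts.get(bucket, 0) + 1
    return dict(sorted(counts.items()))
```

For example, histogram_mb(prune_highest(daily_totals)) gives the
per-bin bridge counts for the pruned data, where daily_totals is a
list of per-bridge daily byte counts.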

Unfortunately, this concludes my studies for today. Maybe in the
future I will work on this a bit more, and also take into account the
pluggable transports that these bridges use, as well as how many hours
in a day each bridge did not have any activity.

Questions and feedback on my methodology are welcome. If people want
to work on this, I'm happy to clean up and publish my Python scripts.

Cheers!

