[tor-relays] Metrics for assessing EFF's Tor relay challenge?

Sat Apr 5 14:58:27 UTC 2014

On 05/04/14 12:19, Lukas Erlacher wrote:
> Hi Karsten,
> 
> On 04/05/2014 09:58 AM, Karsten Loesing wrote:
>> On second thought, and after sleeping on this, I'm less
>> convinced that we should use an external library for the caching.
>> We should rather start with a simple dict in memory and flush it
>> based on some simple rules. That would allow us to tweak the
>> caching specifically for our use case. And it would mean avoiding
>> a dependency. We can think about moving to onion-py at a later
>> point. That gives you the opportunity to unspaghettize your code,
>> and once that is done we'll have a better idea what caching needs
>> we have for the challenger tool to decide whether to move to
>> onion-py or not. Would you still want to help write the simple
>> caching code for challenger?
> 
> I cleaned up the caching code and added a simple in-memory dict
> caching provider that adds no further dependencies to onion-py. (It
> also has no provisions for eviction/flushing at all, but I will add
> that next. Right now everything is cached forever, but of course a
> new response from OnionOO replaces an old one.)

Yeah, I think we'll want to define a maximum lifetime of cache
entries, or the poor cache will explode pretty soon.
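
A minimal sketch of what such a maximum lifetime could look like for a
plain in-memory dict. This is only an illustration: the class name
ExpiringCache, its methods, and the injectable clock are hypothetical
and not part of onion-py or challenger.

```python
import time


class ExpiringCache(object):
    """In-memory dict cache whose entries expire after a maximum age.

    Hypothetical helper for illustration only.
    """

    def __init__(self, max_age_seconds, clock=time.time):
        self._max_age = max_age_seconds
        self._clock = clock      # injectable so tests can fake time
        self._entries = {}       # key -> (stored_at, value)

    def put(self, key, value):
        # A newer response simply replaces the older one.
        self._entries[key] = (self._clock(), value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at > self._max_age:
            # Too old: drop it and report a miss.
            del self._entries[key]
            return None
        return value

    def flush_expired(self):
        """Remove all expired entries; return how many were removed."""
        now = self._clock()
        expired = [key for key, (stored_at, _) in self._entries.items()
                   if now - stored_at > self._max_age]
        for key in expired:
            del self._entries[key]
        return len(expired)
```

Calling flush_expired() periodically keeps entries that are never
requested again from piling up; get() alone only evicts keys that are
looked up after they expire.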

> I can write the OnionOO API code and caching code for challenger,
> if I can use Python 3 and the requests library. (See below)

Great, your help would be much appreciated!  Want to send me a pull
request whenever you have something to merge?

See my response regarding Python 3 below.

> Of course I'd really like to actually have a user for onion-py,
> since it would help getting the necessary feedback and polish to
> push the library to version 1.0, but I understand if that isn't
> appropriate for this project.

My hope with challenger is that it gets written quickly, works quietly
for a year, and then disappears without anybody noticing.  I'd rather
not maintain yet another thing.  So, maybe Weather is a better
candidate for using onion-py than challenger.

>>> I don't really understand what the code does. What is meant by 
>>> "combining" documents? What exactly are we trying to measure?
>>> Once I know that and have thought of a sensible way to
>>> integrate it into onion-py I'm confident I can in fact write
>>> that glue code :)
>> Right now, the script sums up all graphs contained in Onionoo's 
>> bandwidth, clients, uptime, and weights documents.  It also
>> limits the range of the new graphs to max(first) to max(last) of
>> given input graphs.
>> 
>> For example, assume we want to know the total bandwidth provided
>> by the following 2 relays participating in the relay challenge:
>> 
>> datetime:  0, 1, 2, 3, 4, 5, ...
>> 
>> relay 1:     [5, 4, 5, 6]
>> relay 2:  [4, 3, 5, 4]
>> 
>> combined:    [8, 9, 9, 6]
>> 
>> This is not perfect for various reasons, but it's the best I came
>> up with yesterday.  Also, as we all know, perfect is the enemy of
>> good.
>> 
>> (If you're curious, reason #1: the graph goes down at the end,
>> and we can't say whether it's because relay 2 disappeared or did
>> not report data yet; reason #2: we're weighting both relays' B/s
>> equally, though relay 1 might have been online 24/7 and relay 2
>> only long enough that Onionoo doesn't put in null; there may be
>> more reasons.)
> 
> Ah, I see! :) So for scalar attributes of relays (such as
> consensus_weight_fraction) it's just a sum, and for histories it's
> the graphs combined as you just outlined. That makes sense, thank
> you!

Right.  Though details documents are not included, so just graphs, no
scalar attributes.
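
The combining step from the quoted example above can be sketched
roughly as follows. This is a simplified illustration, not the actual
script: it assumes each graph is given as a (first, values) pair with
integer interval indexes, and it counts missing or null values as 0.
The function name combine_graphs is made up for this sketch.

```python
def combine_graphs(graphs):
    """Sum graphs over the range max(first) to max(last).

    Each graph is a (first, values) pair: `first` is the index of the
    first data point and `values` holds one number (or None) per
    interval.  Returns (start, combined_values).
    """
    if not graphs:
        return None, []
    start = max(first for first, _ in graphs)
    end = max(first + len(values) - 1 for first, values in graphs)
    combined = []
    for t in range(start, end + 1):
        total = 0
        for first, values in graphs:
            i = t - first
            # Skip intervals outside this graph and null data points.
            if 0 <= i < len(values) and values[i] is not None:
                total += values[i]
        combined.append(total)
    return start, combined


relay1 = (1, [5, 4, 5, 6])   # first data point at datetime 1
relay2 = (0, [4, 3, 5, 4])   # first data point at datetime 0
print(combine_graphs([relay1, relay2]))   # (1, [8, 9, 9, 6])
```

The final 6 shows the first caveat mentioned above: relay 2 has no
value for the last interval, so the sum drops even though we cannot
tell whether the relay disappeared or just has not reported yet.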

>> I'm also not sure about Python 3.  Whatever we write needs to run
>> on Debian Wheezy with whatever libraries are present there.  If
>> they're all Python 3, great.  If not, can't do.
> 
> I would strongly prefer to use Python 3. I understand wanting to
> use debian stable (I use it myself), but Python 3 is 6 years old
> and Python 2 is completely dead and its use for new projects is not
> recommended. The only mandatory dependency for onion-py, and for
> me, is requests (I really dislike using urllib* directly - if you
> want to know why, check
> https://gist.github.com/kennethreitz/973705), and the
> python3-requests package in Wheezy is from 2012, and there is no
> python3-flask. :-(
> 
> Is there anything standing against using pip (python3-pip package)
> to install requests and flask from pypi?

If there's a way to build it only with packages coming out of Wheezy's
apt-get, our sysadmins will like us more, and that's a good thing.

Installing packages using Python-specific package managers is going to
make our sysadmins sad, so we should have a very good reason for
wanting such a package.  In general, we don't need the latest and
greatest package.  Unless we do.

All the best,
Karsten