[tor-bugs] #12085 [Tor Weather]: Develop a python wrapper for onionoo services

Thu Jun 5 04:23:08 UTC 2014

#12085: Develop a python wrapper for onionoo services
-----------------------------+-----------------------------
     Reporter:  sreenatha    |      Owner:
         Type:  task         |     Status:  needs_review
     Priority:  normal       |  Milestone:
    Component:  Tor Weather  |    Version:
   Resolution:               |   Keywords:  weather-rewrite
Actual Points:               |  Parent ID:
       Points:               |
-----------------------------+-----------------------------

Comment (by meejah):

 This was originally in an email thread; adding to trac for posterity.

 Hi Sreenatha,

 I'm taking a look through the code now, and making some notes /
 comments. Should I just put this on Trac (in future)? I can't recall
 that we talked specifically about code-review type stuff.

 Perhaps it should be on tor-dev if Trac isn't appropriate?
 Anyway.

 Overall, the code reads nice and straight-forward at first glance
 through.

 I like the use of memcached for keeping documents around!

 It would be nice to have some docstrings on at least some of the
 methods. For example, I don't understand the str type-checking
 vs. .encode('utf-8') json serialzation code (as in: why?) That said, I'm
 certainly not suggesting EVERYTHING needs a docstring -- "RelaySummary"
 conveys enough "documentation" by itself ;)

 I would force Python2 to use new-style objects everywhere (that is, all
 classes should derive from object, like "class Foo(object):"

 There should also be some unit-tests for this code. It's good practice
 to write tests. It is important in dynamic languages like Python to run
 every line just to make sure you don't have a typo, etc.

 You're using "requests"! Great :)

 I see that many of the classes are basically "just" copying (and
 possibly also re-arranging) attributes and so forth from "document"
 objects. This seems like a lot of duplicated/"boilerplate" code and it
 would be really nice to reduce this. One idea might be to create a
 base-class that can simply be told what attributes to "forward", and use
 a __getattr__ overide (or maybe @property decorators) to accomplish
 this. Something like:

     class _DocumentForwarder(object):
         doc = None # set by subclass, the document instance
         keys = dict() # from attribute name to document.get() name
         def __getattr__(self, k):
             if k in self.__dict__:
                 return self.__dict__[k]
             # we purposely let KeyError propagate up, if thrown
             doc_key = self.keys[k]
             # this lets us do "keys['foo'] = None" in subclasses instead
             # of "keys['foo'] = 'foo'"
             if not doc_key:
                 doc_key = k
             return self.doc.get(doc_key)

     class RelaySummary(_DocumentForwarder):
         def __init__(self, document):
             self.doc = document
             self.keys = dict(nickname='n', fingerprint='f', addresses='a',
 running='r')

     import unittest
     class Test(unittest.TestCase):
         def test_simple(self):
             mydoc = dict(n='blam', f='floom', a='amamamamama',
 r='ruuuuuuuuuuun')
             x = RelaySummary(mydoc)
             self.assertEqual(x.nickname, 'blam')
             self.assertEqual(x.fingerprint, 'floom')
             self.assertRaises(KeyError, lambda: x.n)

 On the other hand, it kind of seems like you're simply providing
 attribute-style access to the "document" dict, with some multi-level
 objects in a few spots (and possibly renaming a some things). I would
 ask whether this really gains anything over just returning the document,
 or wrapping it directly with attribute-style access provider
 instances. Questions that might help answer this would be:

 1. Will you need to know what type of document you have after calling
 get_response()? If code like "if type(resp) is Summary" can be avoided,
 I think that's A Very Good Thing Indeed :)

 2. Could we just run a couple renaming operations over the dict we get
 back from Onionoo, instead? If this is just convenience, that might work
 fine...

 3. Are the names of things in the Onionoo documents well-known (e.g. to
 Onionoo users, or tor-metrics-people)? If so, it might be better for the
 people most-expected to read this code to see the "real" document
 attribute names (attribute-style access or not).

 Also, if it's kept like it is, you should probably acknowledge in the
 docstring that it started from code by Lukas Erlacher (duk3luk3).

 This is probably more just a style-thing, but like the requests library,
 would a method-based API for "do a request to onionoo please" maybe work
 more naturally? I personally find "build an object, change the object,
 access result from object" less straightfoward than a method that takes
 the args it needs, and returns the result.

 I guess I'm saying a module-level method "get(request_instance)" would
 return a OnionooResponse... or even just "get(a, few, args)" would do
 that, and there's no Request object at all. If the code using this will
 do many of the very same (or really similar) requests, than perhaps
 having a Request object might make sense.

 thanks,
 meejah

--
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/12085#comment:3>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online