[tor-dev] txtorcon [GSoC 2013]

meejah meejah at meejah.ca
Tue Apr 16 18:21:03 UTC 2013


Anshul Singhle <anshul.singhle at gmail.com> writes:

> The aim of this project would be to use the protocol-parsing from stem
> instead of using the "re-implementation" in txtorcon. Since the
> protocol parsing is synchronous anyway(AFAICT) it makes sense to do
> this.

So far so good, yes. 

One thing to note is that while "most" of the parsing in txtorcon is
basically synchronous, some of the replies are huge. The one we care
about is "GETINFO ns/all", which gives TorState the initial list of
Router objects: there are ~3000 routers, so around 4*3000 lines to
parse, and that's way too long to pause the reactor. So there is some
incremental parsing in txtorcon (a state machine in TorState), and
this would likely be the "some modifications to stem" bit: providing a
way to feed lines of a consensus to some stem object and get back
objects representing the routers (which would be given to
txtorcon.Router instead of the current way of passing some strings to
update() in router.py).
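
To make the "feed lines in, get router objects out" idea concrete,
here is a rough sketch in plain Python. The names RouterStatusFeeder,
feed_line, finish and parse_all are made up for illustration; this is
not an existing stem or txtorcon API. It assumes the usual "r", "s",
"w", "p" lines per router and only looks at the first three.

```python
class RouterStatusFeeder:
    """Hypothetical line-at-a-time parser for "ns/all"-style entries."""

    def __init__(self):
        self._current = None  # router entry being assembled, if any

    def feed_line(self, line):
        """Consume one line; return a finished router dict, or None."""
        finished = None
        keyword, _, rest = line.partition(' ')
        if keyword == 'r':
            # A new "r" line closes the previous router and opens the next.
            finished = self._current
            fields = rest.split()
            self._current = {
                'nickname': fields[0],
                'address': fields[5],
                'or_port': int(fields[6]),
                'flags': [],
                'bandwidth': None,
            }
        elif self._current is None:
            pass  # nothing open yet; ignore stray lines
        elif keyword == 's':
            self._current['flags'] = rest.split()
        elif keyword == 'w':
            for item in rest.split():
                key, _, value = item.partition('=')
                if key == 'Bandwidth':
                    self._current['bandwidth'] = int(value)
        # Other keywords (e.g. "p") are skipped in this sketch.
        return finished

    def finish(self):
        """Flush the last router once the reply has ended."""
        finished, self._current = self._current, None
        return finished


def parse_all(lines):
    """Drive the feeder over a list of lines (synchronous, for testing)."""
    feeder = RouterStatusFeeder()
    routers = []
    for line in lines:
        router = feeder.feed_line(line)
        if router is not None:
            routers.append(router)
    last = feeder.finish()
    if last is not None:
        routers.append(last)
    return routers
```

In txtorcon the idea would be to call something like feed_line() once
per received line rather than looping as parse_all() does, so each
call does only a small amount of work before returning to the reactor.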

> Therefore, it would also make sense to identify other synchronous
> activities being done by txtorcon and use Stem to do those too (since
> we will be instantiating a stem object anyway).

Yes, very possibly!

> So as I see it, the majority of the changes will be in
> torcontrolprotocol.py(https://github.com/meejah/txtorcon/blob/master/txtorcon/torcontrolprotocol.py)
> and in that in the TorControlProtocol and TorProtocolError
> parts(mainly) 

Yes.
There will hopefully be some changes in TorState, to do with the
Router information I mention above (TorState keeps the list of Router
objects, which represent the Tor relays in the consensus).

> These classes have some functions that are used by others and some
> which are for internal use(i guess the ones with _ are internal if I
> got the naming convention right).

Yes.

> So we will just throw away the internal functions(should we?) and
> keep the external function api the same, the difference being these
> external functions will now call stem instead of doing the heavy
> lifting themselves.

Yes.

> I have a question about LineOnlyReciever - Are there other types of
> receivers which txtorcon doesn't implement but stem does? 

LineOnlyReceiver is a Twisted thing; it handles buffering etc. and
delivers one line at a time to TorControlProtocol (via
lineReceived). It could be that a different superclass is appropriate.
Stem uses the socket stuff directly, in a synchronous (threaded) way,
so I wouldn't expect any sharing to happen here.
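
For anyone unfamiliar with it, the buffering amounts to roughly the
following. This is a plain-Python stand-in, not Twisted's actual code;
the LineBuffer class and its method names are invented for the
example, and it assumes the "\r\n" delimiter.

```python
class LineBuffer:
    """Stand-in for a line-oriented receiver: bytes in, whole lines out."""

    delimiter = b'\r\n'

    def __init__(self, line_received):
        self._buffer = b''
        self._line_received = line_received  # called once per full line

    def data_received(self, data):
        # Data arrives in arbitrary chunks; keep any trailing partial line.
        self._buffer += data
        *lines, self._buffer = self._buffer.split(self.delimiter)
        for line in lines:
            self._line_received(line)
```

The callback only ever sees complete lines, however the bytes happen
to be split up on the wire.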

Aside: note that the camelCase methods are all Twisted overrides and
the_underscore_ones are txtorcon's.

> So in essence i guess my questions is - What is the primary focus of
> this project - Is it to bring the protocol specific code in one place
> or to use Stem wherever possible? or both?

The main thrust of the project is to have one Python implementation of
the protocol parsing code -- namely Stem. The reasoning is that Stem
already parses a lot more things than txtorcon anyway (and I have no
intention of re-implementing all that), so it makes sense to leverage
that in txtorcon. From atagar's perspective, he'd like more people
exercising Stem code, and we agree it's somewhat silly to have two
Python implementations of this, even if some of it is rather simple.

If you can identify other things that make sense for txtorcon to use
from Stem, that's a bonus. I would say, however, that I would foresee
any users who want pieces of functionality from Stem to "just use
Stem". That is, I don't want to *add* an API in txtorcon for loading +
parsing consensus files, etc. -- users who want that should just use
Stem (since there are no Python/Twisted async file APIs anyway). A good
example of this is the Twisted version of torperf.

By the same token, there will very likely be users of txtorcon who
don't care about stem functionality so txtorcon users shouldn't be
*forced* to learn about stem (installing it as a dependency is fine). 

I'm getting at the event stuff here: I would like it to remain
optional to receive the event text itself versus a stem
RouterStatusEntryV3 instance in an event callback (i.e. one registered
via TorControlProtocol.add_event_listener). The Twisted version of
torperf would make a fine place to "test out" this event stuff
somewhere that will actually be used/released (e.g. instead of another
example).
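
One way the "optional" part could look, as a sketch only: the
EventDispatcher class, the parsed= flag and the parser callable below
are all hypothetical (the real add_event_listener has no such flag
today), and the parser stands in for whatever stem would provide.

```python
class EventDispatcher:
    """Hypothetical dispatcher: listeners choose raw text or a parsed object."""

    def __init__(self, parser):
        self._parser = parser    # callable(event, text) -> parsed object
        self._listeners = {}     # event name -> [(callback, wants_parsed)]

    def add_event_listener(self, event, callback, parsed=False):
        self._listeners.setdefault(event, []).append((callback, parsed))

    def dispatch(self, event, text):
        parsed_obj = None
        for callback, wants_parsed in self._listeners.get(event, []):
            if wants_parsed:
                # Parse lazily, and at most once per event.
                if parsed_obj is None:
                    parsed_obj = self._parser(event, text)
                callback(parsed_obj)
            else:
                callback(text)
```

Listeners that never ask for parsed objects never trigger the parser,
so they don't need to know stem is involved at all.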

Super extra bonus points if we can a) add a zope.interface for the
callback (it lacks one currently) and b) use adapters to figure out
which kind of object the listener wants. I haven't really thought
about this, but it could be fun (or not possible).

Thanks for the interest,
meejah

