[tor-dev] Get Stem and zoossh to talk to each other
atagar at torproject.org
Fri Jul 31 17:00:27 UTC 2015
Hi Philipp, sorry about the delay! I'm spread pretty thin right now. Would
you mind saying more about the use cases, and giving a mockup of what this
new domain specific language would look like in practice?
My first thought is "would such a language be useful enough to be worth
investing time to learn?". I've made lots of things that flopped because
they didn't serve a true need, and while a domain specific language for
descriptors sounds neat I'm not sure if I'm seeing a need for it.
Roger occasionally asks me to write one-off scripts to answer questions
about the tor network, such as "how do the votes of dirauth X compare with Y?"
or "how many relays are unmeasured by the bandwidth auths?"...
These questions generally take me fifteen minutes or so to answer. Yes, yes,
I'm the author of Stem so that's skewed. But still, the descriptor APIs are
simple enough that anyone should be able to do much the same with only a basic
knowledge of Python.
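For a rough sense of scale, here is a minimal sketch of the "unmeasured
relays" question. It deliberately uses plain string handling rather than
Stem's API, and it assumes consensus-style text in which each relay entry
starts with an "r " line and its "w " line carries "Unmeasured=1" when the
bandwidth auths have not measured it. The sample text and the function name
count_unmeasured are made up for illustration.

```python
# Sketch only: count relays marked unmeasured in consensus-style text.
# Assumption: one "r " line per relay, and "Unmeasured=1" on the relay's
# "w " line when no bandwidth authority measurement exists.

# Made-up sample entries, not real network data.
SAMPLE_CONSENSUS = """\
r relayA AAAA BBBB 2015-07-31 12:00:00 10.0.0.1 9001 0
s Fast Running Valid
w Bandwidth=20 Unmeasured=1
r relayB CCCC DDDD 2015-07-31 12:00:00 10.0.0.2 9001 0
s Fast Running Stable Valid
w Bandwidth=5400
r relayC EEEE FFFF 2015-07-31 12:00:00 10.0.0.3 443 0
s Running Valid
w Bandwidth=20 Unmeasured=1
"""


def count_unmeasured(consensus_text):
    """Return (unmeasured_relays, total_relays) for the given text."""
    total = 0
    unmeasured = 0
    for line in consensus_text.splitlines():
        if line.startswith("r "):
            # Each "r " line opens a new relay entry.
            total += 1
        elif line.startswith("w ") and "Unmeasured=1" in line.split():
            unmeasured += 1
    return unmeasured, total


unmeasured, total = count_unmeasured(SAMPLE_CONSENSUS)
print("%d of %d relays are unmeasured" % (unmeasured, total))
```

A real script would read a consensus from disk instead of the inline sample,
and would more likely go through a descriptor library than split lines by
hand.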
> Ideally, zoossh should do the heavy lifting as it's implemented in a
> compiled language.
This is assuming zoossh is dramatically faster than Stem by virtue of being
compiled. I know we've discussed this before but I forget the results - with
the latest tip of Stem (i.e., with lazy loading) how do they compare? I'd expect
time to be mostly bound by disk IO, so little to no difference.
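One way to check the disk IO guess is to time the read and the parse
separately. The following is a sketch under assumptions: the helper name
time_read_and_parse is hypothetical, and the parse argument is a placeholder
for whichever parser is being measured, not a call into Stem or zoossh.

```python
import time


def time_read_and_parse(path, parse):
    """Return (read_seconds, parse_seconds) for a single file.

    `parse` is any callable that accepts the file's bytes. It stands in
    for the parser under test and is an assumption of this sketch.
    """
    start = time.perf_counter()
    with open(path, "rb") as handle:
        content = handle.read()
    read_done = time.perf_counter()

    parse(content)
    parse_done = time.perf_counter()

    return read_done - start, parse_done - read_done
```

If the read time dominates, a faster parser will not change much. Note that
repeated runs over the same files are usually served from the operating
system's cache, so the first run is the one that reflects actual disk reads.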
> 1. Let zoossh do the data filtering and then return a list of files that
> are then parsed again by Stem. That's easy to implement, but can be
> quite inefficient if the filtering step still returns plenty of data.
Yup, agreed. This plan would essentially be to double parse the results and
I'd expect it to be far slower than using either library alone.
> 2. Have some IPC mechanism that passes objects from zoossh to Stem.
> Objects could be serialized in some way to minimize unnecessary
> parsing. While that might be the most efficient option for now, it
> probably requires too much work.
Again agreed. Theoretically possible - you could make a blank Stem descriptor
object, then populate its attributes with the zoossh parsed results. However,
this would require you to maintain the hand-built conversion function. And
again, I'm also doubtful it would yield a performance benefit in practice.
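To make the maintenance cost concrete, here is a sketch of what such a
hand-built conversion might look like. Everything in it is an assumption:
it supposes the other side serializes each parsed descriptor as JSON, the
field names in FIELD_MAP are invented, and a SimpleNamespace stands in for
the blank descriptor object rather than a real Stem class.

```python
import json
from types import SimpleNamespace

# Hypothetical mapping from serialized field names to attribute names.
# Every new or renamed field has to be added here by hand.
FIELD_MAP = {
    "Nickname": "nickname",
    "Fingerprint": "fingerprint",
    "Address": "address",
    "ORPort": "or_port",
}


def descriptor_from_json(serialized):
    """Build a descriptor-like object from one JSON-encoded record."""
    record = json.loads(serialized)
    descriptor = SimpleNamespace()  # stand-in for a blank descriptor object
    for source_name, attr_name in FIELD_MAP.items():
        # Fields missing from the record are set to None.
        setattr(descriptor, attr_name, record.get(source_name))
    return descriptor


# Made-up example record.
message = ('{"Nickname": "relayA", "Fingerprint": "ABCD", '
           '"Address": "10.0.0.1", "ORPort": 9001}')
desc = descriptor_from_json(message)
print(desc.nickname, desc.or_port)
```

The mapping table is the part that needs ongoing care: it has to track both
libraries whenever either one changes its field names.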
None of this is to say 'don't give it a shot'. If you think this would be a
fun project then by all means dig in! Just voicing my two cents that our
efforts might be better spent elsewhere. ;)