[tor-dev] Get Stem and zoossh to talk to each other
phw at nymity.ch
Tue Jul 28 21:01:05 UTC 2015
I'm interested in building a lightweight, internal domain-specific
language to explore archived Tor data. The goal is to make it easy to
answer questions like the one that recently came up on tor-relays, "how
many guards shift location significantly across the Internet, and how
often?" Combining Stem and zoossh seems like a good solution.
Ideally, zoossh should do the heavy lifting as it's implemented in a
compiled language. For data exploration, however, having a Stem-enabled
Python shell with a set of analysis methods sounds better. Now the
question is how to pass potentially large numbers of already-parsed
consensuses and descriptors from zoossh to Stem. In a perfect world, we
would have bindings to use zoossh in Python. The gopy folks are
working on that, but it's a young project; interfaces are not yet
supported. Two workarounds come to mind until gopy catches up, both
requiring some glue code:
1. Let zoossh do the data filtering and then return a list of files that
are then parsed again by Stem. That's easy to implement, but can be
quite inefficient if the filtering step still returns plenty of data.
2. Have some IPC mechanism that passes objects from zoossh to Stem.
Objects could be serialised in some way to minimise unnecessary
parsing. While that might be the most efficient option for now, it
probably requires too much work.
3. ...something else I didn't consider?
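
To make the second option a bit more concrete, here is a minimal sketch of
what the Python side could look like. It assumes a small Go helper built on
zoossh that writes one JSON object per line to stdout; that helper, its name,
and the record fields are hypothetical, since zoossh itself is a library and
does not ship such a tool. Note that this yields plain dicts rather than Stem
descriptor objects, so the analysis methods would have to work on those.

```python
import json
import subprocess
from typing import IO, Dict, Iterator, List


def iter_records(stream: IO[str]) -> Iterator[Dict]:
    """Yield one dict per non-empty JSON line read from the stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def run_filter(cmd: List[str]) -> Iterator[Dict]:
    """Run a (hypothetical) zoossh-based helper and stream its records.

    Records are yielded as they arrive, so a large result set does not
    have to fit in memory at once.
    """
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        yield from iter_records(proc.stdout)
    # Leaving the with-block waits for the helper, so returncode is set.
    if proc.returncode != 0:
        raise RuntimeError("helper exited with status %d" % proc.returncode)
```

From a Python shell this could then be used along the lines of
`for rec in run_filter(["./zoossh-filter", "Guard"]): ...`, where the binary
name and its argument are again made up for illustration. JSON lines are only
one possible wire format; anything both sides can stream would do.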
Please let me know if you have any thoughts.