Hi Philipp, sorry about the delay! Spread pretty thin right now. Would you mind saying more about the use cases, and giving a mockup of what this new domain-specific language would look like in practice?
My first thought is "would such a language be useful enough to be worth investing time to learn?". I've made lots of things that flopped because they didn't serve a true need, and while a domain-specific language for descriptors sounds neat, I'm not sure I'm seeing a need for it.
Roger occasionally asks me to write one-off scripts to answer questions about the Tor network, such as "how do the votes of dirauth X compare with Y?" or "how many relays are unmeasured by the bandwidth auths?"...
https://stem.torproject.org/tutorials/examples/compare_flags.html https://stem.torproject.org/tutorials/examples/votes_by_bandwidth_authoritie...
These questions generally take me fifteen minutes or so to answer. Yes, yes, I'm the author of Stem so that's skewed. But still, the descriptor APIs are simple enough that anyone should be able to do much the same with only a basic knowledge of Python.
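To give a feel for the scale of such a script, here is a rough sketch of the "unmeasured relays" question. It is only an illustration, not the linked tutorial code: it reads the consensus rather than individual votes, the helper name count_unmeasured and the file path are made up for the example, and it assumes router status entries expose an is_unmeasured attribute. The counting is kept separate from the parsing so it can be tried without Stem installed.

```python
def count_unmeasured(entries):
    """Return (unmeasured, total) for an iterable of router status entries.

    Assumes each entry exposes a boolean 'is_unmeasured' attribute;
    entries without it are treated as measured.
    """
    unmeasured = 0
    total = 0

    for entry in entries:
        total += 1

        if getattr(entry, 'is_unmeasured', False):
            unmeasured += 1

    return unmeasured, total


# With Stem installed, usage might look like the following. The path is an
# assumption, adjust it to wherever your tor data directory lives.
#
#   from stem.descriptor import parse_file
#
#   entries = parse_file('/var/lib/tor/cached-consensus')
#   unmeasured, total = count_unmeasured(entries)
#   print('%i of %i relays are unmeasured' % (unmeasured, total))
```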
> Ideally, zoossh should do the heavy lifting as it's implemented in a compiled language.
This is assuming zoossh is dramatically faster than Stem by virtue of being compiled. I know we've discussed this before but I forget the results - with the latest tip of Stem (i.e., with lazy loading) how do they compare? I'd expect the time to be mostly bound by disk IO, so little to no difference.
> - Let zoossh do the data filtering and then return a list of files that are then parsed again by Stem. That's easy to implement, but can be quite inefficient if the filtering step still returns plenty of data.
Yup, agreed. This plan would essentially double-parse the results, and I'd expect it to be far slower than using either library alone.
> - Have some IPC mechanism that passes objects from zoossh to Stem. Objects could be serialized in some way to minimize unnecessary parsing. While that might be the most efficient option for now, it probably requires too much work.
Again agreed. Theoretically possible - you could make a blank Stem descriptor object, then populate its attributes with the zoossh parsed results. However, this would require you to maintain the hand-built conversion function. And again, I'm also doubtful it would yield a performance benefit in practice.
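As a rough picture of what that hand-built conversion might involve, here is a sketch under several assumptions: that zoossh would hand over one JSON object per descriptor, that its field names look like the keys below (they are invented for the example), and with a plain SimpleNamespace standing in for a real Stem descriptor object. The function name from_zoossh is likewise hypothetical.

```python
import json
from types import SimpleNamespace

# Hypothetical mapping from zoossh field names to Stem-style attribute names.
# Every new field either side adds means another entry to maintain by hand.
FIELD_MAP = {
    'Fingerprint': 'fingerprint',
    'Nickname': 'nickname',
    'Address': 'address',
    'ORPort': 'or_port',
}


def from_zoossh(serialized):
    """Build a stand-in descriptor from one JSON record.

    The record format is an assumption, and SimpleNamespace is used in
    place of a real Stem descriptor class.
    """
    record = json.loads(serialized)
    desc = SimpleNamespace()

    for zoossh_name, attr_name in FIELD_MAP.items():
        # Fields missing from the record are left as None.
        setattr(desc, attr_name, record.get(zoossh_name))

    return desc
```

Even in this toy form the JSON still has to be serialized on one side and parsed on the other, which is part of why I doubt it would come out ahead.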
None of this is to say 'don't give it a shot'. If you think this would be a fun project then by all means dig in! Just voicing my two cents that our efforts might be better spent elsewhere. ;)
Cheers! -Damian