[tor-dev] Guardiness: Yet another external dirauth script
George Kadianakis
desnacked at riseup.net
Wed Sep 17 11:25:22 UTC 2014
Damian Johnson <atagar at torproject.org> writes:
>> - Q: Why do you use stem instead of parsing consensuses with Python on your own?
>>
>> This is another part where I might have taken the wrong design
>> decision, but I decided to not get into the consensus parsing business
>> and just rely on stem.
>>
>> This is also because I was hoping to use stem to verify consensus
>> signatures. However, now that we might use Daniel's patch to populate
>> our consensus database, maybe we don't need to treat consensuses as
>> untrusted anymore.
>>
>> If you think that I should try to parse the consensuses on my own,
>> please tell me and I will give it a try. Maybe it will be
>> fast. Definitely not as fast as summary files, but maybe we can parse
>> 3 months' worth of consensuses in 15 to 40 seconds.
>
> I'm not sure why you think it was the wrong choice. If Stem isn't
> providing you the performance you want then seems like speeding it up
> is the right option rather than writing your own parser. That is, of
> course, unless you're looking for something highly specialized in
> which case have fun.
>
> Nick improved parsing performance by around 30% in response to this...
>
> https://trac.torproject.org/projects/tor/ticket/12859
>
> Between that and turning off validation I'd be a little curious where
> the time is going if it's still too slow for you.
Indeed, our use case is quite specialized. The only thing the
guardiness script cares about is whether relays have the guard
flag. No other consensus parsing actually needs to happen.

However, you have a point that stem performance could be improved and
I will look a bit more into stem parsing and see what I can do.
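
To make "specialized" concrete, here is a rough sketch of what a
guard-flag-only scan could look like. It is not the guardiness script
and not how stem works. It assumes the consensus is already trusted
(no signature checking at all), and it relies only on the "r" and "s"
line layout of router status entries. The function name and the sample
entries are made up for illustration.

```python
def guard_identities(consensus_text):
    """Return the base64 identities of relays listed with the Guard flag.

    Only "r" (router) and "s" (flags) lines are inspected; everything
    else in the document is skipped. No validation is performed.
    """
    guards = set()
    current_identity = None
    for line in consensus_text.splitlines():
        if line.startswith("r "):
            # "r" nickname identity digest date time IP ORPort DirPort
            fields = line.split(" ")
            current_identity = fields[2] if len(fields) > 2 else None
        elif line.startswith("s ") and current_identity is not None:
            if "Guard" in line.split(" ")[1:]:
                guards.add(current_identity)
            current_identity = None
    return guards


# Made-up entries, only here to exercise the function.
SAMPLE = "\n".join([
    "r relayA AAAAidentityA digestA 2014-09-17 10:00:00 192.0.2.1 9001 0",
    "s Fast Guard Running Stable Valid",
    "r relayB BBBBidentityB digestB 2014-09-17 10:00:00 192.0.2.2 9001 0",
    "s Fast Running Valid",
])

print(guard_identities(SAMPLE))
```

Whether something this bare is acceptable depends on whether we can
really stop treating the consensuses as untrusted.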

That said, currently stem parses (with validation enabled) 24
consensuses in 25 seconds. That's roughly one consensus per second.

If we are aiming for 7000 consensuses in less than a minute, we need to
parse ~120 consensuses a second. That will probably require quite some
optimization in stem, I think.
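
For reference, the arithmetic behind those numbers as a few lines of
Python. The inputs are the figures from this mail; the variable names
are just for illustration.

```python
# Measured: stem with validation enabled.
measured_consensuses = 24
measured_seconds = 25

# Goal: 7000 consensuses in under a minute.
target_consensuses = 7000
target_seconds = 60

current_rate = measured_consensuses / measured_seconds   # consensuses per second
required_rate = target_consensuses / target_seconds      # consensuses per second
speedup = required_rate / current_rate                    # factor we would need
hours_at_current_rate = target_consensuses / current_rate / 3600

print(f"current:  {current_rate:.2f}/s")
print(f"required: {required_rate:.1f}/s")
print(f"speedup:  {speedup:.0f}x")
print(f"7000 consensuses at the current rate: {hours_at_current_rate:.1f} hours")
```

So the gap is about two orders of magnitude, and at the current rate
the full set would take around two hours.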