[tor-dev] Fwd: Re: Can we stop sanitizing nicknames in bridge descriptors?

Karsten Loesing karsten at torproject.org
Tue May 29 17:43:54 UTC 2012


On 5/26/12 9:30 AM, Sebastian G. <bastik.tor> wrote:
> Karsten Loesing, 22.05.2012 09:24:
>>> Unless one objects or you disagree I'm going to upload the files I
>>> created and explain how and maybe I can say even why.
>>
>> No objections at all.  Open discussion is good.
>>
>>> I created a blog, just because I wanted one at some point in the past, but
>>> found it silly. That's the channel I planned to use. Maybe it's OK to put
>>> it on a Tor list as well, but maybe it's considered noise.
>>
>> I wonder if the Tor wiki would be a better place to collect ideas for
>> reversing the bridge descriptor sanitizing process.  Feel free to grab a
>> new page in doc/ and start describing what you did.
>>
> 
> I did just that.
> 
> https://trac.torproject.org/projects/tor/wiki/doc/DataExtractionForComparison

Thanks for creating that page.  Looks like a fine start, though you'll
want to automate more things when looking at 2012 tarballs.

grep and friends are fine tools to process Tor descriptors.  If you can,
find a Unix/Linux-like environment for Windows (Cygwin?) and combine the
powers of grep with sort, uniq, and maybe sed or awk.  These tools are
friggin' fast!
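
If no Unix-like shell is at hand, the same idea as a
"grep ... | sort | uniq -c | sort -rn" pipeline can be sketched in a few
lines of Python.  This is only an illustration, not a recommended tool: the
function name count_matching_lines and the sample descriptor lines are made
up for this example, and real descriptors would be read from the extracted
tarball files instead.

```python
from collections import Counter


def count_matching_lines(lines, prefix):
    """Count identical lines that start with `prefix`, most frequent first.

    Roughly what `grep '^prefix' | sort | uniq -c | sort -rn` does.
    (Hypothetical helper, named here for illustration only.)
    """
    counts = Counter(line.strip() for line in lines if line.startswith(prefix))
    return counts.most_common()


# Made-up sample input, NOT real descriptor data.
sample = """\
router Unnamed 10.0.0.1 9001 0 0
platform Tor 0.2.2.35 on Linux x86_64
router Unnamed 10.0.0.2 9001 0 0
platform Tor 0.2.2.35 on Linux x86_64
router Unnamed 10.0.0.3 443 0 0
platform Tor 0.2.3.15-alpha on Windows 7
""".splitlines()

for line, n in count_matching_lines(sample, "platform "):
    print(n, line)
```

To run this over real files, pass an open file object (or the lines of
several files chained together) instead of the sample list.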

If you're comfortable with Java and want to do more fancy stuff with Tor
descriptors, take a look at metrics-lib:

https://gitweb.torproject.org/metrics-lib.git

If you're a Python person, you'll like stem, even though it only
implements parsing of a subset of Tor descriptors.  More to come soon:

https://gitweb.torproject.org/stem.git

Best,
Karsten

