[tor-dev] Can we stop sanitizing nicknames in bridge descriptors?
karsten at torproject.org
Mon May 21 09:05:47 UTC 2012
On 5/19/12 11:41 AM, Sebastian G. <bastik.tor> wrote:
> Karsten Loesing, 16.05.2012 08:47:
>> On 5/2/12 2:30 PM, Karsten Loesing wrote:
>>> If nobody objects within the next, say, two weeks, I'm going to make an
>>> old tarball from 2008 available with original nicknames. And if nobody
>>> screams, I'll provide the remaining tarballs containing original
>>> nicknames another two weeks later.
>> Here we go. These are the sanitized bridge descriptors from May 2008
>> including original bridge nicknames:
> Here we go with the similarities of bridge and relay nicknames.
Thanks for spending this much time on the analysis!
Here's what I did with your findings.txt:
- extract unique fingerprint pairs of relays and bridges that you found
as having similar nicknames,
- look through descriptor archives to see if relay and bridge were
running in the same /24 at any time in May 2008, and
- determine the absolute and relative number of bridges in a given
network status that could have been located via nickname similarity.
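
The three steps above could be sketched roughly as follows. This is only
an illustration, not the script that was actually run: the function names
and input shapes are assumptions, the inputs are taken to be already
parsed from archives that still contain original addresses, and the
sketch ignores whether relay and bridge were seen in the /24 at the same
moment.

```python
import ipaddress


def slash24(addr):
    """Return the /24 network that contains an IPv4 address string."""
    return ipaddress.ip_network(f"{addr}/24", strict=False)


def confirmed_pairs(pairs, relay_addrs, bridge_addrs):
    """Keep (relay_fp, bridge_fp) pairs that ever shared a /24.

    pairs: iterable of (relay fingerprint, bridge fingerprint) guesses.
    relay_addrs / bridge_addrs: hypothetical dicts mapping a fingerprint
    to the set of IPv4 addresses seen for it during the month.
    """
    confirmed = set()
    for relay_fp, bridge_fp in set(pairs):  # set() drops duplicate guesses
        relay_nets = {slash24(a) for a in relay_addrs.get(relay_fp, ())}
        bridge_nets = {slash24(a) for a in bridge_addrs.get(bridge_fp, ())}
        if relay_nets & bridge_nets:
            confirmed.add((relay_fp, bridge_fp))
    return confirmed


def located_per_status(statuses, confirmed):
    """Return (count, fraction) of locatable bridges per network status.

    statuses: list of network statuses, each a list of bridge fingerprints.
    """
    located = {bridge_fp for _, bridge_fp in confirmed}
    results = []
    for bridges in statuses:
        running = set(bridges)
        hits = len(running & located)
        results.append((hits, hits / len(running) if running else 0.0))
    return results
```

The minimum, maximum, and mean over the per-status results then give the
absolute and relative numbers for the month.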
The result: 24 of your 81 guesses (30%) were correct in the sense that
the bridge was at least once running in the same /24 as the relay with
the similar nickname. At any time in May 2008, you'd have located
between 1 and 6 bridges (2.5% to 18%) via nickname similarity, with
3 bridges (10%) on average.
I think it's acceptable to publish more recent bridge descriptors with
nicknames a week from now. Results may look quite different with
1000 bridges instead of 30.
Again, thanks for running this analysis! Maybe you're interested in
automating your comparison and re-running it for a 2012 tarball?