[tor-dev] Can we stop sanitizing nicknames in bridge descriptors?

Mon May 21 09:05:47 UTC 2012

On 5/19/12 11:41 AM, Sebastian G. <bastik.tor> wrote:
> Karsten Loesing, 16.05.2012 08:47:
>> On 5/2/12 2:30 PM, Karsten Loesing wrote:
>>> If nobody objects within the next, say, two weeks, I'm going to make an
>>> old tarball from 2008 available with original nicknames.  And if nobody
>>> screams, I'll provide the remaining tarballs containing original
>>> nicknames another two weeks later.
>>
>> Here we go.  These are the sanitized bridge descriptors from May 2008
>> including original bridge nicknames:
>>
>> http://freehaven.net/~karsten/volatile/bridges-2008-05-nicknames.tar.bz2
>>
> 
> Here we go with the similarities of bridge and relay nicknames.

Thanks for spending this much time on the analysis!

Here's what I did with your findings.txt:

- extract unique fingerprint pairs of relays and bridges that you found
as having similar nicknames,

- look through descriptor archives to see if relay and bridge were
running in the same /24 at any time in May 2008, and

- determine the absolute and relative number of bridges in a given
network status that could have been located via nickname similarity.
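The three steps above can be sketched roughly as follows. This is only an illustration, not the script I actually ran: the function names and the input shapes (a list of fingerprint pairs, plus dicts mapping each fingerprint to the IPv4 addresses seen for it during the month) are assumptions, and it presumes access to the original, unsanitized bridge addresses.

```python
import ipaddress

def same_slash24(addr_a, addr_b):
    """Return True if two IPv4 address strings fall in the same /24."""
    net_a = ipaddress.ip_network(f"{addr_a}/24", strict=False)
    return ipaddress.ip_address(addr_b) in net_a

def confirmed_pairs(pairs, relay_addrs, bridge_addrs):
    """Keep the unique (relay_fp, bridge_fp) pairs that shared a /24.

    pairs:        iterable of (relay_fp, bridge_fp) guesses (hypothetical input)
    relay_addrs:  dict mapping relay_fp to the addresses seen that month
    bridge_addrs: dict mapping bridge_fp to the addresses seen that month
    """
    confirmed = set()
    for relay_fp, bridge_fp in set(pairs):  # set() drops duplicate guesses
        if any(same_slash24(r, b)
               for r in relay_addrs.get(relay_fp, ())
               for b in bridge_addrs.get(bridge_fp, ())):
            confirmed.add((relay_fp, bridge_fp))
    return confirmed

def located_share(status_bridges, confirmed):
    """Absolute and relative number of bridges in one status that were located."""
    listed = set(status_bridges)
    located = {bridge_fp for _, bridge_fp in confirmed} & listed
    return len(located), (len(located) / len(listed) if listed else 0.0)
```

Calling located_share() once per bridge network status gives the per-status minimum, maximum, and mean.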

The result is that 24 of your 81 guesses (30%) were correct in the sense
that the bridge was running in the same /24 as the relay with the
similar nickname at least once.  At any time in May 2008, you'd have
located between 1 and 6 bridges (2.5% to 18%) via nickname similarity,
with 3 bridges (10%) on average.
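
For anyone who wants to check the arithmetic, the 30% figure is just the rounded ratio of correct guesses to total guesses.  The variable names below are only for illustration.  The per-status percentages can't be recomputed this way, because they depend on how many bridges each network status listed, which isn't given here.

```python
guesses = 81   # unique relay/bridge pairs with similar nicknames
correct = 24   # pairs that shared a /24 at least once in May 2008

share = correct / guesses
print(f"{share:.1%}")  # prints 29.6%, which rounds to the 30% quoted above
```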

I think it's acceptable to publish more recent bridge descriptors with
nicknames a week from now.  Results may look quite different with
1000 bridges instead of 30.

Again, thanks for running this analysis!  Maybe you're interested in
automating your comparison and re-running it for a 2012 tarball?

Thanks,
Karsten