[tor-bugs] #2372 [BridgeDB]: Export BridgeDB's pool assignments

Tor Bug Tracker & Wiki torproject-admin at torproject.org
Tue Jan 25 09:12:14 UTC 2011


#2372: Export BridgeDB's pool assignments
-------------------------+--------------------------------------------------
 Reporter:  karsten      |       Owner:     
     Type:  enhancement  |      Status:  new
 Priority:  normal       |   Milestone:     
Component:  BridgeDB     |     Version:     
 Keywords:               |      Parent:     
-------------------------+--------------------------------------------------

Comment(by arma):

 I don't see an issue with the sanitization approach you describe. Again,
 the best plan there is probably to write up a quick summary of what
 exactly the transform is, and for the items in the sanitized form, why you
 believe they're safe and/or why you still want them. Then when that
 settles, publish some sample sanitized output and let people pick at it.

 One issue that comes to mind that we might want to research is how often a
 given bridge moves IP address. The method you describe above would lose
 that info, yes? Whereas if we do a keyed hash of the IP address (and never
 disclose the key), we could distinguish "same" from "different". I
 remember we had the keyed hash design in some other sanitization context,
 but I don't remember which one -- how is the idea working out in that
 other context?

 (It's possible that we already do the keyed hash for the regular bridge
 descriptors, so we would just need to match up the sha1(fingerprint) in
 this file with the sha1(fingerprint) in that file and we could look up the
 IP address. In which case maybe there's merit in doing the same keyed hash
 in both places, to ease the job of future researchers.)
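
 A minimal sketch of the keyed-hash idea, assuming HMAC-SHA256 as the keyed
 hash (the ticket does not say which construction is used elsewhere). The
 function name and the key value are hypothetical; the point is only that
 equal addresses map to equal tokens, while the address itself cannot be
 looked up without the undisclosed key.

```python
import hashlib
import hmac
import ipaddress


def keyed_ip_hash(ip: str, secret_key: bytes) -> str:
    """Return a stable token for an IP address under a secret key.

    Hypothetical helper: HMAC-SHA256 is an assumption, not necessarily
    what any existing sanitizer uses.
    """
    # Hash the packed bytes so equivalent spellings of one address
    # (e.g. compressed vs. expanded IPv6) produce the same token.
    packed = ipaddress.ip_address(ip).packed
    return hmac.new(secret_key, packed, hashlib.sha256).hexdigest()


# Placeholder key for illustration; a real key would be random and never published.
key = b"hypothetical secret key"

first = keyed_ip_hash("192.0.2.10", key)
again = keyed_ip_hash("192.0.2.10", key)
moved = keyed_ip_hash("192.0.2.11", key)

print(first == again)  # bridge stayed on the same address
print(first == moved)  # bridge moved to a different address
```

 Using the same key in both data sets would make the tokens comparable
 across files; using different keys would keep them unlinkable.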

 The main question that I want to answer with this data actually is "what's
 the correlation between which pool the bridge is in and whether that
 bridge sees a lot of use from a given country". My guess is there are
 periods of time when the HTTP bridges are wildly popular in China, and
 then periods when they are pretty much unused (e.g. because they're not
 reachable). I wonder how it looks for other countries.
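
 As a rough sketch of how that question could be approached once pool
 assignments are exported: join them with per-bridge usage by hashed
 fingerprint and total the usage per (pool, country). The record layouts,
 the function name, and all sample values below are made up for
 illustration; they are not the format proposed in this ticket.

```python
from collections import defaultdict


def usage_by_pool_and_country(assignments, usage):
    """Sum reported usage per (pool, country) pair.

    assignments: dict mapping hashed fingerprint -> pool name (hypothetical layout)
    usage: iterable of (hashed fingerprint, country code, user count) tuples
    """
    totals = defaultdict(float)
    for fingerprint, country, users in usage:
        # Bridges missing from the assignment export are grouped separately.
        pool = assignments.get(fingerprint, "unknown")
        totals[(pool, country)] += users
    return dict(totals)


# Made-up sample data, only to show the shape of the computation.
assignments = {"aaaa": "http", "bbbb": "email", "cccc": "http"}
usage = [
    ("aaaa", "cn", 40),
    ("cccc", "cn", 25),
    ("bbbb", "cn", 3),
    ("bbbb", "ir", 12),
    ("dddd", "ir", 5),
]

for (pool, country), users in sorted(usage_by_pool_and_country(assignments, usage).items()):
    print(pool, country, users)
```

 Running the same aggregation per day or week would show whether a pool's
 popularity in a country comes and goes over time.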

 (There are variations of this question that I also want to know the answer
 to, that don't require this data at all, such as "what's the correlation
 between the bridge's ORPort and its use in various countries".)

 As for changing BridgeDB to export its pool assignments in this format,
 that's fine by me. It will be much easier for you to pick through than
 having somebody grep lines from logs. Ask Andrew which Python developer
 you should point at this.

-- 
Ticket URL: <https://trac.torproject.org/projects/tor/ticket/2372#comment:1>
Tor Bug Tracker & Wiki <https://trac.torproject.org/>
The Tor Project: anonymity online

