Matthew Finkel transcribed 1.6K bytes:
On Sat, Oct 25, 2014 at 01:01:52PM +0200, Karsten Loesing wrote:
On 24/10/14 01:53, isis wrote:
isis transcribed 6.6K bytes:
- The hashed fingerprint (as is the case for bridges in onionoo)
- The hashed ip:port
Actually, my apologies, I was quite tired when I wrote this and was completely wrong.
A hashed ip:port would be a terrible idea because the IPv4 address space is only 2^32 and the port space is only 2^16, which gives a total message space of 2^48. Brute-forcing a preimage to recover the bridge addresses is quite feasible under those constraints, as is precomputing the attack offline.
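To illustrate why an unkeyed hash of ip:port is weak, here is a minimal sketch of the preimage search. It assumes SHA-256 over the string "ip:port" (the thread does not say which hash or encoding was intended), and it only walks a /24 documentation network with three candidate ports so that it finishes instantly. The function names are hypothetical.

```python
import hashlib
import ipaddress

def hash_addr(ip: str, port: int) -> str:
    # Assumption: unkeyed SHA-256 over "ip:port"; the thread does not name the hash.
    return hashlib.sha256(f"{ip}:{port}".encode()).hexdigest()

def brute_force(target_digest: str, network: str, ports):
    # Enumerate every candidate address and compare digests.
    # The full IPv4 search is the same loop over 2**32 * 2**16 candidates.
    for ip in ipaddress.ip_network(network).hosts():
        for port in ports:
            if hash_addr(str(ip), port) == target_digest:
                return str(ip), port
    return None

# A "sanitized" value an observer might see in a published report.
target = hash_addr("198.51.100.7", 443)

# The attacker recovers the address without any secret.
found = brute_force(target, "198.51.100.0/24", [80, 443, 9001])
print(found)
```

Because nothing in the computation is secret, the same table of digests can be built once offline and reused against every published report.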
We should come up with a different way to hide ip:ports.
I'm lacking context, but just in case this is even remotely relevant, here's how CollecTor sanitizes bridge IP addresses:
https://collector.torproject.org/formats.html#bridge-descriptors
Yes, this is very relevant, thanks! Currently our plan involves keying the JSON dataset using unsanitized "IP Address:port" internally and the sanitized public version will replace this key with H(H(fingerprint)). This seems like the easiest way to avoid the problem of leaking the IP address.
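A rough sketch of that re-keying step, under several assumptions: H is taken to be SHA-1 over the decoded fingerprint bytes (the thread does not specify H), each internal entry is assumed to carry a "fingerprint" field, and the field names, example fingerprint, and function names are all hypothetical.

```python
import hashlib

def double_hash(fingerprint_hex: str) -> str:
    # Assumption: H is SHA-1 applied to the raw (hex-decoded) fingerprint bytes.
    raw = bytes.fromhex(fingerprint_hex)
    inner = hashlib.sha1(raw).digest()
    return hashlib.sha1(inner).hexdigest().upper()

def sanitize(internal: dict) -> dict:
    # The internal dataset is keyed by the unsanitized "ip:port".
    # The public dataset is keyed by H(H(fingerprint)) and carries no address
    # and no plain fingerprint.
    public = {}
    for _addr, entry in internal.items():
        record = {k: v for k, v in entry.items() if k != "fingerprint"}
        public[double_hash(entry["fingerprint"])] = record
    return public

# Hypothetical internal dataset with one bridge.
internal = {
    "198.51.100.7:443": {
        "fingerprint": "0123456789ABCDEF0123456789ABCDEF01234567",
        "reachable": True,
    }
}

public = sanitize(internal)
print(public)
```

The fingerprint is a 160-bit value, so unlike the 2^48 ip:port space it cannot be enumerated, and the resulting key stays stable and linkable across reports.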
At this point, we don't think we need an IP address in the resulting dataset, so a unique, linkable fingerprint seems sufficient. If we find that IP addresses are useful, then CollecTor's algorithm seems like a good starting point.
I agree that we could probably do without any IP:port information in the resulting reports. The hashed fingerprint is enough for BridgeDB to deduce a bridge's IP:ports; it should also be enough for Metrics to deduce which bridge a particular set of additional reachability information concerns, without needing to do any additional processing of either the IP:ports or the fingerprints.
With respect to CollecTor's algorithms for sanitising bridge IP:ports (should we decide to instead keep the bridge address information in OONI's bridge reachability reports and wish to sanitise those reports), Robert Ransom spoke with me on the 24th of October, and made the following points and suggestions:
Robert Ransom transcribed 1.0K bytes:
The Metrics system currently sanitizes bridge TCP addresses (IP+port) by HMACing them with a secret key stored on the server. That won't work for the reachability testing system for two reasons:
- The reachability-testing bridge clients should not know the key needed to obfuscate TCP (or UDP, or other) addresses deterministically. (A deterministic public-key encryption would be just as bad as a hash.)
- BridgeDB must be able to learn the address for which a bridge's reachability test was performed, so that it can decide whether the reachability-test results are valid for the bridge's current address.
I would suggest that the reachability-testing bridge client report a (randomized) public-key encryption of the address, where the decryption key is held by BridgeDB (so it can check whether the reachability test is relevant to the current ‘Bridge line’) and the Metrics sanitization server (so it can compute and publish a deterministically sanitized address, following the current sanitization procedure).
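For reference, the deterministic keyed sanitization described above can be sketched as follows. This is only an illustration of HMACing an address with a server-side secret, not CollecTor's actual procedure; HMAC-SHA256, the "ip:port" encoding, and the function name are assumptions.

```python
import hashlib
import hmac

def sanitize_addr(secret_key: bytes, ip: str, port: int) -> str:
    # Assumption: HMAC-SHA256 over "ip:port" with a secret held only by the server.
    return hmac.new(secret_key, f"{ip}:{port}".encode(), hashlib.sha256).hexdigest()

server_key = b"example secret held only by the sanitizing server"
other_key = b"a different key, e.g. an attacker's guess"

# Same key and same address always give the same sanitized value,
# so reports about one address remain linkable.
a = sanitize_addr(server_key, "198.51.100.7", 443)
b = sanitize_addr(server_key, "198.51.100.7", 443)

# Without the key, enumerating all 2**48 addresses does not help:
# the digests computed under another key do not match.
c = sanitize_addr(other_key, "198.51.100.7", 443)

print(a == b, a == c)
```

This also shows why the scheme does not transfer to the reachability clients: anyone who can compute the sanitized value must hold the key, and anyone holding the key can run the enumeration attack. A randomized public-key encryption avoids handing that capability to the clients, at the cost of producing a different ciphertext each time, which is why the deterministic step stays on the Metrics side.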