On 28 May 2015, at 12:59, Michael Rogers michael@briarproject.org wrote:
> I wasn't thinking about the sizes of the sets so much as the probability of overlap. If the client picks n HSDirs or IPs from the 1:00 consensus and the service picks n HSDirs or IPs from the 2:00 consensus, and the set of candidates is fairly stable between consensuses, and the ordering is consistent, we can adjust n to get an acceptable probability of overlap. But if the client and service (or client and IP) are picking a single RP, there's no slack - they have to pick exactly the same one.
Yes. If I recall correctly, 224 picks HSDirs by selecting node ids nearest to various hashes, so that missing HSDirs elsewhere cause no problems. We could lower the failure probability by dividing each IP into slices by typical availability in consensuses, while retaining this property, like you just proposed doing to rate IPs by bandwidth. Aren’t availability computations done anyway for granting nodes additional flags?
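To make the "missing HSDirs elsewhere cause no problems" point concrete, here is a rough Python sketch. It is not the actual 224 algorithm: the SHA-256 ids, the XOR distance, and the names node_id() and nearest() are all assumptions made up for illustration. It only shows that a nearest-n pick is unaffected by churn far from the target, and shifts by one entry when a picked node disappears.

```python
import hashlib

def node_id(name: str) -> int:
    # Hypothetical node id: SHA-256 of a relay name, read as an integer.
    return int.from_bytes(hashlib.sha256(name.encode()).digest(), "big")

def nearest(target: int, ids, n: int):
    # Pick the n ids closest to target. XOR distance is an assumption
    # here, chosen only because it gives an unambiguous ordering.
    return sorted(ids, key=lambda i: i ^ target)[:n]

# Two views of the network: the second is missing one far-away relay.
consensus_a = [node_id(f"relay{i}") for i in range(100)]
target = node_id("some-hash-input")
ranked = sorted(consensus_a, key=lambda i: i ^ target)

pick_a = nearest(target, consensus_a, 3)

# Dropping the farthest relay does not change the pick at all.
consensus_b = [i for i in consensus_a if i != ranked[-1]]
pick_b = nearest(target, consensus_b, 3)

# Dropping the closest relay shifts the pick by one, so two of the
# three choices still overlap.
consensus_c = [i for i in consensus_a if i != ranked[0]]
pick_c = nearest(target, consensus_c, 3)
overlap = set(pick_a) & set(pick_c)
```

With n = 1 the last case leaves no overlap at all, which is the "no slack" problem for a single RP.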
In any case, I suggested this as a way to save half a hop thereby allowing the HS to partially pin its second hop. I certainly do not know if the threat of clients repeatedly dropping circuits to expose an HS’s guard actually warrants the amount of work this approach entails. If so, then maybe it’d warrant some failure probability too, especially if a retry doesn’t cost much or risk exposing anything. I don’t know if HS’s partially pinning their second hop creates a traffic pattern that exposes them more to a GPA either.
Alternatively, there is a simpler approach: the client sends (IP, t, x, y) to the HS, where t is the client’s consensus, x is a random number, and y = hash(x++c++RP++HS), where again c is the global random number in 224. An HS would refuse connections if IP is very far from y, or if y is not derived correctly. If IP is near y but not the closest match, and the closest match has existed for a while, then the HS would merely log the suspicious choice. If hash() is hard to reverse, this proves that y is fairly random, so the HS can have a bit more trust in the IP being selected randomly. I suppose that'd justify the HS changing its second hop less often.
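A minimal Python sketch of the check the HS would do, under several assumptions of mine: hash() is SHA-256, "near" means XOR distance on ids, "very far" means outside the first near_rank entries of the ranking, and consensus_ids is the relay id list taken from consensus t. The names derive_y(), check_intro() and near_rank are hypothetical. The "closest match has existed for a while" condition is left out, since it needs consensus history.

```python
import hashlib

def derive_y(x: bytes, c: bytes, rp: bytes, hs: bytes) -> bytes:
    # y = hash(x ++ c ++ RP ++ HS); SHA-256 stands in for hash().
    return hashlib.sha256(x + c + rp + hs).digest()

def distance(a: bytes, b: bytes) -> int:
    # XOR distance between two ids (an assumption, not from 224).
    return int.from_bytes(a, "big") ^ int.from_bytes(b, "big")

def check_intro(ip_id, x, y, c, rp, hs, consensus_ids, near_rank=3):
    """Return 'refuse', 'log', or 'accept' for a client's (IP, t, x, y)."""
    # Refuse if y was not derived correctly from x, c, RP and HS.
    if y != derive_y(x, c, rp, hs):
        return "refuse"
    # Rank the relays of consensus t by closeness to y.
    ranked = sorted(consensus_ids, key=lambda n: distance(n, y))
    # Refuse if the claimed IP is very far from y (or unknown).
    if ip_id not in ranked[:near_rank]:
        return "refuse"
    # Near but not the closest match: accept, but log the suspicious choice.
    if ip_id != ranked[0]:
        return "log"
    return "accept"

# Example with made-up relay ids and inputs.
relays = [hashlib.sha256(bytes([i])).digest() for i in range(20)]
x, c, rp, hs = b"client-nonce", b"global-random", b"rp-id", b"hs-id"
y = derive_y(x, c, rp, hs)
by_closeness = sorted(relays, key=lambda n: distance(n, y))
result_closest = check_intro(by_closeness[0], x, y, c, rp, hs, relays)
result_near = check_intro(by_closeness[1], x, y, c, rp, hs, relays)
result_far = check_intro(by_closeness[10], x, y, c, rp, hs, relays)
result_bad_y = check_intro(by_closeness[0], x, b"\x00" * 32, c, rp, hs, relays)
```

The client can still grind over x to steer y, so near_rank and the cost of hash() would decide how much trust this really buys.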
Of course, one could always just make the IP the client’s third hop, analogous to when using an exit node, thus giving the HS a full 4 hops to control. I do not actually understand why that’s not the situation anyways. <shrug>
Jeff