-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256
Hello Mike,
Answers inline.
On 1/24/2016 1:56 AM, Mike Perry wrote:
A) Can I deny service to a hidden service by methodically pretending to attack it from each honest relay, one at a time, causing it to become upset at each of these relays?
Only if you are the only one connecting to that hidden service and make 5 rendezvous circuits with each relay as a rendezvous point. But after little time the total number of rendezvous circuits will grow so large that you'll have to exponentially have to build more rendezvous circuits with each relay as a rendezvous point to ban them all. So it's a whole lot of work, you'll DDoS the hidden service guard or a lot of other things first, before hitting the limit of this protection.
Can you give a more rigorous analysis of this in the proposal, perhaps with more examples/math? This seems to suggest that if someone is mounting an attack, it gets harder and harder for the detector to recognize successive malicious RPs. Something seems amiss here.
The system is designed to filter rendezvous points based on the hidden service's popularity within the last 24 hours. It doesn't get necessarily any harder for the hidden service server, it only increases the costs, time required and efforts for an attacker. As mentioned in the proposal, it is not aimed to 100% eliminate the problem (nor we can ever truly eliminate it totally) but it will make it a lot harder for an attacker with very little costs for the Tor network, and here I mean load balancing and capacity, uniform distribution, alternate path design and complexity, layer 2 guards enumeration.
Consider this. We set the REND_FILTER_TOLERANCE to 4. This means that any relay, regardless of its middle probability fraction, will be able to build 4 +1 different rendezvous circuits with a hidden service server, before it gets banned for 24 hours.
So, an attacker will start by, as a client, building 5 rendezvous circuits with each relay in the consensus in order to try to make the hidden service inaccessible.
So he starts, and gets some relays banned. But after the attacker got 300 relays banned as rendezvous points on a hidden service (not affecting the other clients/hidden services in the network in any way) he would have made the hidden service server build 1500 circuits already.
This means that next time he wants to ban a relay as being used for rendezvous by that hidden service, and that relay has a 0.5% middle probability fraction (1% since we allow 2 times more to reduce error margin) he will have to build 1500 x 0.01 = 15 (+1) circuits to get this relay banned also.
After 100 more relays like this getting banned we have at least 1600 more rendezvous circuits, plus the previous 1500 = 3100 rendezvous circuits. And just 300 relays banned (out of ~7000). To continue, now the attacker has to build 3100 x 0.01 = 31 (+1) circuits to get another relay with middle probability fraction 0.5% banned. And so on.
As you can see, it's quite hard to make a hidden service inaccessible via this method, with over 7000 relays in the consensus and only 24 hours of rend circuit history (e.g. relays banned more than 24 hours ago have the ban lifted now).
There are plenty of things which will fail before this protection enables a DDoS attack for a hidden service, like the hidden service's guard which will fail _way_ before an attacker is able to ban ~7000 relays as rendezvous points for a hidden service. This is already an existing problem (bottleneck of hidden service capacity is its guard) so we are not making anything worse, the attack already exists with or without the rendezvous filter.
B) Can I fool your reputation system by raising the total number of rendezvous attempts that I attempt, in effect making the hidden service feel more popular so it's not alarmed as much by any single rendezvous point? I could imagine ways to launch a rendezvous attempt that are quite cheap on the part of a client who has no plans to follow through.
Yes, you could I think. But this has costs and is also visible to the hidden service operator. And we keep count of established rendezvous circuits with streams inside, not failed rendezvous circuits. We only count successes, to make it costly for an attacker.
They may not be able to differentiate this from a sudden spike in interest in the service, or attention from a scraper or DDoS.
Yes, if the attacker acts as a client which just builds proper rendezvous circuits at random rendezvous points, it increases the popularity and we cannot differentiate this from a sudden spike in interest, so the attacker will end up in being able to build little more circuits to a given rendezvous point. (HS popularity in the last 24 hours * relay middle probability fraction / 100 = circuits allowed for that rendezvous point).
Also, hidden services experiencing sudden spikes in interest from genuine clients will not be affected, and will nicely grow popularity with proper distribution among the relays in the consensus acting as rendezvous points, as math, probability theory and statistics tell us.
I'm also curious if you intend to combine this with Rend#1, Rend#2, or some other version than the previous thread? From your writing, my guess is you want this to apply this to paths like:
- C - L - M - R&E -- E - M - L - H
(Let's call this Rend#4).
Is that true?
No, this is what I intend to avoid, all the rend#n alternate paths. I want us to avoid having to do complex alternate paths and maintain different path selection. My idea is to leave clients as they are now - - no change, and just implement 4 medium term layer 2 guards (vanguards) at hidden service server side. So it will look like this:
C - L - E - R -- E - M - L - H
Plain and simple. If one rendezvous point can initiate a LIMITED number of circuits per 24 hours with one hidden service, based on its middle probability fraction, it becomes less likely to also pick this attacker's evil relay as 3rd hop in a rend circuit (from hidden service server side). Combined with a good random rotation period for the layer 2 vanguards we should keep them safe and maintain an acceptable low probability for an attacker to learn them. Of course the risk is non zero and will always be non zero regardless what we do - - we can only put things in balance (load balancing, complexity, etc.) and make sure we make the success rate of an attacker as low as we can and in the same time increase his costs.
If so, I'm a bit worried about the timing attack version of the Sybil attack in that case. We're unlikely to ever want to pad all the way across the circuit, which means that an adversary doesn't have to control R&E, but merely watch for connections to their chosen non-malicious R&E *from* their malicious E. If they always chose R&E to be very low-traffic nodes, there will be a very low base rate of normal circuits from E to R&E, making it a high probability that if malicious E sees an extend from a middle M to R&E, it is in the right position in the circuit and wins the Sybil, discovering M.
This makes me think that while this detector is useful for Rend#2 or Rend#4 (modulo Roger's concerns), it still doesn't allow us to abandon Rend#1 completely as a high security option. Or, at least, the proposal as written also lacks the math for us to make it comparable to Rend#1.
Still, even if it does not make the risk 0, it makes it way smaller than it currently is, which is good. Aside proposal 247 and aside alternate paths, it's not normal for a relay with 0.01% middle probability to be able to act as rendezvous point in UNLIMITED number of rendezvous circuits with the same hidden service. I am excluding clients because they are the ones who pick it, so they are not at risk here. If we can mitigate this by keeping a table with the rendezvous history for the last 24 hours it should be worth the effort I think.
Thanks for your time and looking forward to answer more questions.