George Kadianakis:
Mike Perry mikeperry@torproject.org writes:
George Kadianakis:
I have mixed feelings about this.
- If client guard discovery is the main reason we are doing this, I think we should first look into these guard discovery vectors individually and figure out how concerning they are and if there is anything else we can do to block them,
I agree this is worthwhile, if only to better understand the design space. However, I think we're going to find that most applications we envision can be induced into violating many of the ad-hoc mitigations we try to bake in.
OK. Let's see. I feel that these guard discovery attacks can be blocked with:
a) If an IP listed on an HS descriptor tells you that it doesn't know the HS, then ignore it for this hidden service for the rest of the day.
b) If an HSDir that should have an HS descriptor tells you that it doesn't have it, then don't ask it again for the rest of the hour.
I think we do both checks right now in the Tor codebase, and we also have caches so that we don't retry the same nodes. If we are serious, we could even write those caches to disk.
I feel that if an application restarts Tor or flushes those caches because a hidden service does not work, then the application is doing it wrong.
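To make (a) and (b) concrete, here is a rough Python sketch of what such a failure cache could look like (the names and timeouts are purely illustrative, not what the tor codebase actually implements):

    import time

    # Illustrative per-service failure cache (not tor's actual code).
    IP_IGNORE_SECS = 24 * 60 * 60   # a) ignore a failed intro point for a day
    HSDIR_IGNORE_SECS = 60 * 60     # b) don't re-ask a failed HSDir for an hour

    class HSFailureCache:
        def __init__(self):
            # (onion_address, relay_fingerprint) -> expiry timestamp
            self.failed = {}

        def note_failure(self, onion_address, relay, is_hsdir):
            ttl = HSDIR_IGNORE_SECS if is_hsdir else IP_IGNORE_SECS
            self.failed[(onion_address, relay)] = time.time() + ttl

        def should_skip(self, onion_address, relay):
            expiry = self.failed.get((onion_address, relay))
            if expiry is None:
                return False
            if time.time() >= expiry:
                del self.failed[(onion_address, relay)]
                return False
            return True

The point being that a relay which claims not to know the service buys the attacker at most one induced circuit per time window, rather than unbounded retries -- and the cache could be written to disk so a tor restart doesn't reset it.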
Ok, well, consider the browser. All it takes for guard discovery is a bunch of nested iframes for many different hidden services, perhaps injected by the exit. We protect against this to some degree for non-HS traffic by using SOCKS u+p isolation in combination with keeping circuits open as long as they are used (#15482). But for HS traffic, new circuits will be built for each new HS address that is accessed, so we don't have the same ability to limit circuit creation.
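To make that asymmetry concrete, here is a very rough sketch of the isolation logic I mean (a conceptual simplification, not tor's actual stream-isolation code):

    # Conceptual sketch only (not tor's actual isolation code): for exit
    # traffic, streams sharing a SOCKS username/password can reuse one open
    # circuit, so injected content can't force unbounded circuit creation.
    # For HS traffic the onion address itself determines the rend circuit,
    # so N injected onion addresses force N fresh circuits.
    def circuit_key(socks_user, socks_pass, onion_address=None):
        if onion_address is None:
            return ("exit", socks_user, socks_pass)
        return ("hs", socks_user, socks_pass, onion_address)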
For other things, like Ricochet, subtler failure modes can be introduced to cause circuit churn without repeated hsdir/IP activity, once you bring the full application layer into scope. Say I'm a compromised/malicious Ricochet user looking to track down activists, marginalized folks, whistleblowers, etc. I could rig my Ricochet to fail the rend circuit periodically, waiting for them to reconnect to me as an HS client over and over, until my malicious middle was chosen next to the target's guard. Ricochet (indeed, many P2P protocols) will keep reconnecting in this case.
Maybe this means that Ricochet made a mistake in using HS circuits in "full duplex" mode, where the application is agnostic wrt who initiates the connection, and both sides keep retrying. However, I suspect that all P2P protocols are going to make this mistake. If we manage to get HS endpoints working as WebRTC endpoints, then WebRTC calls/connections will naturally end up with this problem as well. Probably just about anything designed for symmetric P2P Internet connections will make this mistake too.
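As a back-of-the-envelope illustration of why that churn matters (my own arithmetic, with a made-up attacker fraction): if the attacker controls a fraction f of middle-position bandwidth, each forced reconnect is roughly an independent chance of landing the malicious middle next to the target's guard, so the expected number of reconnects is about 1/f.

    # Illustrative only: expected forced reconnects before a malicious middle
    # lands next to the target's guard, assuming ~bandwidth-weighted selection.
    def expected_reconnects(malicious_middle_fraction):
        return 1.0 / malicious_middle_fraction

    print(expected_reconnects(0.01))   # ~100 reconnects for a 1% adversary

A chatty P2P protocol that silently reconnects will burn through that many circuits very quickly.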
Also, even with client vanguards, I think the checks above will still have to be implemented. I could imagine an application that flushes the whole DataDirectory if the hidden service stops working, and then even vanguards won't save it.
In general, I'm not sure how much sanity we can assume from third-party applications.
I think even our own applications are going to surprise us. One of the things I had to repeatedly argue years ago was "Kill .exit notation: Path selection must not be capable of being influenced by untrusted content from the application layer." People whined and cried and whined and cried when .exit finally vanished from TBB, but it was really necessary to prevent all sorts of path manipulation+capture attacks.
Any time where the application can be induced into making new paths through the Tor network, that is vulnerability surface. For some applications, they actually *must* be allowed to make new circuits based on untrusted/semitrusted input, so the only thing we can do at the Tor network layer is to restrict the paths of those circuits to limit exposure.
My current thinking is that long-term, I still like "virtual circuits" for client exit traffic (https://trac.torproject.org/projects/tor/ticket/15458). Maybe that can be used for HS clients, too, but it kinda gets messy in that we'll want to keep re-using HS paths for different HS addrs with the same SOCKS u+p, which may have other problems. I could be talked into it instead of client vanguards, though.
before complicating path selection even more.
I feel like you're actually going to end up complicating the implementation more with this position. If we have to have separate path selection modes for service side and client side, we then have to maintain three different path selection mechanisms in Tor: normal exit, onion services, and onion clients.
If we gave the same options for both hidden services and clients, we are at least down to two systems (exit vs non-exit), with some minor options for each.
Hmmm, maybe. But onion clients would look very much like normal exit clients, except that they would connect to RPs/IPs instead of exits. Just like the code is now.
Also, with vanguards if we end up doing something like:
HSDir: C - L - S - E - HSDir
IP:    C - L - S - E - IP
Rend:  C - L - M - RP -- S - M - L - HS
we have three different path types here. We would need to write very beautiful interfaces if we want this to be done by the same code.
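As a sketch of what such an interface might amount to (hypothetical names, nothing like this exists in the codebase today), each circuit purpose could be a declarative list of hop roles that one generic path builder consumes:

    # Hypothetical sketch: three path types expressed as data, one builder.
    PATH_SPECS = {
        "hsdir":        ["L", "S", "E"],    # C - L - S - E - HSDir
        "intro":        ["L", "S", "E"],    # C - L - S - E - IP
        "rend_client":  ["L", "M"],         # C - L - M - RP
        "rend_service": ["L", "M", "S"],    # HS - L - M - S - RP
    }

    def build_path(purpose, pick_relay_for_role):
        # pick_relay_for_role("L"/"M"/"S") draws from the appropriate
        # guardset, and "E" from the ordinary middle pool.
        return [pick_relay_for_role(role) for role in PATH_SPECS[purpose]]

    # e.g. build_path("rend_client", lambda role: "<some %s relay>" % role)

Whether that stays beautiful once written in C is another question.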
- Also, I like symmetry myself, but I wouldn't change path selection and security just for that _if I can help it_.
<snip>
HSDir post/fetch:
1. C - L - M - S - E - H
2. C - L - S - E - H
3. C - L - S - H
Intro:
1. C - L - M - S - E -- I - S - M - L - H
2. C - L - S - E -- I - S - L - H
*3. C - L - S -- I&S - L - H (* IP Intersection attack!)
Rend:
1. C - L - M - S - R -- E - S - M - L - H
2. C - L - S - R -- E - S - L - H
3. C - L - R&S -- S - L - H
What is R&S here? Clients use static short-lifespan rendezvous points?
Yes. Similarly for I&S (which we should not do - it's bad in every variation of Vanguards).
I don't see any such problem with R&S, though. Since R is not associated with any publicly viewable information, I don't think it is as big of a problem. At best it's a linkability risk for the client. But maybe I missed something.
Hmm, the only problem I can see here is that the R&S can link clients based on the L node. So, for example, in the crazy edge case where only one client connects to hidden services through R&S over L, the R&S could count "Ah, this client has done 42 rendezvous through me in the past 5 hours". And if that's a Ricochet client with 42 contacts, maybe it's a selector. But I think this is a pretty far-fetched example...
Another _big_ gotcha here: let's say we end up doing:
HSDir: C - L - M - S - E - HSDir
IP:    C - L - M - S - E - IP
Rend:  C - L - S - RP -- S - M - L - HS
and all the 'S' nodes are taken from the same pool, then the 'L' node will be able to learn 'M' by looking at the IP circuits, and learn 'S' by looking at the rend circuit. So it will basically be able to derive the full circuit.
We need to be very careful about which paths we pick, and which "guardsets" we get the nodes from.
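One way to make the "which guardsets" point concrete (again a hypothetical sketch, not a design): draw the nodes for each (circuit purpose, position) pair from disjoint sets, so the L node cannot combine what it sees on IP circuits with what it sees on rend circuits:

    import random

    # Hypothetical sketch: separate guardsets per (purpose, position), so
    # observations from one circuit type can't be stitched onto another.
    class GuardsetPools:
        def __init__(self, relays, set_size=2, seed=None):
            self.relays = list(relays)
            self.set_size = set_size
            self.rng = random.Random(seed)
            self.pools = {}   # (purpose, position) -> chosen guardset

        def pick(self, purpose, position):
            key = (purpose, position)
            if key not in self.pools:
                self.pools[key] = self.rng.sample(self.relays, self.set_size)
            return self.rng.choice(self.pools[key])

    # pools.pick("rend", "S") and pools.pick("intro", "S") then come from
    # different sets, so learning one says nothing about the other.

Of course that multiplies the number of guardsets we have to maintain and rotate, which has its own analysis cost.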
Looking at these, we can see that we sacrifice the middle guards in the second option, which comes at the cost of the attacker needing one less compromise (though they still need to compromise the long-lived guard). We also lose the unlinkability in the third option, and this actually bites us in Intro 3: the hidden service's L guard can perform a long-term intersection attack, watching for published intro points and matching them to the circuits that H makes to them. So that path length probably should not be used.
<snip>
However, I still have mixed feelings about changing client path selection as part of proposal 247:
- My main issue is that I think figuring out the right client path
selection will require a _heavy_ amount of security analysis that will delay prop247 even more. I was hoping that we could treat the client-side as an orthogonal problem and tackle it in the future separately. But maybe I'm totally wrong and should be more patient and these two problems should be handled together.
I think patience is best, because if we don't understand this problem really well, we're liable to miss something. Or cement ourselves off from a potential future of interactive HS voice+video. Neither one is a great failure mode.
Agreed.
I think for many applications (esp the browser and ricochet), we're going to find that we need to protect the client just as much as the server.
- If the above changes only happen to HS circuits, we make it harder to
make HS circuits indistinguishable from normal circuits in the face of traffic analysis. But maybe we have already lost this game.
We have already lost that game, at least until we have multihop padding. Proposal 247 already outlines how to use it in section 4.1 to help conceal vanguard usage.
It is also worth pointing out that if we fail to conceal the HS vanguard fingerprint entirely with padding, it will be especially valuable to have more than just 30k service-side instances with the vanguard fingerprint. Far better to have all the clients in that anonymity set, too, I think.
Yes, that's true. For me, this seems to be the main argument for doing client vanguards right now.
However, to actually achieve any sort of confusion here, we need to ensure that the paths between clients and HSes are symmetric. So for example if we end up doing:
C - L - S - E -- IP - S - M - L - H
then the L guard could distinguish clients from HSes by looking at whether the second hop is short-lived ('S') or medium-lived ('M').
Ok, I think this and your complexity argument earlier are great reasons not to mix and match strategy #1 with #2 or #3. If we do provide security vs latency tradeoff options, I'm now convinced that tradeoff should be consistent for all paths that an HS uses for all of its circuits.
If we only offered two security level options, I currently like HSDir#1+IP#1+Rend#1 for high security and HSDir#2+IP#2+Rend#3 for low security.
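Spelled out as data, with the option numbers referring to the HSDir/Intro/Rend lists above (just a restatement, not a new proposal):

    # The two security levels as consistent option sets per circuit purpose.
    SECURITY_LEVELS = {
        "high": {"hsdir": 1, "intro": 1, "rend": 1},   # HSDir#1 + IP#1 + Rend#1
        "low":  {"hsdir": 2, "intro": 2, "rend": 3},   # HSDir#2 + IP#2 + Rend#3
    }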
For the low security case, can we think of any reasons to decouple R&S in Rend#3, or to use Rend#2?